Lucene Study Notes 1 (V7.1)

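Lucene is a search library; Solr, Nutch, and Elasticsearch are all built on top of it. I personally feel it is worth understanding Lucene before learning those higher-level search engine applications.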

Development environment: IntelliJ IDEA, Maven, Spring Boot

The code follows.

Maven configuration

 <parent>

        <groupId>org.springframework.boot</groupId>

        <artifactId>spring-boot-starter-parent</artifactId>

        <version>1.4..RELEASE</version>

    </parent>

    <properties>

        <java.version>1.8</java.version>

    </properties>

    <dependencies>

        <dependency>

            <groupId>org.springframework.boot</groupId>

            <artifactId>spring-boot-starter</artifactId>

        </dependency>

        <dependency>

            <groupId>org.springframework.boot</groupId>

            <artifactId>spring-boot-starter-thymeleaf</artifactId>

        </dependency>

        <!-- hot swapping, disable cache for template, enable live reload -->

        <dependency>

            <groupId>org.springframework.boot</groupId>

            <artifactId>spring-boot-devtools</artifactId>

            <optional>true</optional>

        </dependency>

            <!--Lucene-->

            <dependency>

                <groupId>org.apache.lucene</groupId>

                <artifactId>lucene-core</artifactId>

                <version>7.1.0</version>

            </dependency>

            <!-- Chinese analyzer; the general-purpose analyzers (common) are geared toward English tokenization -->

            <dependency>

                <groupId>org.apache.lucene</groupId>

                <artifactId>lucene-analyzers-smartcn</artifactId>

                <version>7.1.0</version>

            </dependency>

            <dependency>

                <groupId>org.apache.lucene</groupId>

                <artifactId>lucene-queryparser</artifactId>

                <version>7.1.0</version>

            </dependency>

            <!-- highlights matched keywords in search results -->

            <dependency>

                <groupId>org.apache.lucene</groupId>

                <artifactId>lucene-highlighter</artifactId>

                <version>7.1.0</version>

            </dependency>

            <!--Lucene-->

            <dependency>

                <groupId>junit</groupId>

                <artifactId>junit</artifactId>

                <version>4.12</version>

            </dependency>

    </dependencies>

    <build>

        <plugins>

            <!-- Package as an executable jar/war -->

            <plugin>

                <groupId>org.springframework.boot</groupId>

                <artifactId>spring-boot-maven-plugin</artifactId>

            </plugin>

        </plugins>

    </build>
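
The comment on the smartcn dependency says the general-purpose analyzers are geared toward English. A minimal sketch of how one might compare what the two analyzers emit for the same Chinese sentence is shown here. The class name `AnalyzerComparison`, the helper `tokens`, and the sample sentence are illustrative and not part of the original post; the sketch assumes lucene-core and lucene-analyzers-smartcn 7.1.x are on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class, not from the original post.
class AnalyzerComparison {

    // Runs the analyzer over the text and collects the terms it emits.
    static List<String> tokens(Analyzer analyzer, String text) throws IOException {
        List<String> result = new ArrayList<>();
        try (TokenStream stream = analyzer.tokenStream("content", text)) {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();                      // required before the first incrementToken()
            while (stream.incrementToken()) {
                result.add(term.toString());
            }
            stream.end();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String text = "Lucene是一个全文检索引擎工具包"; // sample sentence (assumption)
        try (Analyzer standard = new StandardAnalyzer();
             Analyzer smart = new SmartChineseAnalyzer()) {
            System.out.println("standard: " + tokens(standard, text));
            System.out.println("smartcn:  " + tokens(smart, text));
        }
    }
}
```

One would expect StandardAnalyzer to split the Chinese text into single characters, while SmartChineseAnalyzer groups characters into words; running the sketch is the easiest way to confirm this for a given Lucene version.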

Helper class

public class LuceneConstants {

    public static final String CONTENTS="contents";

    public static final String FILE_NAME="filename";

    public static final String FILE_PATH="filepath";

    public static final int MAX_SEARCH = ;

    public  static final String IndexDir ="E:\\Lucene\\Index";

    public  static final String DataDir ="E:\\Lucene\\Data";

    public  static final String ArticleDir ="E:\\Lucene\\Files\\article.txt";

}
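
The Indexer class that follows creates an `Article` and calls `setId`, `setTitle`, `setContent`, and `getId().toString()` on it, but the post never shows that class. A minimal sketch of what it might look like is given here; the field types are assumptions inferred from those calls, not the author's actual code.

```java
// Hypothetical Article entity; the original post does not include it.
class Article {

    // Integer rather than int, so that article.getId().toString() compiles.
    private Integer id;
    private String title;
    private String content;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }
}
```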

Calling Lucene

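import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class Indexer {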

    public void addEntity() throws IOException {

        Article article = new Article();

        //article.setId(1);

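        //article.setTitle("Lucene full-text search");

        //article.setContent("Lucene is a subproject of the Apache Software Foundation's Jakarta project group. It is an open-source full-text search toolkit; it is not a complete full-text search engine but an architecture for one, providing a complete query engine and indexing engine and a partial text analysis engine (for two Western languages, English and German).");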

        article.setId();

        article.setTitle("Solr search engine");

        article.setContent("Solr is a search engine application built on the Lucene framework; it is an open-source full-text search engine.");

        final Path path = Paths.get(LuceneConstants.IndexDir);

        Directory directory = FSDirectory.open(path);// index directory, stored on disk

        //Directory RAMDirectory= new RAMDirectory();// stored in memory

        Analyzer analyzer = new StandardAnalyzer();

        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);

        //indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

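        // APPEND requires an existing index and throws if none is found; use CREATE for the first run

        indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.APPEND);

        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);// creates or updates the index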

        Document document = new Document();

        document.add(new TextField("id", article.getId().toString(), Field.Store.YES));

        document.add(new TextField("title", article.getTitle(), Field.Store.YES));

        document.add(new TextField("content", article.getContent(), Field.Store.YES));

        indexWriter.addDocument(document);

        indexWriter.close();

    }

    public void addFile() throws IOException {

        final Path path = Paths.get(LuceneConstants.IndexDir);

        Directory directory = FSDirectory.open(path);

        Analyzer analyzer=new StandardAnalyzer();

        IndexWriterConfig indexWriterConfig=new IndexWriterConfig(analyzer);

        indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);

        IndexWriter indexWriter=new IndexWriter(directory,indexWriterConfig);

        InputStreamReader isr = new InputStreamReader(new FileInputStream(LuceneConstants.ArticleDir), "GBK");// .txt file; without an explicit encoding the text comes out garbled

        BufferedReader bufferedReader=new BufferedReader(isr);

        String content="";

        while ((content=bufferedReader.readLine())!=null){

            Document document=new Document();

            document.add(new TextField("content",content,Field.Store.YES) );

            indexWriter.addDocument(document);

        }

        bufferedReader.close();

        indexWriter.close();

    }

    public List<String> SearchFiles() throws IOException, ParseException {

        String queryString = "Solr";

        final Path path = Paths.get(LuceneConstants.IndexDir);

        Directory directory = FSDirectory.open(path);// index storage location

        Analyzer analyzer = new StandardAnalyzer();// analyzer

        // Single condition

        // Parse the keyword

        //QueryParser queryParser=new QueryParser("content",analyzer);

        //Query query=queryParser.parse(queryString);

        // Multiple conditions

        Query mQuery = MultiFieldQueryParser.parse(new String[]{queryString},new String[]{"content"},analyzer);

        IndexReader indexReader = DirectoryReader.open(directory);// index reader

        IndexSearcher indexSearcher = new IndexSearcher(indexReader);// searcher

        //TopDocs topDocs=indexSearcher.search(query,3);

        TopDocs topDocs=indexSearcher.search(mQuery,);

        long count = topDocs.totalHits;

        ScoreDoc[] scoreDocs = topDocs.scoreDocs;

        List<String> list=new ArrayList<String>();

        list.add(String.valueOf(count));

        Integer cnt=;

        for (ScoreDoc scoreDoc : scoreDocs) {

            Document document = indexSearcher.doc(scoreDoc.doc);

            //list.add(cnt.toString()+"-"+"relevance: "+scoreDoc.score+"-----time:"+document.get("time"));

            //list.add("|||");

            //list.add(cnt.toString()+"-"+document.get("content"));

            list.add(document.get("content"));

            cnt++;

        }

        indexReader.close();// release the index files held by the reader

        return  list;

    }

}
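
The Indexer above leaves the in-memory `RAMDirectory` line commented out and closes its writers by hand. As a rough sketch, the same index-then-search round trip can be run entirely in memory with try-with-resources, so the writer, reader, and directory are closed even when an exception is thrown. The class name `InMemoryRoundTrip`, the method names, and the sample sentences are illustrative assumptions; the code targets Lucene 7.1, and newer Lucene versions may need a different in-memory directory class.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

// Hypothetical demo class, not from the original post.
class InMemoryRoundTrip {

    // Indexes each text as its own document with a stored "content" field.
    static void index(Directory directory, Analyzer analyzer, String... texts) throws IOException {
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        // CREATE_OR_APPEND works whether or not an index already exists.
        config.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
        try (IndexWriter writer = new IndexWriter(directory, config)) {
            for (String text : texts) {
                Document document = new Document();
                document.add(new TextField("content", text, Field.Store.YES));
                writer.addDocument(document);
            }
        }
    }

    // Returns the stored "content" of up to maxHits matching documents.
    static List<String> search(Directory directory, Analyzer analyzer, String queryString, int maxHits)
            throws IOException, ParseException {
        Query query = new QueryParser("content", analyzer).parse(queryString);
        List<String> hits = new ArrayList<>();
        try (IndexReader reader = DirectoryReader.open(directory)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            for (ScoreDoc scoreDoc : searcher.search(query, maxHits).scoreDocs) {
                hits.add(searcher.doc(scoreDoc.doc).get("content"));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        try (Directory directory = new RAMDirectory();
             Analyzer analyzer = new StandardAnalyzer()) {
            index(directory, analyzer,
                    "Solr is a search server built on Lucene",
                    "Nutch is a web crawler");
            System.out.println(search(directory, analyzer, "solr", 10));
        }
    }
}
```

Passing the directory and analyzer in as parameters keeps the two helpers usable with `FSDirectory` as well, which is what the Indexer in the post uses.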

Viewing the results

@Controller

public class LuceneController {

    @RequestMapping("/add")

    public String welcomepage(Map<String, Object> model) {

        try {

            Indexer indexer = new Indexer();

            indexer.addEntity();

            model.put("message", "Success");

        } catch (IOException ex) {

            model.put("message", "Failure");

        }

        return "welcome";

    }

    @RequestMapping("/file")

    public String fileindex(Map<String, Object> model) {

        try {

            Indexer indexer = new Indexer();

            indexer.addFile();

            model.put("message", "SuccessF");

        } catch (IOException ex) {

            model.put("message", "FailureF");

        }

        return "welcome";

    }

    @RequestMapping("/search")

    public String searchindex(Map<String, Object> model) {

        try {

            Indexer indexer = new Indexer();

            List<String> rlts = indexer.SearchFiles();

            String message = "";

            for (String str : rlts) {

                message += str + " ";

            }

            model.put("message", message);

        } catch (Exception ex) {

            model.put("message", "FailureF");

        }

        return "welcome";

    }

}
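
The pom pulls in lucene-highlighter for keyword highlighting, but none of the code in the post uses it. A minimal sketch of how a matched keyword might be wrapped in tags before the text is put into the model is shown here. The class name `HighlightSketch`, the method `highlight`, the `<b>` tags, and the sample sentence are illustrative assumptions, not the author's code.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;

// Hypothetical demo class, not from the original post.
class HighlightSketch {

    // Wraps query terms found in the text with <b>...</b>.
    // Falls back to the unmodified text when nothing matches.
    static String highlight(Analyzer analyzer, String field, String queryString, String text)
            throws IOException, ParseException, InvalidTokenOffsetsException {
        Query query = new QueryParser(field, analyzer).parse(queryString);
        Highlighter highlighter =
                new Highlighter(new SimpleHTMLFormatter("<b>", "</b>"), new QueryScorer(query, field));
        String fragment = highlighter.getBestFragment(analyzer, field, text);
        return fragment != null ? fragment : text;
    }

    public static void main(String[] args) throws Exception {
        try (Analyzer analyzer = new StandardAnalyzer()) {
            System.out.println(highlight(analyzer, "content", "solr", "Solr is built on Lucene"));
        }
    }
}
```

The `welcome` template is not shown in the post; if it escapes the message when rendering, the tags would appear as literal text rather than as bold markup.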
