Lucene in Practice: Index Maintenance, Multi-Field Queries, and Highlighting

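　　The search box on my blog used to run a SQL fuzzy query. Having just finished studying Lucene, I wanted to put it to use, so I replaced the SQL search with it.

　　Along the way I ran into a few problems that never came up while I was studying, which goes to show that real understanding comes from practice.

Broadly speaking, I see Lucene as two major parts: maintaining the index and searching the index.

Index maintenance covers adding, deleting, and updating index entries: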

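import java.io.File;
import java.util.LinkedList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.wltea.analyzer.lucene.IKAnalyzer;
// BlogCustom and StringUtil are the blog project's own classes; import them from your own packages.

public class BlogIndex {

    // Path of the index directory, configured on the bean
    private String lucenePath;

    public String getLucenePath() {
        return lucenePath;
    }

    public void setLucenePath(String lucenePath) {
        this.lucenePath = lucenePath;
    }

    /**
     * Opens an IndexWriter on the index directory. The caller must close it.
     */
    private IndexWriter getWriter() throws Exception {
        Directory dir = FSDirectory.open(new File(lucenePath).toPath());
        // IKAnalyzer is the Chinese analyzer used for both indexing and searching
        IndexWriterConfig config = new IndexWriterConfig(new IKAnalyzer());
        return new IndexWriter(dir, config);
    }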

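    /**
     * Adds a blog post to the index.
     */
    public void addIndex(BlogCustom blog) throws Exception {
        // try-with-resources closes the writer even if indexing throws
        try (IndexWriter indexWriter = getWriter()) {
            Document doc = new Document();
            // StringField is indexed as a single untokenized term, so the id can be matched exactly later
            doc.add(new StringField("id", String.valueOf(blog.getId()), Field.Store.YES));
            // TextField values are tokenized by the analyzer for full-text search
            doc.add(new TextField("title", blog.getTitle(), Field.Store.YES));
            doc.add(new TextField("summary", blog.getSummary(), Field.Store.YES));
            doc.add(new TextField("keyWord", blog.getKeyWord(), Field.Store.YES));
            indexWriter.addDocument(doc);
        }
    }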

    /**
     * Updates the index entry of a blog post.
     */
    public void updateIndex(BlogCustom blog) throws Exception {
        // try-with-resources closes the writer even if indexing throws
        try (IndexWriter indexWriter = getWriter()) {
            Document doc = new Document();
            doc.add(new StringField("id", String.valueOf(blog.getId()), Field.Store.YES));
            doc.add(new TextField("title", blog.getTitle(), Field.Store.YES));
            doc.add(new TextField("summary", blog.getSummary(), Field.Store.YES));
            doc.add(new TextField("keyWord", blog.getKeyWord(), Field.Store.YES));
            // Deletes every document whose id term matches, then adds the new document
            indexWriter.updateDocument(new Term("id", String.valueOf(blog.getId())), doc);
        }
    }

    /**
     * Deletes the index entry of a blog post.
     */
    public void deleteIndex(String blogId) throws Exception {
        // try-with-resources closes the writer even if the delete throws
        try (IndexWriter indexWriter = getWriter()) {
            indexWriter.deleteDocuments(new Term("id", blogId));
        }
    }
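
To make the update semantics concrete: Lucene's updateDocument(term, doc) first deletes every document containing the term and then adds the new document, which is why the id is kept as an untokenized StringField that can be matched exactly. The sketch below is a plain-Java model of that behaviour, not Lucene code. TinyIndexSketch and all of its methods are hypothetical names made up for illustration, and a list of maps stands in for the real index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of an index. This is NOT Lucene; it only mimics
// the delete-then-add behaviour of an update keyed on an exact term.
class TinyIndexSketch {

    // Each "document" is a map of field name -> stored value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Hypothetical helper that builds a two-field document.
    static Map<String, String> doc(String id, String title) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("id", id);
        d.put("title", title);
        return d;
    }

    void addDocument(Map<String, String> doc) {
        docs.add(new LinkedHashMap<>(doc));
    }

    // Removes every document whose field holds exactly this value
    // and returns how many documents were removed.
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // Update = delete by exact term, then add the replacement.
    // If nothing matched, the document is simply added.
    void updateDocument(String field, String value, Map<String, String> doc) {
        deleteDocuments(field, value);
        addDocument(doc);
    }

    int size() {
        return docs.size();
    }

    // Returns the title stored for the given id, or null if there is none.
    String titleOf(String id) {
        for (Map<String, String> d : docs) {
            if (id.equals(d.get("id"))) {
                return d.get("title");
            }
        }
        return null;
    }
}
```

In this model, updating id 1 replaces the old entry instead of duplicating it, and updating an id that does not exist yet behaves like a plain add.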

Searching the index is a bit more involved:

    /**
     * Searches the index and returns the matching posts with keywords highlighted.
     */
    public List<BlogCustom> searchBlog(String q) throws Exception {
        // Create the Analyzer (IKAnalyzer), the same one used at index time
        Analyzer analyzer = new IKAnalyzer();
        List<BlogCustom> blogList = new LinkedList<>();
        // try-with-resources closes the reader and the directory once the search is done
        try (Directory dir = FSDirectory.open(new File(lucenePath).toPath());
             IndexReader indexReader = DirectoryReader.open(dir)) {
            IndexSearcher indexSearch = new IndexSearcher(indexReader);

            // Multi-field query
            String[] fields = {"id", "title", "summary", "keyWord"};
            // How the per-field conditions combine: SHOULD means a match in any one field is enough.
            // The array must have the same length as fields.
            BooleanClause.Occur[] clauses = { BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD,
                    BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD };
            // Arguments: keywords, fields, relation between the conditions, Chinese analyzer.
            // q is parsed as query syntax; QueryParser.escape(q) can neutralize special characters in raw user input.
            Query query = MultiFieldQueryParser.parse(q, fields, clauses, analyzer);
            // Run the query, returning at most 100 hits
            TopDocs topDocs = indexSearch.search(query, 100);

            // Keyword highlighting
            // Markup wrapped around each matched keyword
            SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<font style='color:red;'>", "</font>");
            // Scores text fragments by the query terms they contain
            QueryScorer scorer = new QueryScorer(query);
            // Applies the markup to the matched keywords
            Highlighter highlighter = new Highlighter(formatter, scorer);
            // Controls the size of the text fragment that contains the keywords
            highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));

            // Walk through the hits and put each one into blogList
            for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
                // Load the stored fields of the current document
                Document doc = indexSearch.doc(scoreDoc.doc);
                BlogCustom blog = new BlogCustom();
                blog.setId(Integer.parseInt(doc.get("id")));
                // Highlight each non-null field, falling back to the stored text
                blog.setTitle(highlight(highlighter, analyzer, "title", doc.get("title")));
                blog.setSummary(highlight(highlighter, analyzer, "summary", doc.get("summary")));
                blog.setKeyWord(highlight(highlighter, analyzer, "keyWord", doc.get("keyWord")));
                blogList.add(blog);
            }
        }
        return blogList;
    }

    /**
     * Returns the highlighted fragment of text, or text itself when it is null
     * or the highlighter finds no query term in it.
     */
    private String highlight(Highlighter highlighter, Analyzer analyzer, String field, String text)
            throws Exception {
        if (text == null) {
            return null;
        }
        TokenStream tokenStream = analyzer.tokenStream(field, text);
        String fragment = highlighter.getBestFragment(tokenStream, text);
        return StringUtil.isEmpty(fragment) ? text : fragment;
    }

}
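
The two ideas behind searchBlog can be modelled without Lucene: SHOULD clauses across several fields mean a document is a hit as soon as any one field matches, and the highlighter returns nothing for a field that contains no query term, so the code falls back to the stored text. The following is a minimal plain-Java sketch under loud assumptions: MultiFieldHighlightSketch and its methods are hypothetical names, and simple substring matching stands in for the analyzer's tokenization.

```java
import java.util.Map;

// Plain-Java model of "match any field" plus "highlight or fall back".
// This is NOT Lucene; substring matching replaces real tokenization.
class MultiFieldHighlightSketch {

    // Same markup as the SimpleHTMLFormatter in the article.
    static final String PRE = "<font style='color:red;'>";
    static final String POST = "</font>";

    // SHOULD across fields: a hit if at least one field contains the keyword.
    static boolean matchesAnyField(Map<String, String> doc, String[] fields, String keyword) {
        for (String field : fields) {
            String value = doc.get(field);
            if (value != null && value.contains(keyword)) {
                return true;
            }
        }
        return false;
    }

    // Wraps every occurrence of the keyword in the markup.
    // Returns null when the text does not contain the keyword.
    static String highlight(String text, String keyword) {
        if (text == null || keyword.isEmpty() || !text.contains(keyword)) {
            return null;
        }
        return text.replace(keyword, PRE + keyword + POST);
    }

    // Falls back to the original text when there is nothing to highlight.
    static String highlightOrOriginal(String text, String keyword) {
        String highlighted = highlight(text, keyword);
        return (highlighted == null || highlighted.isEmpty()) ? text : highlighted;
    }
}
```

A post whose title contains the keyword but whose summary does not is still returned. Its title comes back wrapped in the markup and its summary comes back unchanged.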

Done!
