Full-Text Search with Lucene (2)

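This post continues from Full-Text Search with Lucene (1). Here we take a closer look at how to actually use Lucene.

As Part 1 showed, Lucene works like a two-way workflow: one direction maintains the index, the other serves queries. That separation is also what makes Lucene elegant.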
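To make the two directions concrete, here is a minimal pure-Java sketch of an inverted index. It is a toy model rather than Lucene's implementation: the class `TinyInvertedIndex` and its methods are hypothetical names invented for illustration, and the "analyzer" is just a lowercase split on non-word characters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy class, not part of Lucene.
public class TinyInvertedIndex {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index side: analyze the text and record which document holds each term.
    public void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Query side: analyze the query the same way, then intersect the postings.
    public Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.<Integer>emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? Collections.<Integer>emptySet() : result;
    }

    // Stand-in for an analyzer: lowercase and split on non-word characters.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Lucene is a full-text search library");
        index.add(2, "JDBC talks to relational databases");
        System.out.println(index.search("search lucene")); // prints [1]
        System.out.println(index.search("databases"));     // prints [2]
    }
}
```

The point of the sketch is only that indexing and searching must analyze text the same way; in the demo that follows, this role is played by using `StandardAnalyzer` for both the `IndexWriter` and the query parser.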

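Lucene index construction analysis

Lucene query process analysis

Example walkthrough

Below is a small demo whose functionality is essentially CRUD. As with JDBC, we inevitably end up writing a few utility classes to tidy up the code and cut down on repetition.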

`Article.java`

/**
 * @Date 2016-08-01
 *
 * @author Administrator
 */
package domain;

/**
 * A simple article entity that gets stored in the Lucene index.
 *
 * @author 郭瑞彪
 */
public class Article {

    private Integer id;
    private String title;
    private String content;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    @Override
    public String toString() {
        return "Article [id=" + id + ", title=" + title + ", content=" + content + "]";
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

}

`LuceneUtils.java`

/**
 * @Date 2016-08-01
 *
 * @author Administrator
 */
package util;

import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

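/**
 * Helper methods for opening and closing Lucene index resources.
 *
 * @author 郭瑞彪
 */
public class LuceneUtils {

    // Shared index directory and analyzer, initialized once when the class loads
    // (previously these fields were never assigned, so the getters returned null).
    private static final Directory dir;
    private static final Analyzer analyzer = new StandardAnalyzer();

    static {
        try {
            dir = FSDirectory.open(Paths.get("./indexDir/"));
        } catch (IOException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static Directory getDir() {
        return dir;
    }

    public static Analyzer getAnalyzer() {
        return analyzer;
    }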

    /**
     * Returns an IndexWriter for modifying the index.
     * The writer holds the index write lock, so callers must close it when done.
     *
     * @return a new IndexWriter on ./indexDir/
     */
    public static IndexWriter getIndexWriter() {
        try {
            Directory dir = FSDirectory.open(Paths.get("./indexDir/"));
            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new StandardAnalyzer());
            return new IndexWriter(dir, indexWriterConfig);
        } catch (IOException e) {
            // Fail fast: returning null here would only surface later as a NullPointerException.
            throw new RuntimeException("Failed to open IndexWriter on ./indexDir/", e);
        }
    }

    /**
     * Returns an IndexSearcher for querying the index.
     *
     * @return a new IndexSearcher over ./indexDir/
     */
    public static IndexSearcher getIndexSearcher() {
        try {
            IndexReader indexReader = DirectoryReader.open(FSDirectory.open(Paths.get("./indexDir/")));
            return new IndexSearcher(indexReader);
        } catch (IOException e) {
            // Fail fast: returning null here would only surface later as a NullPointerException.
            throw new RuntimeException("Failed to open IndexSearcher on ./indexDir/", e);
        }
    }

    /**
     * Releases an IndexWriter. Closing commits pending changes and frees the write lock.
     *
     * @param indexWriter the writer to close; may be null
     */
    public static void closeIndexWriter(IndexWriter indexWriter) {
        try {
            if (indexWriter != null) {
                indexWriter.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

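    /**
     * Releases an IndexSearcher. The searcher itself has no close() method,
     * so the underlying IndexReader is what actually needs to be closed.
     *
     * @param indexSearcher the searcher to release; may be null
     */
    public static void closeIndexSearcher(IndexSearcher indexSearcher) {
        try {
            if (indexSearcher != null) {
                indexSearcher.getIndexReader().close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }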
}

`ArticleDocumentUtils.java`

/**
 * @Date 2016-08-01
 *
 * @author Administrator
 */
package util;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

import domain.Article;

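/**
 * Converts between Article objects and Lucene Documents.
 *
 * @author 郭瑞彪
 */
public class ArticleDocumentUtils {

    /**
     * Converts an Article into a Lucene Document.
     *
     * @param article the article to convert
     * @return the corresponding Document
     */
    public static Document article2Document(Article article) {
        Document doc = new Document();
        // StringField is indexed as a single un-analyzed token, so the id can be matched exactly by a Term.
        doc.add(new StringField("id", article.getId().toString(), Store.YES));
        // TextField is analyzed (tokenized), which makes title and content full-text searchable.
        doc.add(new TextField("title", article.getTitle(), Store.YES));
        doc.add(new TextField("content", article.getContent(), Store.YES));

        return doc;
    }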

    /**
     * Converts a Lucene Document back into an Article.
     *
     * @param document a Document produced by article2Document
     * @return the corresponding Article
     */
    public static Article document2Article(Document document) {
        Article a = new Article();
        a.setId(Integer.parseInt(document.get("id")));
        a.setTitle(document.get("title"));
        a.setContent(document.get("content"));

        return a;
    }

}

`ArticleIndexDao.java`

/**
 * @Date 2016-08-01
 *
 * @author Administrator
 */
package dao;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.util.Version;

import domain.Article;
import util.ArticleDocumentUtils;
import util.LuceneUtils;

/**
 * CRUD operations on the article index.
 *
 * @author 郭瑞彪
 */
public class ArticleIndexDao {

    /**
     * Saves a new article to the index.
     *
     * @param article the article to index
     */
    public void save(Article article) {
        IndexWriter indexWriter = null;
        try {
            // 1. Article --> Document
            Document doc = ArticleDocumentUtils.article2Document(article);
            // 2. Add the document through an IndexWriter
            indexWriter = LuceneUtils.getIndexWriter();
            indexWriter.addDocument(doc);
        } catch (IOException e) {
            throw new RuntimeException("ArticleIndexDao.save failed", e);
        } finally {
            // Closing the writer commits the change; no separate close() is needed in the try block.
            LuceneUtils.closeIndexWriter(indexWriter);
        }
    }

    /**
     * Deletes an article from the index.
     *
     * @param id id of the article to delete
     */
    public void delete(Integer id) {
        IndexWriter indexWriter = null;
        try {
            indexWriter = LuceneUtils.getIndexWriter();
            // The Term must match the un-analyzed "id" field exactly.
            indexWriter.deleteDocuments(new Term("id", id.toString()));
        } catch (IOException e) {
            throw new RuntimeException("ArticleIndexDao.delete failed", e);
        } finally {
            LuceneUtils.closeIndexWriter(indexWriter);
        }
    }

    /**
     * Updates an article in the index. <br>
     * Updating is expensive, so the usual approach is to delete the old
     * entry and then add it again.
     *
     * @param article the new version of the article; its id selects the entry to replace
     */
    public void update(Article article) {
        IndexWriter indexWriter = null;
        try {
            Term term = new Term("id", article.getId().toString());
            indexWriter = LuceneUtils.getIndexWriter();
            // Delete the old entry, then add the new version.
            // indexWriter.updateDocument(term, doc) performs the same delete-then-add in one call.
            indexWriter.deleteDocuments(term);
            Document doc = ArticleDocumentUtils.article2Document(article);
            indexWriter.addDocument(doc);
        } catch (IOException e) {
            throw new RuntimeException("ArticleIndexDao.update failed", e);
        } finally {
            LuceneUtils.closeIndexWriter(indexWriter);
        }
    }

    /**
     * Searches the index.
     *
     * @param queryString
     *            the query string
     * @return the matching articles (at most 100), or null if nothing matched
     */
    public List<Article> search(String queryString) {
        IndexSearcher indexSearcher = null;
        try {
            // 1. queryString --> Query, searching both title and content
            String[] queryFields = new String[] { "title", "content" };
            Analyzer analyzer = new StandardAnalyzer();
            analyzer.setVersion(Version.LUCENE_6_1_0);
            QueryParser queryParser = new MultiFieldQueryParser(queryFields, analyzer);
            Query query = queryParser.parse(queryString);
            // 2. Run the query and fetch the top 100 hits
            indexSearcher = LuceneUtils.getIndexSearcher();
            TopDocs topDocs = indexSearcher.search(query, 100);
            // 3. Convert each hit back into an Article
            ScoreDoc[] scoreDocs = topDocs.scoreDocs;
            List<Article> articles = new ArrayList<Article>();
            for (int i = 0; i < scoreDocs.length; i++) {
                ScoreDoc scoreDoc = scoreDocs[i];
                Document doc = indexSearcher.doc(scoreDoc.doc);
                Article a = ArticleDocumentUtils.document2Article(doc);
                articles.add(a);
            }
            return articles.size() > 0 ? articles : null;
        } catch (Exception e) {
            throw new RuntimeException("ArticleIndexDao.search failed", e);
        } finally {
            // Release the searcher even when parsing or searching throws.
            LuceneUtils.closeIndexSearcher(indexSearcher);
        }
    }

}

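Core operations

Key point: updating the index is expensive. Lucene does not modify an indexed document in place, and the official documentation likewise recommends deleting the old entry and then adding it again. The only thing to watch is how the Term is built: it is a (field, value) pair, and the value must exactly match what was indexed, here the un-analyzed `id` field.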
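Why an update boils down to delete-then-add can be shown with a small pure-Java model. This is a sketch under stated assumptions, not Lucene code: `DeleteThenAddIndex` and its methods are hypothetical names, the exact-match `id` string stands in for `new Term("id", ...)`, and tokenization is a simple lowercase split.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy class, not part of Lucene.
public class DeleteThenAddIndex {

    // word -> ids of the documents whose text contains the word
    private final Map<String, Set<String>> postings = new HashMap<>();

    public void add(String id, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // Deletion is keyed by the exact id, like deleteDocuments(new Term("id", ...)).
    public void delete(String id) {
        for (Set<String> ids : postings.values()) {
            ids.remove(id);
        }
    }

    // The old text is spread over many postings lists, so it cannot be
    // patched in place: drop every old posting, then index the new text.
    public void update(String id, String newText) {
        delete(id);
        add(id, newText);
    }

    public Set<String> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        DeleteThenAddIndex index = new DeleteThenAddIndex();
        index.add("1", "old title");
        index.update("1", "new title");
        System.out.println(index.search("old"));   // prints []
        System.out.println(index.search("new"));   // prints [1]
        System.out.println(index.search("title")); // prints [1]
    }
}
```

If the id were matched through an analyzed field instead of an exact value, the delete step could miss the old entry and the "update" would leave a duplicate behind, which is why the demo stores `id` as a `StringField`.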

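Summary

The code passes its JUnit tests.

Of course, there is still plenty in the code that could be optimized, but as a demonstration it is passable.

I hope readers who are struggling with Lucene 6.1.0 can find what they need here.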
