Learning Lucene: CRUD

Creating an index

The previous post covered the overall flow of creating an index with Lucene; here is a brief recap:

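    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
    Directory directory = FSDirectory.open(new File("/tmp/testindex"));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
    IndexWriter iwriter = new IndexWriter(directory, config);
    Document doc = new Document();
    String text = "This is the text to be indexed.";
    doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
    iwriter.addDocument(doc); // hand the document to the writer
    iwriter.close();          // closing the writer commits the changes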

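1 Create a Directory that points to the index directory

2 Create the analyzer, then create the IndexWriter

3 Create Document objects to hold the data

4 Close the IndexWriter, which commits the changes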

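    // Imports used by the methods in this post (assuming the Lucene 4.x package layout)
    import java.io.File;
    import java.util.Date;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    /**
     * Builds the index.
     * Assumes analyzer, directory, indexWriter and INDEX_DIR are static
     * fields of the enclosing class (their declarations are not shown here).
     *
     * @throws Exception
     */
    public static void index() throws Exception {
        String text1 = "hello,man!";
        String text2 = "goodbye,man!";
        String text3 = "hello,woman!";
        String text4 = "goodbye,woman!";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);
        // One document per text: "filename" acts as its name, "content" holds the text
        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));
        indexWriter.addDocument(doc1);
        Document doc2 = new Document();
        doc2.add(new TextField("filename", "text2", Store.YES));
        doc2.add(new TextField("content", text2, Store.YES));
        indexWriter.addDocument(doc2);
        Document doc3 = new Document();
        doc3.add(new TextField("filename", "text3", Store.YES));
        doc3.add(new TextField("content", text3, Store.YES));
        indexWriter.addDocument(doc3);
        Document doc4 = new Document();
        doc4.add(new TextField("filename", "text4", Store.YES));
        doc4.add(new TextField("content", text4, Store.YES));
        indexWriter.addDocument(doc4);
        indexWriter.commit();
        indexWriter.close();
        Date date2 = new Date();
        System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }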
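To make it clearer what "indexing" the four sample texts means, here is a minimal sketch of an inverted index in plain Java. It is a toy model, not Lucene's implementation: the class ToyInvertedIndex and its analyze, addDocument and lookup methods are hypothetical names invented for this illustration, and the tokenizer is only a rough stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model for illustration only; Lucene's real index format is far more involved.
class ToyInvertedIndex {
    // term -> ids of the documents that contain it, in ascending order
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();
    // original text of each document, addressed by document id
    private final List<String> stored = new ArrayList<>();

    // Crude stand-in for an analyzer: lower-case, then split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Stores the text, records each of its tokens, and returns the new document id.
    int addDocument(String text) {
        int docId = stored.size();
        stored.add(text);
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, key -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Ids of the documents containing the given (already analyzed) term.
    List<Integer> lookup(String term) {
        return new ArrayList<>(postings.getOrDefault(term, new TreeSet<>()));
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("hello,man!");
        index.addDocument("goodbye,man!");
        index.addDocument("hello,woman!");
        index.addDocument("goodbye,woman!");
        System.out.println(index.lookup("hello")); // prints [0, 2]
        System.out.println(index.lookup("man"));   // prints [0, 1]
    }
}
```

Note that "man" does not match the documents about "woman": lookups work on whole tokens, which is also how a term lookup behaves in a real inverted index.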

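Adding to an index incrementally

Lucene supports incremental indexing: new documents can be added without affecting what is already indexed, and Lucene merges the index files automatically at an appropriate time.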

    /**
     * Adds one more document to the existing index.
     *
     * @throws Exception
     */
    public static void insert() throws Exception {
        String text5 = "hello,goodbye,man,woman";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        // Opening a writer on an existing index directory appends to it
        indexWriter = new IndexWriter(directory, config);
        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text5", Store.YES));
        doc1.add(new TextField("content", text5, Store.YES));
        indexWriter.addDocument(doc1);
        indexWriter.commit();
        indexWriter.close();
        Date date2 = new Date();
        System.out.println("Adding to the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

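Deleting from the index

Deletion also goes through IndexWriter, via its deleteDocuments method. Given a term, it deletes every document that contains that term. If you only want to delete a single document, it is best to define a unique ID field and delete by that field.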

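    /**
     * Deletes documents from the index.
     *
     * @param str the "filename" term whose documents should be deleted
     * @throws Exception
     */
    public static void delete(String str) throws Exception {
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);
        // A Term is not analyzed, so str must equal an indexed token exactly;
        // StandardAnalyzer lower-cased the "filename" values at index time.
        indexWriter.deleteDocuments(new Term("filename", str));
        // close() also commits the pending delete
        indexWriter.close();
        Date date2 = new Date();
        System.out.println("Deleting from the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }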
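The difference between deleting by a content keyword and deleting by a unique ID can be sketched in plain Java. This is a toy model of the effect only, not of Lucene's internals; ToyDeleteDemo, deleteByContentTerm, deleteById and sample are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model for illustration only; it mimics the effect of a term delete.
class ToyDeleteDemo {
    // One entry per document: {id, content}.
    private final List<String[]> docs = new ArrayList<>();

    void add(String id, String content) {
        docs.add(new String[] {id, content});
    }

    // Deletes every document whose content contains the term as a whole token.
    int deleteByContentTerm(String term) {
        int before = docs.size();
        docs.removeIf(doc -> Arrays.asList(
                doc[1].toLowerCase(Locale.ROOT).split("[^a-z0-9]+")).contains(term));
        return before - docs.size();
    }

    // Deletes the document(s) whose id equals the given value exactly.
    int deleteById(String id) {
        int before = docs.size();
        docs.removeIf(doc -> doc[0].equals(id));
        return before - docs.size();
    }

    int size() {
        return docs.size();
    }

    // The four sample texts from the indexing example.
    static ToyDeleteDemo sample() {
        ToyDeleteDemo demo = new ToyDeleteDemo();
        demo.add("text1", "hello,man!");
        demo.add("text2", "goodbye,man!");
        demo.add("text3", "hello,woman!");
        demo.add("text4", "goodbye,woman!");
        return demo;
    }

    public static void main(String[] args) {
        System.out.println(sample().deleteByContentTerm("hello")); // prints 2
        System.out.println(sample().deleteById("text2"));          // prints 1
    }
}
```

Deleting by the keyword "hello" removes two documents at once, while deleting by the ID removes exactly one, which is why a unique ID field is the safer handle for single-document deletes.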

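Updating the index

Lucene has no true in-place update. You can update the document(s) matching a term in some field, but under the hood it deletes the old document and then indexes the new one.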

    /**
     * Updates a document in the index.
     *
     * @throws Exception
     */
    public static void update() throws Exception {
        String text1 = "update,hello,man!";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);
        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));
        // Deletes every document whose "filename" contains the term "text1", then adds doc1
        indexWriter.updateDocument(new Term("filename", "text1"), doc1);
        // close() also commits the pending update
        indexWriter.close();
        Date date2 = new Date();
        System.out.println("Updating the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
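The "delete, then add" nature of an update can be illustrated with a small plain-Java sketch. It is a toy model rather than Lucene code; ToyUpdateDemo and its methods are hypothetical names, and the ever-increasing document number is a simplification chosen to make the replacement visible.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model for illustration only; it shows update as delete-then-add.
class ToyUpdateDemo {
    // internal document number -> {filename, content}; the toy never reuses numbers
    private final Map<Integer, String[]> live = new LinkedHashMap<>();
    private int nextDocId = 0;

    // Adds a document and returns its internal number.
    int addDocument(String filename, String content) {
        int docId = nextDocId++;
        live.put(docId, new String[] {filename, content});
        return docId;
    }

    // Update = delete every document with this filename, then add the replacement.
    int updateDocument(String filename, String newContent) {
        live.values().removeIf(doc -> doc[0].equals(filename));
        return addDocument(filename, newContent);
    }

    // Content of a live document, or null if that number was deleted or never used.
    String content(int docId) {
        String[] doc = live.get(docId);
        return doc == null ? null : doc[1];
    }

    public static void main(String[] args) {
        ToyUpdateDemo index = new ToyUpdateDemo();
        int oldId = index.addDocument("text1", "hello,man!");
        index.addDocument("text2", "goodbye,man!");
        int newId = index.updateDocument("text1", "update,hello,man!");
        System.out.println(newId);                // prints 2
        System.out.println(index.content(oldId)); // prints null
        System.out.println(index.content(newId)); // prints update,hello,man!
    }
}
```

The old entry is gone and the replacement lives under a new internal number, so code should not hold on to internal document numbers across updates.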

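Searching the index by keyword

Lucene offers many kinds of queries, which are not covered in detail here. A search returns a collection of ScoreDoc objects, somewhat like a ResultSet; for each hit we can load the document and read the stored values we want by field name.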

    /**
     * Keyword search.
     *
     * @param str the query string, parsed against the "content" field
     * @throws Exception
     */
    public static void search(String str) throws Exception {
        directory = FSDirectory.open(new File(INDEX_DIR));
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        DirectoryReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        // "content" is the default field for terms in the query string
        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content", analyzer);
        Query query = parser.parse(str);
        // No filter, at most 1000 hits
        ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            // Load the stored document for this hit, then read fields by name
            Document hitDoc = isearcher.doc(hits[i].doc);
            System.out.println(hitDoc.get("filename"));
            System.out.println(hitDoc.get("content"));
        }
        ireader.close();
        directory.close();
    }
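The two-step pattern in search(), first a ranked list of hits and then a lookup of stored fields by name, can be sketched in plain Java. This is a toy model: ToySearchDemo and Hit are hypothetical names, the sample texts are made up for the illustration, and the score is just a raw term count, which is much simpler than Lucene's actual scoring.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model for illustration only; scoring here is a plain term count.
class ToySearchDemo {
    // Plays the role of a ScoreDoc: a document number plus its score.
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Stored fields of each document, addressed by document number.
    private final List<Map<String, String>> stored = new ArrayList<>();

    void add(String filename, String content) {
        Map<String, String> doc = new HashMap<>();
        doc.put("filename", filename);
        doc.put("content", content);
        stored.add(doc);
    }

    // Returns up to n hits for the term, highest score first.
    List<Hit> search(String term, int n) {
        List<Hit> hits = new ArrayList<>();
        for (int doc = 0; doc < stored.size(); doc++) {
            String content = stored.get(doc).get("content").toLowerCase(Locale.ROOT);
            int freq = 0;
            for (String token : content.split("[^a-z0-9]+")) {
                if (token.equals(term)) {
                    freq++;
                }
            }
            if (freq > 0) {
                hits.add(new Hit(doc, freq));
            }
        }
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits.subList(0, Math.min(n, hits.size()));
    }

    // Reads one stored field of a document by name.
    String get(int doc, String field) {
        return stored.get(doc).get(field);
    }

    public static void main(String[] args) {
        ToySearchDemo index = new ToySearchDemo();
        index.add("text1", "hello,man!");
        index.add("text2", "goodbye,man!");
        index.add("sample", "hello,hello,woman");
        for (Hit hit : index.search("hello", 1000)) {
            System.out.println(index.get(hit.doc, "filename")); // prints sample, then text1
        }
    }
}
```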
