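2. Lucene 3.6.2: package overview, a first Lucene example, the lukeall tool for inspecting index information, index contents as shown in Luke, and the index lookup process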

2014-12-07 23:39

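1 Lucene directory overview

2 lucene-core-3.6.2.jar is the core jar for Lucene development.

The contrib directory holds a number of extension jars.

3 Example

Create the first Lucene project: lucene3_day1

(1) First convert the data into a Document object. Each piece of data becomes a Field(String name, String value, Field.Store store, Field.Index index).

(2) Specify the index location:

    Directory directory = FSDirectory.open(new File("index")); // the "index" directory under the current working directory

(3) Create the analyzer (tokenizer):

    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);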

(4) Write the index:

    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
    IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
    // Write the document into the index
    indexWriter.addDocument(document);
    // Close the index writer
    indexWriter.close();

Writing the example:

Example directory layout:

Article.java

    package cn.toto.lucene.quickstart;

    /** Plain data object whose fields are written into the index. */
    public class Article {

        private int id;
        private String title;
        private String content;

        /** @return the id */
        public int getId() {
            return id;
        }

        /** @param id the id to set */
        public void setId(int id) {
            this.id = id;
        }

        /** @return the title */
        public String getTitle() {
            return title;
        }

        /** @param title the title to set */
        public void setTitle(String title) {
            this.title = title;
        }

        /** @return the content */
        public String getContent() {
            return content;
        }

        /** @param content the content to set */
        public void setContent(String content) {
            this.content = content;
        }
    }

LuceneTest.java

    package cn.toto.lucene.quickstart;

    import java.io.File;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.Field.Index;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;
    import org.junit.Test;

    /**
     * @brief LuceneTest.java Example test case for Lucene
     * @attention
     * @author toto-pc
     * @date 2014-12-7
     * @note begin modify by 涂作权 2014/12/07 null
     */
    public class LuceneTest {

        @Test
        public void buildIndex() throws Exception {
            Article article = new Article();
            article.setId(100);
            // Sample data is kept in Chinese; the title means "Lucene Quick Start"
            article.setTitle("Lucene快速入门");
            article.setContent("Lucene是提供了一个简单却强大的应用程式接口，"
                    + "能够做全文检索索引和搜寻，在Java开发环境里Lucene是"
                    + "一个成熟的免费的开放源代码工具。");

            // Convert the data to be indexed into a Document object (required by Lucene)
            Document document = new Document();
            document.add(new Field("id",   // field name
                    article.getId() + "",  // field value
                    Store.YES,             // store the original value so it can be read back
                    Index.ANALYZED));      // index the value after running it through the analyzer
            document.add(new Field("title", article.getTitle(), Store.YES, Index.ANALYZED));
            document.add(new Field("content", article.getContent(), Store.YES, Index.ANALYZED));

            // Build the index
            // Index location: the "index" directory under the current working directory
            Directory directory = FSDirectory.open(new File("index"));

            // Analyzer (tokenizer)
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

            // Write the index
            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
            IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

            // Write the document into the index
            indexWriter.addDocument(document);

            // Close the index writer
            indexWriter.close();
        }
    }

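Result after running the unit test:

Contents of the index directory after the run:

4 The contents of the index can be inspected with the Luke tool (it is a jar file).

Download: http://code.google.com/p/luke/

How to open it:

If it cannot be opened this way, launch it from the command line: go to the directory containing the jar, hold Shift and right-click, choose "Open command window here", then enter the command: java -jar lukeall-3.5.0.jar

Screenshot of the tool:

Result after clicking OK:

The Overview tab shows the index information, and the Documents tab shows the document object information.

5 Search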

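The search code, added to the same LuceneTest class as the indexing test above, is as follows:

    // Additional imports needed by this method:
    // import org.apache.lucene.index.IndexReader;
    // import org.apache.lucene.queryParser.QueryParser;
    // import org.apache.lucene.search.IndexSearcher;
    // import org.apache.lucene.search.Query;
    // import org.apache.lucene.search.ScoreDoc;
    // import org.apache.lucene.search.TopDocs;

    @Test
    public void searchIndex() throws Exception {
        // Build the Query object: search by title
        String queryString = "Lucene";
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

        // First argument: version; second: default field; third: analyzer
        QueryParser queryParser = new QueryParser(Version.LUCENE_36, "title", analyzer);
        Query query = queryParser.parse(queryString);

        // Search with the Query
        // Index location
        Directory directory = FSDirectory.open(new File("index"));
        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        // Return at most the top 100 hits
        TopDocs topDocs = indexSearcher.search(query, 100);
        System.out.println("Number of matching records: " + topDocs.totalHits);

        // Fetch the results
        ScoreDoc[] scoreDocs = topDocs.scoreDocs;
        for (int i = 0; i < scoreDocs.length; i++) {
            // First get the document number
            int docID = scoreDocs[i].doc;
            Document document = indexSearcher.doc(docID);
            System.out.println("id:" + document.get("id"));
            System.out.println("title:" + document.get("title"));
            System.out.println("content:" + document.get("content"));
        }

        indexSearcher.close();
        // The searcher does not close a reader that was passed in, so close it explicitly
        indexReader.close();
    }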

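Run result:

Index contents as shown in Luke:

The index holds two main parts:

A Index term information

B Document object information

Every Field has a Store setting and an Index setting.

What is the relationship between the index content and the Document content?

When searching, the index term information is used to find the document object information.

The index lookup process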
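The two-part structure and the lookup process described above can be sketched with a toy inverted index in plain Java. This is only an illustration under simplifying assumptions, not how Lucene is implemented: the class `ToyInvertedIndex`, its methods, and the whitespace "analyzer" are all hypothetical names made up for this sketch. The `stored` and `indexed` flags loosely mirror the role of `Field.Store` and `Field.Index`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy model (hypothetical, not Lucene code): part A maps terms to document numbers, part B keeps stored fields. */
public class ToyInvertedIndex {

    // Part A: index term information ("field:term" -> sorted document numbers).
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Part B: document object information (document number -> stored field values).
    private final List<Map<String, String>> storedDocs = new ArrayList<>();

    /** Starts a new document and returns its document number. */
    public int newDocument() {
        storedDocs.add(new LinkedHashMap<>());
        return storedDocs.size() - 1;
    }

    /** Adds one field; "stored" keeps the value readable, "indexed" makes it searchable. */
    public void addField(int docId, String name, String value, boolean stored, boolean indexed) {
        if (stored) {
            storedDocs.get(docId).put(name, value);
        }
        if (indexed) {
            // Naive stand-in for an analyzer: lower-case and split on whitespace.
            for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    postings.computeIfAbsent(name + ":" + token, k -> new TreeSet<>()).add(docId);
                }
            }
        }
    }

    /** Lookup step 1: use the term information to find matching document numbers. */
    public List<Integer> search(String field, String term) {
        SortedSet<Integer> hits = postings.get(field + ":" + term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    /** Lookup step 2: use a document number to load the stored fields. */
    public Map<String, String> doc(int docId) {
        return storedDocs.get(docId);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        int docId = index.newDocument();
        index.addField(docId, "id", "100", true, false);                   // stored only
        index.addField(docId, "title", "Lucene quick start", true, true);  // stored and indexed
        index.addField(docId, "content", "full text search", false, true); // indexed only

        for (int hit : index.search("title", "lucene")) {
            System.out.println(index.doc(hit));
        }
    }
}
```

In this sketch a search on `title` finds the document through part A, and the stored fields then come back from part B. The `content` field can be searched but is not returned because it was not stored, while `id` is returned but cannot be searched because it was not indexed.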
