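lucene3.0: Sorting with IndexSearcher

Series index:

lucene3.0_基础使用及注意事项汇总 (Lucene 3.0: basic usage and notes, summary)

This article covers:

1. The sorting-related methods of IndexSearcher and the Sort and SortField classes (API level);

2. Sorting by document score;

3. Sorting by internal document id;

4. Notes on sorting numeric and date values;

5. Sorting on multiple fields;

6. Changing a document's score by changing its boost.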

----------------------------------------------------------------------

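1. The sorting-related methods of IndexSearcher and the Sort and SortField classes (API level)

The low-level method behind sorted searches in IndexSearcher is:

abstract TopFieldDocs search(Weight weight, Filter filter, int n, Sort sort)
Expert: Low-level search implementation with arbitrary sorting.

Everyday code normally calls the overload that takes a Query in place of the Weight, search(Query query, Filter filter, int n, Sort sort). Either way, the only sorting-specific argument is a Sort instance.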

Constructor Summary
`Sort()` Sorts by computed relevance.
`Sort(SortField... fields)` Sorts in succession by the criteria in each SortField.
`Sort(SortField field)` Sorts by the criteria in the given SortField.

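Within a Sort instance, the SortField decides which field to sort on, which data type to compare the values as, and whether the order is ascending or descending.

Its two most basic constructors are: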

SortField(String field, int type)
Creates a sort by terms in the given field with the type of term values explicitly given.

SortField(String field, int type, boolean reverse)
Creates a sort, possibly in reverse, by terms in the given field with the type of term values explicitly given.

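With these classes, sorting search results is straightforward.

A simple example:

// searcher is an open IndexSearcher and query is the Query to run

SortField sortF = new SortField("f", SortField.INT);

Sort sort = new Sort(sortF);

TopFieldDocs docs = searcher.search(query, null, 10, sort);

// iterate over the results in docs

for (ScoreDoc sd : docs.scoreDocs) System.out.println(searcher.doc(sd.doc));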

2. Sorting by document score

By default, IndexSearcher already returns results sorted by document score.

To request this explicitly, set the SortField type to SCORE.

static int SCORE
Sort by document score (relevancy).

3. Sorting by internal document id

Every document is assigned an id number when it enters the index. If you need to sort by that id, set the SortField type to DOC.

static int DOC
Sort by document number (index order).

4. Notes on sorting numeric and date values

Suppose an index contains five documents, which come back in the following default order:

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

Note that field f holds an int value and field f1 holds an 8-digit date. In the default order neither field is sorted.

Sorting on field f with type INT:

Result:

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

This matches the expected result.

Sorting on field f with type STRING:

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

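The fifth document (f:5) is out of place: compared as strings, "5" sorts after "10", so the result is not what we expect.

So pay close attention to the type you choose when sorting.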
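The difference between the two results can be reproduced without Lucene. The following is a minimal sketch in plain Java, assuming the five values of field f from the example; the class and method names (SortTypeDemo, sortAsStrings, sortAsInts) are made up for illustration and are not part of the Lucene API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortTypeDemo {
    // Character-by-character order, analogous to sorting the field as STRING.
    static List<String> sortAsStrings(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    // Numeric order, analogous to sorting the field as INT.
    static List<String> sortAsInts(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)));
        return copy;
    }

    public static void main(String[] args) {
        // Values of field f in the default order shown above.
        List<String> f = List.of("10", "10", "5", "-2", "0");
        System.out.println(sortAsStrings(f)); // [-2, 0, 10, 10, 5]
        System.out.println(sortAsInts(f));    // [-2, 0, 5, 10, 10]
    }
}
```

In the string comparison "10" precedes "5" because the character '1' is smaller than '5', which is the misplacement seen in the STRING result.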

Sorting on field f1 with type INT:

Result:

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

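This matches the expected result.

Note:

When sorting data such as dates or prices, choose a suitable sort type. This is not only a matter of meeting the business requirement: sorting with a numeric type such as INT or FLOAT is also more efficient than sorting with STRING.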
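One way to obtain an 8-digit date value like the ones in field f1 is sketched below. It assumes modern Java (java.time) rather than the Java versions current when Lucene 3.0 was released, and the class and method names (DateKeyDemo, toSortKey) are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DateKeyDemo {
    // Formats a date as yyyyMMdd and parses it as an int, e.g. 2010-02-15 -> 20100215.
    // Such values fit comfortably in an int, and numeric order equals chronological order.
    static int toSortKey(LocalDate date) {
        return Integer.parseInt(date.format(DateTimeFormatter.BASIC_ISO_DATE));
    }

    public static void main(String[] args) {
        int earlier = toSortKey(LocalDate.of(2009, 5, 12));
        int later = toSortKey(LocalDate.of(2010, 2, 15));
        System.out.println(earlier);         // 20090512
        System.out.println(later);           // 20100215
        System.out.println(earlier < later); // true
    }
}
```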

5. Sorting on multiple fields

Example code:

SortField sortF = new SortField("f", SortField.INT);

SortField sortF2 = new SortField("f1", SortField.INT);

Sort sort = new Sort(new SortField[]{sortF, sortF2});

TopFieldDocs docs = searcher.search(query, null, 10, sort);

Result:

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

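Note:

Results are sorted by field f first; documents with equal f values are then sorted by field f1.

This priority follows the order of the SortField instances in the SortField array.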
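The same "first key, then tie-breaker" rule can be modelled outside Lucene with a chained comparator. This is only an analogy in plain Java, using the f and f1 values from the example; the Row class and sortByFThenF1 method are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MultiFieldSortDemo {
    // Stand-in for a document that carries only the two sort fields.
    static final class Row {
        final int f;
        final int f1;

        Row(int f, int f1) {
            this.f = f;
            this.f1 = f1;
        }

        @Override
        public String toString() {
            return "f:" + f + " f1:" + f1;
        }
    }

    // Compare by f first; f1 is consulted only when two rows have equal f.
    static List<Row> sortByFThenF1(List<Row> rows) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparingInt((Row r) -> r.f).thenComparingInt(r -> r.f1));
        return copy;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(10, 20100215), new Row(10, 20090512), new Row(5, 20101019),
                new Row(-2, 20000128), new Row(0, 20050719));
        // Prints the rows in the same order as the result listed above.
        for (Row r : sortByFThenF1(rows)) {
            System.out.println(r);
        }
    }
}
```

Swapping the two comparator keys would make f1 the primary key, just as swapping the two SortField instances would.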

6. Changing a document's score by changing its boost

Default (relevance) sort, original order:

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

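Change the boost of the fifth document. The boost takes effect at index time, so set it before the document is added to the index (or re-index the document afterwards).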

doc5.setBoost(5);

Then look at the order again:

Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>

Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>

Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>

Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>

The document has jumped from last place to first!

This feature has considerable commercial value, to say the least...
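Why a boost of 5 is enough to move the document to the top can be seen in a deliberately simplified model, where the final score is just a base score multiplied by the boost. This is an assumption made for illustration and not Lucene's actual scoring formula, which depends on further factors; the Hit class and rank method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BoostDemo {
    // Toy hit: score = baseScore * boost (a simplification, not Lucene's formula).
    static final class Hit {
        final String name;
        final float baseScore;
        final float boost;

        Hit(String name, float baseScore, float boost) {
            this.name = name;
            this.baseScore = baseScore;
            this.boost = boost;
        }

        float score() {
            return baseScore * boost;
        }
    }

    // Highest score first; the sort is stable, so ties keep their original order.
    static List<String> rank(List<Hit> hits) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort((a, b) -> Float.compare(b.score(), a.score()));
        List<String> names = new ArrayList<>();
        for (Hit h : copy) {
            names.add(h.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // All five documents match a:fox equally; only doc5 has its boost raised to 5.
        List<Hit> hits = Arrays.asList(
                new Hit("doc1", 1f, 1f), new Hit("doc2", 1f, 1f), new Hit("doc3", 1f, 1f),
                new Hit("doc4", 1f, 1f), new Hit("doc5", 1f, 5f));
        System.out.println(rank(hits)); // [doc5, doc1, doc2, doc3, doc4]
    }
}
```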
