Lucene IndexOptions can be set to DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS or DOCS, and ES supports the same setting
org.apache.lucene.index
Enum constants and descriptions:
DOCS_AND_FREQS: Only documents and term frequencies are indexed: positions are omitted.
DOCS_AND_FREQS_AND_POSITIONS: Indexes documents, frequencies and positions.
DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS: Indexes documents, frequencies, positions and offsets.
DOCS_ONLY: Only documents are indexed: term frequencies and positions are omitted.
Java Code Examples for org.apache.lucene.index.IndexOptions
Project: languagetool File: EmptyLuceneIndexCreator.java
public static void main(String[] args) throws IOException {
  if (args.length != 1) {
    System.out.println("Usage: " + EmptyLuceneIndexCreator.class.getSimpleName() + " <indexPath>");
    System.exit(1);
  }
  Analyzer analyzer = new StandardAnalyzer();
  IndexWriterConfig config = new IndexWriterConfig(analyzer);
  Directory directory = FSDirectory.open(new File(args[0]).toPath());
  IndexWriter writer = new IndexWriter(directory, config);
  // Index only document IDs for this field: no frequencies, positions or offsets.
  FieldType fieldType = new FieldType();
  fieldType.setIndexOptions(IndexOptions.DOCS);
  fieldType.setStored(true);
  Field countField = new Field("totalTokenCount", String.valueOf(0), fieldType);
  Document doc = new Document();
  doc.add(countField);
  writer.addDocument(doc);
  writer.close();
}
index_options are "options" for the index you are searching on, a
data structure that maps "terms" to document lists (posting lists).
Term vectors are a data structure that gives you the "terms" for a given
document and, in addition, their positions in the document as well as their
start and end character offsets. The index (each field has such an
index) holds a sorted list of terms, and each term points to a posting list.
A posting list is the list of documents that contain the term. On the
posting list you can also store information like frequencies (how often did
term Y occur in document X, which is useful for scoring) as well as
"positions" (at which position did term Y occur in document X, which is
required for phrase and span queries).
If, for instance, you have a field that you only use for filtering, you don't
need freqs and positions, so documents only will do the job. In an index the
position information is usually the biggest piece of data aside from stored
fields. If you don't run phrase or span queries you don't need positions at
all, so save the disk space and improve performance by indexing only docs and
freqs. In previous versions it wasn't possible to have freqs without positions
(index_options supersedes omit_term_frequencies_and_positions), so this is an
improvement overall, since the most common use case might only need freqs but
no positions.
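The posting-list description above can be sketched with a toy inverted index in plain Java. This is only an illustration, not how Lucene stores its postings: the class ToyInvertedIndex, its Level enum and the whitespace tokenizer are all hypothetical names made up for this example. It shows what each level keeps per term and why a phrase check cannot work without positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy inverted index (hypothetical, not Lucene's implementation). */
class ToyInvertedIndex {
    /** Hypothetical levels mirroring docs / freqs / positions. */
    enum Level { DOCS, FREQS, POSITIONS }

    /** One posting-list entry: a document plus optional per-document data. */
    static final class Posting {
        final int docId;
        int freq;                                            // filled for FREQS and POSITIONS
        final List<Integer> positions = new ArrayList<>();   // filled for POSITIONS only
        Posting(int docId) { this.docId = docId; }
    }

    private final Level level;
    // Sorted term dictionary; each term points to its posting list.
    private final Map<String, List<Posting>> terms = new TreeMap<>();

    ToyInvertedIndex(Level level) { this.level = level; }

    /** Adds one document; a document's tokens are assumed to arrive in a single call. */
    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].isEmpty()) continue;
            List<Posting> postings = terms.computeIfAbsent(tokens[pos], t -> new ArrayList<>());
            Posting last = postings.isEmpty() ? null : postings.get(postings.size() - 1);
            if (last == null || last.docId != docId) {
                last = new Posting(docId);
                postings.add(last);
            }
            if (level != Level.DOCS) last.freq++;
            if (level == Level.POSITIONS) last.positions.add(pos);
        }
    }

    /** Returns the posting list for a term (empty if the term is unknown). */
    List<Posting> postings(String term) {
        return terms.getOrDefault(term, new ArrayList<>());
    }

    /** Two-term phrase check: b must occur directly after a in the given document. */
    boolean phraseMatches(int docId, String a, String b) {
        if (level != Level.POSITIONS) {
            throw new IllegalStateException("phrase queries need positions");
        }
        for (Posting pa : postings(a)) {
            if (pa.docId != docId) continue;
            for (Posting pb : postings(b)) {
                if (pb.docId != docId) continue;
                for (int p : pa.positions) {
                    if (pb.positions.contains(p + 1)) return true;
                }
            }
        }
        return false;
    }
}
```

With Level.DOCS the posting list answers only "which documents contain this term", which is enough for filtering. Level.FREQS adds the per-document count used for scoring, and only Level.POSITIONS can answer the phrase check.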
1: term_vector
TermVector.YES: Store only the number of occurrences.
TermVector.WITH_POSITIONS: Store the number of occurrences and the positions of terms, but no offsets.
TermVector.WITH_OFFSETS: Store the number of occurrences and the offsets of terms, but no positions.
TermVector.WITH_POSITIONS_OFFSETS: Store the number of occurrences, the positions and the offsets of terms.
TermVector.NO: Don't store any term vector information.
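To make the term vector options concrete, here is a minimal plain-Java sketch of a per-document term vector that records occurrences, positions and character offsets, roughly what WITH_POSITIONS_OFFSETS keeps. ToyTermVector, Entry and the whitespace tokenization are assumptions for illustration only; Lucene's real term vector format and analyzers work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy per-document term vector (hypothetical, not Lucene's format). */
class ToyTermVector {
    /** What is recorded for one term inside a single document. */
    static final class Entry {
        int freq;                                            // number of occurrences
        final List<Integer> positions = new ArrayList<>();   // token positions
        final List<int[]> offsets = new ArrayList<>();       // {startChar, endChar} pairs
    }

    /** Builds the term vector of one document by splitting on whitespace. */
    static Map<String, Entry> build(String text) {
        Map<String, Entry> vector = new TreeMap<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            Entry e = vector.computeIfAbsent(text.substring(start, i), t -> new Entry());
            e.freq++;
            e.positions.add(position++);
            e.offsets.add(new int[] {start, i});   // end offset is exclusive
        }
        return vector;
    }
}
```

Dropping the offsets list corresponds to WITH_POSITIONS, dropping the positions list to WITH_OFFSETS, and keeping only freq to YES.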
2: index_options
Sets the indexing options. Possible values are docs (only doc numbers are indexed), freqs (doc numbers and term frequencies), positions (doc numbers, term frequencies and positions) and offsets (doc numbers, term frequencies, positions and offsets). Defaults to positions for analyzed fields and to docs for not_analyzed fields.
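The four ES values line up one-to-one with the Lucene enum constants listed at the top. The sketch below spells out that correspondence as a small lookup; IndexOptionsMapping and its methods are hypothetical helpers written for this post, not part of Elasticsearch or Lucene. It returns constant names as strings so that it runs without Lucene on the classpath, and it uses DOCS (the name used in the code example above) where the quoted Javadoc says DOCS_ONLY.

```java
import java.util.Locale;

/** Hypothetical lookup from ES index_options values to Lucene IndexOptions constant names. */
class IndexOptionsMapping {
    /** Maps an index_options value to the matching Lucene constant name. */
    static String toLuceneName(String esValue) {
        switch (esValue.toLowerCase(Locale.ROOT)) {
            case "docs":
                return "DOCS"; // DOCS_ONLY in the older Javadoc quoted above
            case "freqs":
                return "DOCS_AND_FREQS";
            case "positions":
                return "DOCS_AND_FREQS_AND_POSITIONS";
            case "offsets":
                return "DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS";
            default:
                throw new IllegalArgumentException("unknown index_options: " + esValue);
        }
    }

    /** Default described above: positions for analyzed fields, docs for not_analyzed fields. */
    static String defaultFor(boolean analyzed) {
        return analyzed ? "positions" : "docs";
    }
}
```

For example, toLuceneName(defaultFor(false)) gives "DOCS", the same option the EmptyLuceneIndexCreator snippet sets by hand.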