In Lucene 3.0, the latest version at the time of writing, the Field options are as follows:

Field options for indexing
Index.ANALYZED – use the analyzer to break the Field’s value into a stream of separate tokens and make each token searchable.
Index.NOT_ANALYZED – do index the field, but do not analyze the String. Instead, treat the Field’s entire value as a single token and make that token searchable. 
Index.ANALYZED_NO_NORMS – an advanced variant of Index.ANALYZED which does not store norms information in the index. 
Index.NOT_ANALYZED_NO_NORMS – just like Index.NOT_ANALYZED, but also do not store norms.
Index.NO – don’t make this field’s value available for searching at all.
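To make the difference between Index.ANALYZED and Index.NOT_ANALYZED concrete, here is a minimal sketch in plain Java that does not use Lucene itself. The class IndexModeSketch and its methods are hypothetical, and the lowercase-and-split step is only a stand-in for a real analyzer, whose output depends on which Analyzer you configure.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: these are not Lucene classes.
final class IndexModeSketch {

    // Stand-in for an analyzer: lowercases the value and splits it
    // on runs of non-alphanumeric characters, one token per word.
    static List<String> analyzed(String value) {
        List<String> tokens = new ArrayList<String>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // NOT_ANALYZED: the entire value becomes one searchable token, unchanged.
    static List<String> notAnalyzed(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        String value = "Lucene in Action";
        System.out.println(analyzed(value));
        System.out.println(notAnalyzed(value));
    }
}
```

For the value "Lucene in Action", the first call yields three tokens (lucene, in, action) and the second yields the single token "Lucene in Action". A search for the term "lucene" would therefore match only the analyzed form, while the unanalyzed form matches only the exact full string.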

Field options for storing fields
Store.YES – store the value. When the value is stored, the original String in its entirety is recorded in the index and may be retrieved by an IndexReader.
Store.NO – do not store the value. This is often used along with Index.ANALYZED to index a large text field that doesn’t need to be retrieved in its original form.

Field options for term vectors
TermVector.YES – record the unique terms that occurred, and their counts, in each document, but do not store any positions or offsets information.
TermVector.WITH_POSITIONS – record the unique terms and their counts, and also the positions of each occurrence of every term, but no offsets.
TermVector.WITH_OFFSETS – record the unique terms and their counts, with the offsets (start & end character position) of each occurrence of every term, but no positions.
TermVector.WITH_POSITIONS_OFFSETS – store unique terms and their counts, along with positions and offsets.
TermVector.NO – do not store any term vector information.
If Index.NO is specified for a field, then you must also specify TermVector.NO.
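The following plain-Java sketch shows what kind of data each TermVector setting refers to for a single field value. It is not Lucene code: TermVectorSketch, TermInfo, and the whitespace tokenization are assumptions made for illustration. Positions count tokens (0, 1, 2, ...), while offsets are the start and end character indexes in the original string, with the end index being one past the last character.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: these are not Lucene classes.
final class TermVectorSketch {

    static final class TermInfo {
        int count;                                         // what TermVector.YES records
        final List<Integer> positions = new ArrayList<>(); // added by WITH_POSITIONS
        final List<int[]> offsets = new ArrayList<>();     // added by WITH_OFFSETS: {start, end}
    }

    // Builds the per-term data for one field value, splitting on whitespace.
    static Map<String, TermInfo> build(String text) {
        Map<String, TermInfo> vector = new LinkedHashMap<>();
        Matcher m = Pattern.compile("\\S+").matcher(text);
        int position = 0;
        while (m.find()) {
            String term = m.group();
            TermInfo info = vector.get(term);
            if (info == null) {
                info = new TermInfo();
                vector.put(term, info);
            }
            info.count++;
            info.positions.add(position++);
            info.offsets.add(new int[] { m.start(), m.end() });
        }
        return vector;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, TermInfo> e : build("to be or not to be").entrySet()) {
            TermInfo info = e.getValue();
            System.out.println(e.getKey() + " count=" + info.count
                    + " positions=" + info.positions);
        }
    }
}
```

In this model, TermVector.YES corresponds to keeping only the terms and their counts, WITH_POSITIONS and WITH_OFFSETS each add one of the two lists, and WITH_POSITIONS_OFFSETS keeps both.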

Some examples of how these options are used:
Index           Store   TermVector                Example usage
NOT_ANALYZED    YES     NO                        Identifiers (file names, primary keys), telephone and Social Security numbers, URLs, personal names, dates
ANALYZED        YES     WITH_POSITIONS_OFFSETS    Document title, document abstract
ANALYZED        NO      WITH_POSITIONS_OFFSETS    Document body
NO              YES     NO                        Document type, database primary key
NOT_ANALYZED    NO      NO                        Hidden keywords

When Lucene builds the inverted index, by default it stores all necessary information to implement the Vector Space model. This model requires the count of every term that occurred in the document, as well as the positions of each occurrence (needed for phrase searches).
You can tell Lucene to skip indexing the term frequency and positions for a field by calling the following method on the Field instance:
Field.setOmitTermFreqAndPositions(true)
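As a rough picture of what this setting removes, the sketch below builds a toy inverted index in plain Java. It is not Lucene's implementation: PostingsSketch and its methods are hypothetical, and documents are assumed to be words separated by single spaces. With full postings, a two-word phrase check can compare positions; with document IDs only, you can still tell which documents contain a term, but not how often or where.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only: this is not how Lucene stores postings.
final class PostingsSketch {

    // Full postings: term -> (document ID -> positions of the term in that document).
    static Map<String, Map<Integer, List<Integer>>> index(String[] docs) {
        Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();
        for (int docId = 0; docId < docs.length; docId++) {
            String[] terms = docs[docId].split(" ");
            for (int pos = 0; pos < terms.length; pos++) {
                Map<Integer, List<Integer>> perDoc = postings.get(terms[pos]);
                if (perDoc == null) {
                    perDoc = new TreeMap<>();
                    postings.put(terms[pos], perDoc);
                }
                List<Integer> positions = perDoc.get(docId);
                if (positions == null) {
                    positions = new ArrayList<>();
                    perDoc.put(docId, positions);
                }
                positions.add(pos);
            }
        }
        return postings;
    }

    // What is left once frequencies and positions are dropped: term -> document IDs.
    static Map<String, Set<Integer>> docIdsOnly(Map<String, Map<Integer, List<Integer>>> full) {
        Map<String, Set<Integer>> slim = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, List<Integer>>> e : full.entrySet()) {
            slim.put(e.getKey(), new TreeSet<>(e.getValue().keySet()));
        }
        return slim;
    }

    // A two-term phrase matches when 'second' occurs right after 'first'; this needs positions.
    static boolean containsPhrase(Map<String, Map<Integer, List<Integer>>> full,
                                  int docId, String first, String second) {
        Map<Integer, List<Integer>> a = full.get(first);
        Map<Integer, List<Integer>> b = full.get(second);
        if (a == null || b == null || !a.containsKey(docId) || !b.containsKey(docId)) {
            return false;
        }
        for (int pos : a.get(docId)) {
            if (b.get(docId).contains(pos + 1)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] docs = { "quick brown fox", "brown quick fox" };
        Map<String, Map<Integer, List<Integer>>> full = index(docs);
        System.out.println(containsPhrase(full, 0, "quick", "brown"));
        System.out.println(containsPhrase(full, 1, "quick", "brown"));
        System.out.println(docIdsOnly(full).get("quick"));
    }
}
```

Both documents contain "quick" and "brown", so the slimmed-down index cannot distinguish them for the phrase "quick brown"; only the positional data can. This is the trade-off behind the setting: a smaller index in exchange for giving up frequency-based scoring detail and position-dependent matching on that field.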

Excerpted from: http://www.cnblogs.com/fxjwind/archive/2011/07/04/2097705.html
