Understanding the schema.xml (managed-schema) file in Solr

Example managed-schema file from Solr 7.2.1

<uniqueKey>id</uniqueKey>

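The unique key field. <uniqueKey> names the field that Solr uses as each document's unique identifier, which prevents duplicate index entries: adding a document whose key already exists replaces the earlier document.
Any field whose value is unique and never changes can serve as the primary key of the Solr index documents.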
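As a rough illustration of that overwrite behaviour, the sketch below uses Solr's XML update format to add two documents with the same id. The id value doc-1 and the field title_s are made up for this example (title_s would be matched by the *_s dynamic field in this schema); with default settings the second document should replace the first rather than create a duplicate.

```xml
<!-- Hypothetical update message; "doc-1" and title_s are illustrative values -->
<add>
  <doc>
    <field name="id">doc-1</field>
    <field name="title_s">first version</field>
  </doc>
  <!-- Same id as above: this document replaces the earlier one -->
  <doc>
    <field name="id">doc-1</field>
    <field name="title_s">second version</field>
  </doc>
</add>
```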

    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
    <!-- docValues are enabled by default for long type so we don't need to index the version field  -->
    <field name="_version_" type="plong" indexed="false" stored="false"/>
    <field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
    <field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>

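<field> declares a field of an indexed document. name is the field name (it must be unique within the schema), and type is the name of the <fieldType> that determines how the field's values are analyzed and indexed.

indexed controls whether the field is indexed (and therefore searchable), stored controls whether the original value is kept so that it can be returned in query results, and multiValued controls whether the field may hold more than one value.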
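A minimal sketch of how these attributes combine. The field names title, tags and body are hypothetical and not part of the schema shown here; the types text_general and string are the ones already used above.

```xml
<!-- Hypothetical fields, not part of the default schema -->
<!-- Searchable and returned in results, one value per document -->
<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>
<!-- Several exact-match values per document -->
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Searchable but not returned, useful when the raw text is never displayed -->
<field name="body" type="text_general" indexed="true" stored="false"/>
```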

    <dynamicField name="*_i"  type="pint"    indexed="true"  stored="true"/>
    <dynamicField name="*_is" type="pints"    indexed="true"  stored="true"/>
    <dynamicField name="*_s"  type="string"  indexed="true"  stored="true" />
    <dynamicField name="*_ss" type="strings"  indexed="true"  stored="true"/>
    <dynamicField name="*_l"  type="plong"   indexed="true"  stored="true"/>

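<dynamicField> has the same meaning as <field>, but it declares a dynamic field.
Dynamic fields: when several fields differ only in their name and share the same type, indexed, stored and other attributes,
a single dynamic field can be configured for all of them to reduce configuration. The name of a dynamic field is a pattern with a * wildcard at the start or at the end, for example *_s or s_*.
When

<dynamicField name="*_s"  type="string"  indexed="true"  stored="true" />

is configured, fields named name1_s and name2_s in incoming documents will both match it.

When querying, however, the concrete field name (for example name1_s) must be specified explicitly.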
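To make the matching concrete, here is a hypothetical document in Solr's XML update format. None of the fields color_s, price_i or sizes_is is declared explicitly; under the dynamic field rules listed above they would be handled by *_s, *_i and *_is respectively. The names and values are assumptions chosen for illustration.

```xml
<!-- Hypothetical document; every field except id relies on a dynamic field rule -->
<add>
  <doc>
    <field name="id">item-1</field>
    <!-- matched by *_s (string) -->
    <field name="color_s">red</field>
    <!-- matched by *_i (pint) -->
    <field name="price_i">42</field>
    <!-- matched by *_is (pints), so it may appear more than once -->
    <field name="sizes_is">38</field>
    <field name="sizes_is">40</field>
  </doc>
</add>
```

A query against this document would then name the concrete field, for instance q=color_s:red, rather than the *_s pattern.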

<field name="queryFieldCopy" type="text_en" indexed="true" stored="false" multiValued="true" />
<copyField source="_text_" dest="queryFieldCopy"/>
<copyField source="_root_" dest="queryFieldCopy"/>

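<copyField> declares a copy field. Solr's copy fields allow the values of one or more fields to be copied into another field. Two typical uses:
Populating a single field with the content of several fields
Applying different text analysis to the same content, creating a new searchable field
source is the source field and dest is the destination field.
Here, values supplied for _text_ and _root_ are copied into queryFieldCopy, so a search on queryFieldCopy covers the content of both fields, analyzed as text_en. Note that Solr copies the raw incoming value before analysis and does not chain copies, so a source field that is itself filled only by another copyField contributes nothing.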
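Both uses can be sketched with ordinary source fields. The names title, description, all_text and title_exact are hypothetical; only the types text_general, text_en and string come from the schema discussed here.

```xml
<!-- Hypothetical source fields -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="description" type="text_general" indexed="true" stored="true"/>

<!-- Use 1: catch-all destination; multiValued because it receives several values -->
<field name="all_text" type="text_en" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="all_text"/>
<copyField source="description" dest="all_text"/>

<!-- Use 2: the same title content indexed a second time as an untokenized string -->
<field name="title_exact" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>
```

With this arrangement a single query on all_text searches title and description together, while title_exact supports exact matching on the full title.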

    <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymGraphFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.FlattenGraphFilterFactory"/>
        -->
        <!-- Case insensitive stop word removal.
        -->
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="lang/stopwords_en.txt"
            />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
          -->
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="lang/stopwords_en.txt"
        />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
          -->
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
    </fieldType>

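Solr analyzers are specified as child elements of the <fieldType> element in the schema.xml configuration file (located in the same conf/ directory as solrconfig.xml). In text_en above, the analyzers marked type="index" and type="query" define separate analysis chains for indexing and for querying.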
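For comparison, a field type does not need two analyzers. The sketch below declares a single <analyzer> without a type attribute, in which case the same chain is applied at both index and query time. The type name text_simple is hypothetical; the tokenizer and filter classes are the ones already used in text_en.

```xml
<!-- Hypothetical type name; tokenizer and filter classes are taken from text_en above -->
<fieldType name="text_simple" class="solr.TextField" positionIncrementGap="100">
  <!-- No type attribute: one chain shared by indexing and querying -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```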
