[Solr Series, Part 2] Configuration files: solr.xml, solrConfig.xml, schema.xml. Category: H4_SOLR/LUCENCE. 2014-07-23 21:30
I. About the default search fields
element may be removed.
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
</str>
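Because content is not listed in qf, it carries no weight at all. A document that contains the keyword only in its content field is therefore not returned in the search results. For an index produced by Nutch, add a weight for content, and for url as well if needed: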
<str name="qf">
content^1.0 text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
</str>
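The same boosts can also be supplied per request rather than edited into solrConfig.xml, because request parameters override a handler's defaults. The following is a minimal Python sketch, not part of the original configuration: the helper names build_qf and build_query_string, the keyword, and the url^2.0 boost are illustrative assumptions.

```python
from urllib.parse import urlencode

def build_qf(boosts):
    """Join {field: boost} pairs into the space-separated qf syntax."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())

def build_query_string(keyword, boosts):
    """Encode q, defType and qf as URL parameters for a search request."""
    params = {"q": keyword, "defType": "edismax", "qf": build_qf(boosts)}
    return urlencode(params)

# Hypothetical boosts: content and url are the fields a Nutch index needs added.
boosts = {"content": 1.0, "title": 10.0, "url": 2.0}
query_string = build_query_string("hadoop", boosts)
print(query_string)
```

The resulting string would be appended to the handler URL of your own Solr core. A field left out of qf is simply not searched for unfielded terms, which is the situation described above.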
II. Search Handler
<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!-- VelocityResponseWriter settings -->
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.layout">layout</str>
    <str name="title">Solritas_test</str>
    <!-- Query settings -->
    <str name="defType">edismax</str>
    <str name="qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
    </str>
    <str name="df">content</str>
    <str name="mm">100%</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>
    <!--more like this setting-->
    <str name="mlt.qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
    </str>
    <str name="mlt.fl">text,features,name,sku,id,manu,cat,title,description,keywords,author,resourcename</str>
    <int name="mlt.count">3</int>
    <!-- Faceting defaults -->
    <str name="facet">on</str>
    <str name="facet.field">cat</str>
    <str name="facet.field">manu_exact</str>
    <str name="facet.field">content_type</str>
    <str name="facet.field">author_s</str>
    <str name="facet.query">ipod</str>
    <str name="facet.query">GB</str>
    <str name="facet.mincount">1</str>
    <str name="facet.pivot">cat,inStock</str>
    <str name="facet.range.other">after</str>
    <str name="facet.range">price</str>
    <int name="f.price.facet.range.start">0</int>
    <int name="f.price.facet.range.end">600</int>
    <int name="f.price.facet.range.gap">50</int>
    <str name="facet.range">popularity</str>
    <int name="f.popularity.facet.range.start">0</int>
    <int name="f.popularity.facet.range.end">10</int>
    <int name="f.popularity.facet.range.gap">3</int>
    <str name="facet.range">manufacturedate_dt</str>
    <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
    <str name="f.manufacturedate_dt.facet.range.end">NOW</str>
    <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str>
    <str name="f.manufacturedate_dt.facet.range.other">before</str>
    <str name="f.manufacturedate_dt.facet.range.other">after</str>
    <!-- Highlighting defaults -->
    <str name="hl">on</str>
    <str name="hl.fl">content features title name</str>
    <str name="hl.encoder">html</str>
    <str name="hl.simple.pre"></str>
    <str name="hl.simple.post"></str>
    <str name="f.title.hl.fragsize">0</str>
    <str name="f.title.hl.alternateField">title</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.content.hl.snippets">3</str>
    <str name="f.content.hl.fragsize">200</str>
    <str name="f.content.hl.alternateField">content</str>
    <str name="f.content.hl.maxAlternateFieldLength">750</str>
    <!-- Spell checking defaults -->
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.alternativeTermCount">2</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">5</str>
    <str name="spellcheck.maxCollations">3</str>
  </lst>
  <!-- append spellchecking to our list of components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
1. SearchHandler is one kind of requestHandler; it is declared with requestHandler as its top-level element.
2. The second-level elements include first-components, last-components, defaults, and so on.
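To see this two-level structure concretely, a handler definition can be parsed and its direct children listed. This is only an illustrative Python sketch over a trimmed copy of the handler; the constant HANDLER_XML and the function second_level_sections are names made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the /browse handler, kept short for illustration.
HANDLER_XML = """
<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="df">content</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
"""

def second_level_sections(xml_text):
    """Return (tag, name) for each direct child of the requestHandler element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.get("name")) for child in root]

sections = second_level_sections(HANDLER_XML)
for tag, name in sections:
    print(tag, name)
```

Parameter sections such as defaults are lst elements, while component lists such as last-components are arr elements.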
3. Velocity settings
<!-- VelocityResponseWriter settings -->
<str name="wt">velocity</str>
<str name="v.template">browse</str>
<str name="v.layout">layout</str>
<str name="title">Solritas_test</str>
wt: specifies the format in which the search results are returned.
v.template: template name to use, without the .vm suffix. If not specified, "default"[.vm] will be used.
v.template.<name>: overrides a file system template.
debugQuery: if true, the default view displays explanations for each hit and additional debugging information in the footer.
v.json: escapes the Velocity-generated response and wraps it in the JavaScript function named by the v.json parameter.
v.layout: name of the template that wraps the main template (v.template). The main template renders to a $content variable that can be used in the layout template.
v.base_dir: overrides the default template load path (conf/velocity/).
v.properties: specifies a Velocity properties file to be applied, found using the Solr resource loader mechanism. If not specified, no .properties file is loaded. Example: v.properties=velocity.properties, where velocity.properties can be found using Solr's resource loader mechanism, for example in the conf/ directory (not conf/velocity, which is for templates only). The .properties file could also be located inside a JAR in the lib/ directory, or in other locations.
v.contentType: sets the value of the HTTP response's Content-Type header. If (X)HTML pages should be encoded as UTF-8 instead of ISO-8859-1, set this option to text/xml;charset=UTF-8 for XHTML or text/html;charset=UTF-8 for HTML.
For the remaining Velocity settings, see: http://blog.csdn.net/jediael_lu/article/details/38039267.
4. Search fields: qf
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
</str>
Defines which fields are searched and the relative weight (boost) of each field.
5. Choosing the query parser: defType. Common values are defType=lucene and defType=edismax.
<str name="defType">edismax</str>
6. Default search field: df
If no search field is specified, this field is used as the default search field.
Comparing df, qf and defaultSearchField:
(1) Use the df parameter in solrConfig instead of defaultSearchField in the schema.
(2) df is the default field and only takes effect if qf is not defined.
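The precedence between qf and df stated in (2) can be modelled in a few lines. This is a simplified Python sketch of the rule as described here, not Solr's actual parsing code; searched_fields is a hypothetical helper, and it only covers unfielded query terms.

```python
def searched_fields(params):
    """Return the fields an unfielded term is searched in: qf if set, else df."""
    qf_entries = params.get("qf", "").split()
    if qf_entries:
        # Drop the ^boost suffix and keep only the field name.
        return [entry.split("^")[0] for entry in qf_entries]
    df = params.get("df")
    return [df] if df else []

with_qf = searched_fields({"qf": "title^10.0 text^0.5", "df": "content"})
without_qf = searched_fields({"df": "content"})
print(with_qf, without_qf)
```

In the /browse handler above both qf and df are set, so under this rule df=content has no effect and content is not searched unless it is also added to qf.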
7. Default query
<str name="q.alt">*:*</str>
q.alt: sets the fallback query that is used when the q parameter is empty; it is usually set to *:*.
8. mm: minimum should match. Solr supports three kinds of query clause, "must occur", "must not occur" and "may occur", corresponding roughly to AND (or a "+" prefix), a "-" prefix, and OR.
<str name="mm">100%</str>
When dealing with queries there are three types of "clauses" that Lucene knows about: mandatory, prohibited, and optional (aka "SHOULD"). By default all words or phrases specified in the "q" param are treated as optional clauses unless they are preceded by a "+" or a "-". When dealing with these optional clauses, the "mm" option makes it possible to say that a certain minimum number of those clauses must match. Specifying this minimum number can be done in complex ways, equating to ideas like:
At least 2 of the optional clauses must match, regardless of how many clauses there are: "2"
At least 75% of the optional clauses must match, rounded down: "75%"
If there are fewer than 3 optional clauses, they all must match; if there are 3 or more, then 75% must match, rounded up: "2<-25%"
If there are fewer than 3 optional clauses, they all must match; for 3 to 5 clauses, one less than the number of clauses must match; for 6 or more clauses, 80% must match, rounded down: "2<-1 5<80%"
Full details on the variety of complex expressions supported are given in the Solr documentation of the mm (Minimum Should Match) parameter.
In Solr 1.4 and prior, you should basically set mm=0 if you want the equivalent of q.op=OR, and mm=100% if you want the equivalent of q.op=AND. In 3.x and trunk the default value of mm is dictated by the q.op param (q.op=AND => mm=100%; q.op=OR => mm=0%). Keep in mind that the default operator is affected by your schema.xml <solrQueryParser defaultOperator="xxx"/> entry. In older versions of Solr the default value is 100% (all clauses must match).
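The mm expressions listed above can be worked through with a small calculator. The following Python sketch re-implements the rules as they are described in this section; it is an approximation written for illustration, not Solr's own code, and the function name min_should_match is an assumption.

```python
import re

def min_should_match(clause_count, spec):
    """Return how many of clause_count optional clauses must match under spec."""
    # Normalise "2 < -25%" to "2<-25%" so conditions can be split on whitespace.
    spec = re.sub(r"\s*<\s*", "<", spec.strip())
    if "<" in spec:
        # Conditional form: each "n<rule" applies only when clause_count > n.
        result = clause_count
        for part in spec.split():
            bound, rule = part.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, rule)
        return result
    if spec.endswith("%"):
        # Percentages truncate toward zero; negative ones say how many may be missing.
        calc = clause_count * int(spec[:-1]) / 100
        required = clause_count + int(calc) if calc < 0 else int(calc)
    else:
        value = int(spec)
        required = clause_count + value if value < 0 else value
    # Never require fewer than zero or more than the available clauses.
    return max(0, min(clause_count, required))

for spec in ("2", "75%", "2<-25%", "2<-1 5<80%", "100%"):
    print(spec, [min_should_match(n, spec) for n in range(1, 8)])
```

Under these rules min_should_match(4, "75%") gives 3, and min_should_match(6, "2<-1 5<80%") gives 4, which matches the descriptions above. With mm=100%, as configured in the /browse handler, every optional clause must match.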
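9. Number of rows returned per page
<str name="rows">10</str>
10. The set of fields to return
<str name="fl">*,score</str>
fl: a comma-separated list that specifies which fields are returned for each document in the results. The default is "*", meaning all fields. The setting above returns all fields plus score.
11. Sorting the results
(1) The field used for sorting must be declared with indexed="true".
(2) <str name="sort">tstamp asc</str>
If this element is placed in the defaults list (<lst name="defaults">), it sets a default value that a query can override.
If it is placed in the invariants list (<lst name="invariants">), it cannot be overridden, even by the query.
The same should apply to the other parameters.
Reference: http://stackoverflow.com/questions/24966924/how-to-change-the-default-rank-field-from-score-to-other-filed-in-solr/24971353#24971353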
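The difference between defaults and invariants in item 11 comes down to the order in which the parameter sources are merged. The following is a minimal Python sketch of that order, assuming single-valued parameters only; resolve_params is a hypothetical helper, not a Solr API.

```python
def resolve_params(request, defaults=None, invariants=None):
    """Merge parameters: defaults first, then the request, then invariants on top."""
    resolved = dict(defaults or {})
    resolved.update(request)           # the request overrides defaults
    resolved.update(invariants or {})  # invariants override the request
    return resolved

request = {"q": "solr", "sort": "score desc"}

# sort configured as a default: the request's own sort wins.
as_default = resolve_params(request, defaults={"sort": "tstamp asc", "rows": "10"})

# sort configured as an invariant: the request cannot change it.
as_invariant = resolve_params(request, invariants={"sort": "tstamp asc"})

print(as_default["sort"], as_invariant["sort"])
```

Here the first call keeps the request's sort and picks up rows from the defaults, while the second call forces sort back to tstamp asc.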
Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission.
