顾名思义,most_field就是匹配词干的字段数越多,分数越高,也可设置权重boost。

下面是简易公式(详细评分算法请参考:http://m.blog.csdn.net/article/details?id=50623948):

score=match_field1_score*boost+match_field2_score*boost+...match_fieldN_score*boost。

在很多情况下,这种搜索很有效,但存在一个弱点,就是当文档中的字段冗余信息过多,将会影响那些文档比较精炼,而且意思较为全面的分值,

不能使用operator和minimum_should_match来减少相关性低的doc的长尾问题,简单的来说就是按term匹配的个数取胜

例下:

搜索关键字“北京东路”,先下面的分词结果,我们知道它的词干为“北京”与“东路”:

curl   'localhost:9200/fullbiz_index/_analyze?analyzer=ik_smart&pretty=true' -d '{"text":"北京东路"}'
{
"tokens" : [
{
"token" : "text",
"start_offset" : 2,
"end_offset" : 6,
"type" : "ENGLISH",
"position" : 1
},
{
"token" : "北京",
"start_offset" : 9,
"end_offset" : 11,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "东路",
"start_offset" : 11,
"end_offset" : 13,
"type" : "CN_WORD",
"position" : 3
}
]
}
curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty' -d '
{
"from" : 0,
"size" : 20,
"query" : {
"multi_match" : {
"query" : "北京东路",
"fields" : [ "title", "highlight", "tags", "address", "businessDistrict", "cuisineStyle" ],
"type" : "most_fields",
"minimum_should_match" : "70%",//这是指最少匹配词干占比,例如三个词干,只要配置了二个以上就算match,66.6%会啥入70%。二个词干或以下,只要匹配了一个就行。所以“北京东路”只要匹配了“北京”或“东路”都可得分
"analyzer" : "ik_smart" //ik有二种模式,一种是ik_max_word(最细词干法),ik_smart(最粗词干法),这里我们配置第二种,以更接近于业务结果。
}
},
"post_filter" : {
"bool" : {
"must" : [ {
"term" : {
"status" : 0
}
}, {
"term" : {
"hostDisplay" : 1
}
}, {
"term" : {
"cityId" : 2
}
}, {
"term" : {
"productType" : 3
}
} ]
}
}
}'
 
"hits" : [ {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "324239",
"_score" : 0.33371,
"_source":{"boost":1,"productId":24239,"productType":3,"subType":2,"title":"城市公牛(南京东路店)","viceTitle":"城市公牛(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"meal/2016/08/11/1470892987880.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":null,"status":0,"isFree":-1,"duration":"10:00:00-22:30:00","onlineTime":1470280723,"updateTime":1486951326,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":24239,"contactNumber":"13764741956","hostName":"城市公牛(南京东路店)","address":"南京东路300号L221-222室(河南中路口)","hostDisplay":1,"hostPicUrl":"meal/2016/08/11/1470892987880.jpg","hostSharePicUrl":"meal/2016/08/11/1470892987880.jpg","hostLatitude":"31.243455970586","hostLongitude":"121.49099099941","location":{"lat":"31.243455970586","lon":"121.49099099941"},"hostLatitudeGD":"31.237701","hostLongitudeGD":"121.484409","locationGD":{"lat":"31.237701","lon":"121.484409"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":0,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":1,"orderNums":3,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":16000,"hostProductLabelIds":",1,2,4,5,7,8,9,12,13,14,15,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"2010年世博会加拿大馆特约餐厅\",\"加拿大简约西部乡村风格小酒馆餐厅\",\"家庭式的用餐氛围 80%均是外国食客\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-13T10:02:06.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "392659",
"_score" : 0.31962717,
"_source":{"boost":1,"productId":92659,"productType":3,"subType":4,"title":"THAIBEAUTY美容连锁机构(南京东路店)","viceTitle":"THAIBEAUTY美容连锁机构(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2017/01/11/1484121279773528.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1484121281,"updateTime":1484202471,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":92659,"contactNumber":"021-63511876","hostName":"THAIBEAUTY美容连锁机构(南京东路店)","address":"南京东路580号6楼","hostDisplay":1,"hostPicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostSharePicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostLatitude":"31.241721400027","hostLongitude":"121.48585125776","location":{"lat":"31.241721400027","lon":"121.48585125776"},"hostLatitudeGD":"31.235887","hostLongitudeGD":"121.479289","locationGD":{"lat":"31.235887","lon":"121.479289"},"headPics":"","catalogIds":null,"cuisineStyleId":0,"cuisineStyle":"美容/SPA","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":284500,"hostProductLabelIds":",60,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"高端局部瘦身\",\"环境舒适 按摩师手法专业\",\"使用高品质产品\"]","isSeatBook":1,"lastUTCTimestamp":"2017-01-12T14:27:51.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "364804",
"_score" : 0.31002828,
"_source":{"boost":1,"productId":64804,"productType":3,"subType":2,"title":"斗牛士(南京东路店)","viceTitle":"斗牛士(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482718008927949.png","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482718014,"updateTime":1486569730,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":64804,"contactNumber":"021-33317136","hostName":"斗牛士(南京东路店)","address":"南京东路353号悦荟广场(原353店)7F","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostSharePicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostLatitude":"31.24210523683","hostLongitude":"121.49020262932","location":{"lat":"31.24210523683","lon":"121.49020262932"},"hostLatitudeGD":"31.236339","hostLongitudeGD":"121.483623","locationGD":{"lat":"31.236339","lon":"121.483623"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":12200,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"精选进口澳洲安格斯牛排\",\"严控0度低温 保证牛肉鲜嫩\",\"进口原切牛排保证牛肉口感与外观\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-09T00:02:10.000+08:00"}
.....
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "353771",
"_score" : 0.7784657,
"_source":{"boost":1,"productId":53771,"productType":3,"subType":2,"title":"九储堂创意中国菜(外滩店)","viceTitle":"九储堂创意中国菜(外滩店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482744127546461.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482744132,"updateTime":1486738928,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"外滩","businessDistrictId":71,"hostId":53771,"contactNumber":"021-63308900","hostName":"九储堂创意中国菜(外滩店)","address":"北京东路398号新协通国际大酒店18楼","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostSharePicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostLatitude":"31.246247363994","hostLongitude":"121.48894308136","location":{"lat":"31.246247363994","lon":"121.48894308136"},"hostLatitudeGD":"31.240463","hostLongitudeGD":"121.48237","locationGD":{"lat":"31.240463","lon":"121.48237"},"headPics":"","catalogIds":null,"cuisineStyleId":25,"cuisineStyle":"创意菜","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":19100,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"新加坡同乐餐饮总厨胡于保先生主理\",\"大厅可容纳150人的宴会 包房5间\",\"靠窗座位亦可欣赏浦江两岸美景\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-10T23:02:08.000+08:00"}

而结果中有包含“北京东路”完整内容的文档却排在后面,这不科学,为什么会是这个结果,下面我们经过explain来看看评分计算:

curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty&explain'  ....后面内容省略,和上面的请求是一样,只加了一个explain,以及size限制第一条,因为信息太多,只分析具体一个文档,下面我们直接看评分部分:

      "_explanation" : {
"value" : 0.33371,
"description" : "product of:",
"details" : [ {
"value" : 0.66742,
"description" : "sum of:",
"details" : [ {
"value" : 0.28481156,
"description" : "product of:",
"details" : [ {
"value" : 0.5696231,
"description" : "sum of:",
"details" : [ {
"value" : 0.5696231,
"description" : "weight(title:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.5696231,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.25448462,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.23834,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.3125,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.067192085,
"description" : "product of:",
"details" : [ {
"value" : 0.13438417,
"description" : "sum of:",
"details" : [ {
"value" : 0.13438417,
"description" : "weight(address:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.13438417,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.1477382,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 0.90961015,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.21875,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.3154164,
"description" : "product of:",
"details" : [ {
"value" : 0.6308328,
"description" : "sum of:",
"details" : [ {
"value" : 0.6308328,
"description" : "weight(businessDistrict:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.6308328,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.22633977,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.7871053,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.4375,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(3/6)"
} ]
}
} ]
}
}

从上面分析结果来看,排在前面的这些包含“南京东路”的文档,不是因为匹配度高,而是因为匹配的字段多,所以得分大于下面那个只包含一个“北京东路”字段的文档。

总结:most_field适应于那种字段之间信息差异较大的搜索匹配,像上面那种title中有“东路”,商圈、地址中也有“东路“,冗余信息较多。

Elasticsearch搜索之most_fields分析的更多相关文章

  1. Elasticsearch搜索之cross_fields分析

    cross_fields类型采用了一种以词条为中心(Term-centric)的方法,这种方法和best_fields及most_fields采用的以字段为中心(Field-centric)的方法有很 ...

  2. Elasticsearch搜索之best_fields分析

    顾名思义,best_field就是获取最佳匹配的field,另个可以通过tie_breaker来控制其他field的得分,boost可以设置权重(默认都为1). 下面从宏观上来讲的简单公式: scor ...

  3. 一次 ElasticSearch 搜索优化

    一次 ElasticSearch 搜索优化 1. 环境 ES6.3.2,索引名称 user_v1,5个主分片,每个分片一个副本.分片基本都在11GB左右,GET _cat/shards/user 一共 ...

  4. ElasticSearch搜索介绍四

    ElasticSearch搜索 最基础的搜索: curl -XGET http://localhost:9200/_search 返回的结果为: { "took": 2, &quo ...

  5. elasticsearch indices.recovery 流程分析(索引的_open操作也会触发recovery)——主分片recovery主要是从translog里恢复之前未写完的index,副分片recovery主要是从主分片copy segment和translog来进行恢复

    摘自:https://www.easyice.cn/archives/231 elasticsearch indices.recovery 流程分析与速度优化 目录 [隐藏] 主分片恢复流程 副本分片 ...

  6. ElasticSearch 线程池类型分析之 ExecutorScalingQueue

    ElasticSearch 线程池类型分析之 ExecutorScalingQueue 在ElasticSearch 线程池类型分析之SizeBlockingQueue这篇文章中分析了ES的fixed ...

  7. ElasticSearch 线程池类型分析之 ResizableBlockingQueue

    ElasticSearch 线程池类型分析之 ResizableBlockingQueue 在上一篇文章 ElasticSearch 线程池类型分析之 ExecutorScalingQueue的末尾, ...

  8. Elasticsearch搜索资料汇总

    Elasticsearch 简介 Elasticsearch(ES)是一个基于Lucene 构建的开源分布式搜索分析引擎,可以近实时的索引.检索数据.具备高可靠.易使用.社区活跃等特点,在全文检索.日 ...

  9. 看完这篇还不会 Elasticsearch 搜索,那我就哭了!

    本文主要介绍 ElasticSearch 搜索相关的知识,首先会介绍下 URI Search 和 Request Body Search,同时也会学习什么是搜索的相关性,如何衡量相关性. Search ...

随机推荐

  1. Spring 容器可以在自动装配相互协作的 bean 之间的关系,使用autowire属性定义指定自动装配模式。

    使用setter方法java public class TextEditor { private SpellChecker spellChecker; public void setSpellChec ...

  2. 基于AGS JS开发自定义贴图图层

    文章版权由作者李晓晖和博客园共有,若转载请于明显处标明出处:http://www.cnblogs.com/naaoveGIS/ 1.前言 假设一个景区有多张图片需要在地图上展示,并且随着地图的缩放而缩 ...

  3. 构建微服务-使用OAuth 2.0保护API接口

    微服务操作模型 基于Spring Cloud和Netflix OSS 构建微服务-Part 1 基于Spring Cloud和Netflix OSS构建微服务,Part 2 在本文中,我们将使用OAu ...

  4. Spring Data JPA 实例查询

    一.相关接口方法     在继承JpaRepository接口后,自动拥有了按"实例"进行查询的诸多方法.这些方法主要在两个接口中定义,一是QueryByExampleExecut ...

  5. 百度推出 MIP Shell 链接

    在站长将站点 MIP 化时,需要关注 URL 的一共有三个:MIP URL, MIP-Cache URL 以及 MIP-Shell URL. 从 URL 说起 在互联网中,URL 定义页面的地址,每个 ...

  6. 查看某个ip地址接在交换机的哪个接口

    show ip interface brief 1.如果交换机上没有做VLAN 可以直接使用:show arp MPG3560#sh arp Protocol Address Age (min) Ha ...

  7. SpringMVC参数校验(针对`@RequestBody`返回`400`)

    SpringMVC参数校验(针对@RequestBody返回400) 前言 习惯别人帮忙做事的结果是自己不会做事了.一直以来,spring帮我解决了程序运行中的各种问题,我只要关心我的业务逻辑,设计好 ...

  8. Jquery EasyUI远程校验,Jquery EasyUI多个自定义校验,EasyUI自定义校验

    >>>>>>>>>>>>>>>>>>>>>>>>> ...

  9. 【C++】浅谈三大特性之一继承(一)

    一,为什么要引入继承? 继承是一个非常自然的概念,现实世界中的许多事物也都是具有继承性的. 例如,爸爸继承爷爷的特性,儿子又继承爸爸的特性等都属于继承的范畴.下面是一个简单的汽车分类图: 在这个分类图 ...

  10. XML和JSON两种数据交换格式的比较

    在web开发领域,主要的数据交换格式有XML和JSON,对于在 Ajax开发中,是选择XML还是JSON,一直存在着争议,个人还是比较倾向于JSON的.一般都输出Json不输出xml,原因就是因为 x ...