顾名思义,most_field就是匹配词干的字段数越多,分数越高,也可设置权重boost。

下面是简易公式(详细评分算法请参考:http://m.blog.csdn.net/article/details?id=50623948):

score=match_field1_score*boost+match_field2_score*boost+...match_fieldN_score*boost。

在很多情况下,这种搜索很有效,但存在一个弱点,就是当文档中的字段冗余信息过多,将会影响那些文档比较精炼,而且意思较为全面的分值,

不能使用operator和minimum_should_match来减少相关性低的doc的长尾问题,简单的来说就是按term匹配的个数取胜

例下:

搜索关键字“北京东路”,先下面的分词结果,我们知道它的词干为“北京”与“东路”:

curl   'localhost:9200/fullbiz_index/_analyze?analyzer=ik_smart&pretty=true' -d '{"text":"北京东路"}'
{
"tokens" : [
{
"token" : "text",
"start_offset" : 2,
"end_offset" : 6,
"type" : "ENGLISH",
"position" : 1
},
{
"token" : "北京",
"start_offset" : 9,
"end_offset" : 11,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "东路",
"start_offset" : 11,
"end_offset" : 13,
"type" : "CN_WORD",
"position" : 3
}
]
}
curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty' -d '
{
"from" : 0,
"size" : 20,
"query" : {
"multi_match" : {
"query" : "北京东路",
"fields" : [ "title", "highlight", "tags", "address", "businessDistrict", "cuisineStyle" ],
"type" : "most_fields",
"minimum_should_match" : "70%",//这是指最少匹配词干占比,例如三个词干,只要配置了二个以上就算match,66.6%会啥入70%。二个词干或以下,只要匹配了一个就行。所以“北京东路”只要匹配了“北京”或“东路”都可得分
"analyzer" : "ik_smart" //ik有二种模式,一种是ik_max_word(最细词干法),ik_smart(最粗词干法),这里我们配置第二种,以更接近于业务结果。
}
},
"post_filter" : {
"bool" : {
"must" : [ {
"term" : {
"status" : 0
}
}, {
"term" : {
"hostDisplay" : 1
}
}, {
"term" : {
"cityId" : 2
}
}, {
"term" : {
"productType" : 3
}
} ]
}
}
}'
 
"hits" : [ {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "324239",
"_score" : 0.33371,
"_source":{"boost":1,"productId":24239,"productType":3,"subType":2,"title":"城市公牛(南京东路店)","viceTitle":"城市公牛(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"meal/2016/08/11/1470892987880.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":null,"status":0,"isFree":-1,"duration":"10:00:00-22:30:00","onlineTime":1470280723,"updateTime":1486951326,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":24239,"contactNumber":"13764741956","hostName":"城市公牛(南京东路店)","address":"南京东路300号L221-222室(河南中路口)","hostDisplay":1,"hostPicUrl":"meal/2016/08/11/1470892987880.jpg","hostSharePicUrl":"meal/2016/08/11/1470892987880.jpg","hostLatitude":"31.243455970586","hostLongitude":"121.49099099941","location":{"lat":"31.243455970586","lon":"121.49099099941"},"hostLatitudeGD":"31.237701","hostLongitudeGD":"121.484409","locationGD":{"lat":"31.237701","lon":"121.484409"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":0,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":1,"orderNums":3,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":16000,"hostProductLabelIds":",1,2,4,5,7,8,9,12,13,14,15,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"2010年世博会加拿大馆特约餐厅\",\"加拿大简约西部乡村风格小酒馆餐厅\",\"家庭式的用餐氛围 80%均是外国食客\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-13T10:02:06.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "392659",
"_score" : 0.31962717,
"_source":{"boost":1,"productId":92659,"productType":3,"subType":4,"title":"THAIBEAUTY美容连锁机构(南京东路店)","viceTitle":"THAIBEAUTY美容连锁机构(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2017/01/11/1484121279773528.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1484121281,"updateTime":1484202471,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":92659,"contactNumber":"021-63511876","hostName":"THAIBEAUTY美容连锁机构(南京东路店)","address":"南京东路580号6楼","hostDisplay":1,"hostPicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostSharePicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostLatitude":"31.241721400027","hostLongitude":"121.48585125776","location":{"lat":"31.241721400027","lon":"121.48585125776"},"hostLatitudeGD":"31.235887","hostLongitudeGD":"121.479289","locationGD":{"lat":"31.235887","lon":"121.479289"},"headPics":"","catalogIds":null,"cuisineStyleId":0,"cuisineStyle":"美容/SPA","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":284500,"hostProductLabelIds":",60,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"高端局部瘦身\",\"环境舒适 按摩师手法专业\",\"使用高品质产品\"]","isSeatBook":1,"lastUTCTimestamp":"2017-01-12T14:27:51.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "364804",
"_score" : 0.31002828,
"_source":{"boost":1,"productId":64804,"productType":3,"subType":2,"title":"斗牛士(南京东路店)","viceTitle":"斗牛士(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482718008927949.png","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482718014,"updateTime":1486569730,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":64804,"contactNumber":"021-33317136","hostName":"斗牛士(南京东路店)","address":"南京东路353号悦荟广场(原353店)7F","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostSharePicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostLatitude":"31.24210523683","hostLongitude":"121.49020262932","location":{"lat":"31.24210523683","lon":"121.49020262932"},"hostLatitudeGD":"31.236339","hostLongitudeGD":"121.483623","locationGD":{"lat":"31.236339","lon":"121.483623"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":12200,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"精选进口澳洲安格斯牛排\",\"严控0度低温 保证牛肉鲜嫩\",\"进口原切牛排保证牛肉口感与外观\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-09T00:02:10.000+08:00"}
.....
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "353771",
"_score" : 0.7784657,
"_source":{"boost":1,"productId":53771,"productType":3,"subType":2,"title":"九储堂创意中国菜(外滩店)","viceTitle":"九储堂创意中国菜(外滩店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482744127546461.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482744132,"updateTime":1486738928,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"外滩","businessDistrictId":71,"hostId":53771,"contactNumber":"021-63308900","hostName":"九储堂创意中国菜(外滩店)","address":"北京东路398号新协通国际大酒店18楼","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostSharePicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostLatitude":"31.246247363994","hostLongitude":"121.48894308136","location":{"lat":"31.246247363994","lon":"121.48894308136"},"hostLatitudeGD":"31.240463","hostLongitudeGD":"121.48237","locationGD":{"lat":"31.240463","lon":"121.48237"},"headPics":"","catalogIds":null,"cuisineStyleId":25,"cuisineStyle":"创意菜","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":19100,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"新加坡同乐餐饮总厨胡于保先生主理\",\"大厅可容纳150人的宴会 包房5间\",\"靠窗座位亦可欣赏浦江两岸美景\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-10T23:02:08.000+08:00"}

而结果中有包含“北京东路”完整内容的文档却排在后面,这不科学,为什么会是这个结果,下面我们经过explain来看看评分计算:

curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty&explain'  ....后面内容省略,和上面的请求是一样,只加了一个explain,以及size限制第一条,因为信息太多,只分析具体一个文档,下面我们直接看评分部分:

      "_explanation" : {
"value" : 0.33371,
"description" : "product of:",
"details" : [ {
"value" : 0.66742,
"description" : "sum of:",
"details" : [ {
"value" : 0.28481156,
"description" : "product of:",
"details" : [ {
"value" : 0.5696231,
"description" : "sum of:",
"details" : [ {
"value" : 0.5696231,
"description" : "weight(title:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.5696231,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.25448462,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.23834,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.3125,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.067192085,
"description" : "product of:",
"details" : [ {
"value" : 0.13438417,
"description" : "sum of:",
"details" : [ {
"value" : 0.13438417,
"description" : "weight(address:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.13438417,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.1477382,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 0.90961015,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.21875,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.3154164,
"description" : "product of:",
"details" : [ {
"value" : 0.6308328,
"description" : "sum of:",
"details" : [ {
"value" : 0.6308328,
"description" : "weight(businessDistrict:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.6308328,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.22633977,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.7871053,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.4375,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(3/6)"
} ]
}
} ]
}
}

从上面分析结果来看,排在前面的这些包含“南京东路”的文档,不是因为匹配度高,而是因为匹配的字段多,所以得分大于下面那个只包含一个“北京东路”字段的文档。

总结:most_field适应于那种字段之间信息差异较大的搜索匹配,像上面那种title中有“东路”,商圈、地址中也有“东路“,冗余信息较多。

Elasticsearch搜索之most_fields分析的更多相关文章

  1. Elasticsearch搜索之cross_fields分析

    cross_fields类型采用了一种以词条为中心(Term-centric)的方法,这种方法和best_fields及most_fields采用的以字段为中心(Field-centric)的方法有很 ...

  2. Elasticsearch搜索之best_fields分析

    顾名思义,best_field就是获取最佳匹配的field,另个可以通过tie_breaker来控制其他field的得分,boost可以设置权重(默认都为1). 下面从宏观上来讲的简单公式: scor ...

  3. 一次 ElasticSearch 搜索优化

    一次 ElasticSearch 搜索优化 1. 环境 ES6.3.2,索引名称 user_v1,5个主分片,每个分片一个副本.分片基本都在11GB左右,GET _cat/shards/user 一共 ...

  4. ElasticSearch搜索介绍四

    ElasticSearch搜索 最基础的搜索: curl -XGET http://localhost:9200/_search 返回的结果为: { "took": 2, &quo ...

  5. elasticsearch indices.recovery 流程分析(索引的_open操作也会触发recovery)——主分片recovery主要是从translog里恢复之前未写完的index,副分片recovery主要是从主分片copy segment和translog来进行恢复

    摘自:https://www.easyice.cn/archives/231 elasticsearch indices.recovery 流程分析与速度优化 目录 [隐藏] 主分片恢复流程 副本分片 ...

  6. ElasticSearch 线程池类型分析之 ExecutorScalingQueue

    ElasticSearch 线程池类型分析之 ExecutorScalingQueue 在ElasticSearch 线程池类型分析之SizeBlockingQueue这篇文章中分析了ES的fixed ...

  7. ElasticSearch 线程池类型分析之 ResizableBlockingQueue

    ElasticSearch 线程池类型分析之 ResizableBlockingQueue 在上一篇文章 ElasticSearch 线程池类型分析之 ExecutorScalingQueue的末尾, ...

  8. Elasticsearch搜索资料汇总

    Elasticsearch 简介 Elasticsearch(ES)是一个基于Lucene 构建的开源分布式搜索分析引擎,可以近实时的索引.检索数据.具备高可靠.易使用.社区活跃等特点,在全文检索.日 ...

  9. 看完这篇还不会 Elasticsearch 搜索,那我就哭了!

    本文主要介绍 ElasticSearch 搜索相关的知识,首先会介绍下 URI Search 和 Request Body Search,同时也会学习什么是搜索的相关性,如何衡量相关性. Search ...

随机推荐

  1. centOS7 mini配置linux服务器(三) 配置防火墙以及IPtables切换

    一.firewall介绍 CentOS 7中防火墙是一个非常的强大的功能,在CentOS 6.5中在iptables防火墙中进行了升级了. 1.官方介绍 The dynamic firewall da ...

  2. Android学习总结(十二)———— BaseAdapter优化

    一.BaseAdapter的基本概念 对于Android程序员来说,BaseAdapter肯定不会陌生,灵活而优雅是BaseAdapter最大的特点.开发者可以通过构造BaseAdapter并搭载到L ...

  3. 2015年ACM-ICPC亚洲区域赛合肥站网络预选赛H题——The Next (位运算)

    Let L denote the number of 1s in integer D's binary representation. Given two integers S1 and S2, we ...

  4. 长连接 Socket.IO

    概念 说到长连接,对应的就是短连接了.下面先说明一下长连接和短连接的区别: 短连接与长连接 通俗来讲,浏览器和服务器每进行一次通信,就建立一次连接,任务结束就中断连接,即短连接.相反地,假如通信结束( ...

  5. 使用IDEA的gradle整合spring+ mybatis 采用javaconfig配置

    1.新建一个工程 2.工程目录 3.添加gradle.propertes文件 activeMQVersion=5.7.0 aspectJVersion=1.7.2 commonsLangVersion ...

  6. [lua] mac上如何编译snapshot(检测Lua中的内存泄露)

    最近我们的unity手游频繁闪退,只要进入战斗场景,之后一段时间就会闪退,如果是在unity编辑器中则会报出not enough memory的错误!猜测应该是有内存泄漏: 由于我们使用了tolua, ...

  7. iOS UI控件总结(全)

    1.UIButton UIButton *btn = [UIButton buttonWithType:UIButtonTypeRoundedRect]; btn.frame = CGRectMake ...

  8. python 接口自动化测试(一)

    一.测试需求描述 对服务后台一系列SOAP接口功能测试 参数传入:根据接口描述构造不同的参数输入值(Json格式) 二.程序设计 通过Excel配置具体的测试用例数据 保存参数为Json格式,预写入预 ...

  9. OpenCV使用FindContours进行二维码定位

    我使用过FindContours,而且知道有能够直接寻找联通区域的函数.但是我使用的大多只是"最大轮廓"或者"轮廓数目"这些数据.其实轮廓还有另一个很重要的性质 ...

  10. fastjson将json格式null转化空串

    生成JSON代码片段 Map < String , Object > jsonMap = new HashMap< String , Object>(); jsonMap.pu ...