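As the name suggests, most_fields gives a document a higher score the more fields match the query terms. A per-field weight (boost) can also be set.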

Here is a simplified formula (for the detailed scoring algorithm, see http://m.blog.csdn.net/article/details?id=50623948):

score = match_field1_score*boost + match_field2_score*boost + ... + match_fieldN_score*boost
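The simplified formula can be sketched in a few lines of Python. This is only an illustration of the sum above, not Elasticsearch's actual implementation; the function name `most_fields_score` and all per-field scores are hypothetical values chosen for the example.

```python
def most_fields_score(field_scores, boosts=None):
    """Sum the per-field match scores, each multiplied by its boost (default 1.0)."""
    boosts = boosts or {}
    return sum(score * boosts.get(field, 1.0) for field, score in field_scores.items())


# Hypothetical per-field scores, for illustration only.
# One query term repeated across three fields:
redundant_doc = {"title": 0.3, "address": 0.1, "businessDistrict": 0.3}
# The full query matched in a single field:
precise_doc = {"address": 0.5}

print(most_fields_score(redundant_doc))
print(most_fields_score(precise_doc))
# Boosting a field scales its contribution to the sum.
print(most_fields_score(precise_doc, boosts={"address": 2.0}))
```

With these made-up numbers, the document that repeats one term in three fields outscores the one that matches the whole query in a single field, because every matching field adds to the sum.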

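In many cases this kind of search works well, but it has a weakness: when a document repeats the same information across many fields, it pushes down the relative score of documents that are more concise yet more complete in meaning.

In addition, operator and minimum_should_match are applied per field rather than per term, so they cannot be used to trim the long tail of low-relevance docs. Put simply, the document with the most term matches wins.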

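For example:

Search for the keyword "北京东路". First look at the analysis output below, which shows that its terms are "北京" and "东路" (the extra `text` token apparently comes from the request body itself being analyzed as raw text):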

curl   'localhost:9200/fullbiz_index/_analyze?analyzer=ik_smart&pretty=true' -d '{"text":"北京东路"}'
{
"tokens" : [
{
"token" : "text",
"start_offset" : 2,
"end_offset" : 6,
"type" : "ENGLISH",
"position" : 1
},
{
"token" : "北京",
"start_offset" : 9,
"end_offset" : 11,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "东路",
"start_offset" : 11,
"end_offset" : 13,
"type" : "CN_WORD",
"position" : 3
}
]
}
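curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty' -d '
{
"from" : 0,
"size" : 20,
"query" : {
"multi_match" : {
"query" : "北京东路",
"fields" : [ "title", "highlight", "tags", "address", "businessDistrict", "cuisineStyle" ],
"type" : "most_fields",
"minimum_should_match" : "70%",// minimum share of terms that must match; the computed count is rounded down. With three terms, 70% gives 2.1, so two matching terms are enough; with two terms it gives 1.4, so one is enough. Hence "北京东路" scores as soon as either "北京" or "东路" matches.
"analyzer" : "ik_smart" // ik has two modes: ik_max_word (finest-grained segmentation) and ik_smart (coarsest-grained segmentation). We use the second here because it is closer to the business results.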
}
},
"post_filter" : {
"bool" : {
"must" : [ {
"term" : {
"status" : 0
}
}, {
"term" : {
"hostDisplay" : 1
}
}, {
"term" : {
"cityId" : 2
}
}, {
"term" : {
"productType" : 3
}
} ]
}
}
}'
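How the `minimum_should_match` percentage in the request above turns into a term count can be sketched as follows. This is a minimal illustration for positive percentages only, assuming the computed count is rounded down as described in the Elasticsearch documentation; `required_matches` is a hypothetical helper name, not an Elasticsearch API.

```python
def required_matches(num_terms, percent):
    """Number of terms that must match for a positive percentage, rounded down."""
    # Integer arithmetic avoids floating-point surprises: 3 * 70 // 100 == 2.
    return (num_terms * percent) // 100


# "北京东路" is analyzed into two terms by ik_smart, so one matching term is enough.
print(required_matches(2, 70))
# With three terms, two must match.
print(required_matches(3, 70))
```

Because most_fields applies this threshold inside each field separately, a document that contains only "东路" in a field still satisfies it.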
 
"hits" : [ {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "324239",
"_score" : 0.33371,
"_source":{"boost":1,"productId":24239,"productType":3,"subType":2,"title":"城市公牛(南京东路店)","viceTitle":"城市公牛(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"meal/2016/08/11/1470892987880.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":null,"status":0,"isFree":-1,"duration":"10:00:00-22:30:00","onlineTime":1470280723,"updateTime":1486951326,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":24239,"contactNumber":"13764741956","hostName":"城市公牛(南京东路店)","address":"南京东路300号L221-222室(河南中路口)","hostDisplay":1,"hostPicUrl":"meal/2016/08/11/1470892987880.jpg","hostSharePicUrl":"meal/2016/08/11/1470892987880.jpg","hostLatitude":"31.243455970586","hostLongitude":"121.49099099941","location":{"lat":"31.243455970586","lon":"121.49099099941"},"hostLatitudeGD":"31.237701","hostLongitudeGD":"121.484409","locationGD":{"lat":"31.237701","lon":"121.484409"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":0,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":1,"orderNums":3,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":16000,"hostProductLabelIds":",1,2,4,5,7,8,9,12,13,14,15,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"2010年世博会加拿大馆特约餐厅\",\"加拿大简约西部乡村风格小酒馆餐厅\",\"家庭式的用餐氛围 80%均是外国食客\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-13T10:02:06.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "392659",
"_score" : 0.31962717,
"_source":{"boost":1,"productId":92659,"productType":3,"subType":4,"title":"THAIBEAUTY美容连锁机构(南京东路店)","viceTitle":"THAIBEAUTY美容连锁机构(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2017/01/11/1484121279773528.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1484121281,"updateTime":1484202471,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":92659,"contactNumber":"021-63511876","hostName":"THAIBEAUTY美容连锁机构(南京东路店)","address":"南京东路580号6楼","hostDisplay":1,"hostPicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostSharePicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostLatitude":"31.241721400027","hostLongitude":"121.48585125776","location":{"lat":"31.241721400027","lon":"121.48585125776"},"hostLatitudeGD":"31.235887","hostLongitudeGD":"121.479289","locationGD":{"lat":"31.235887","lon":"121.479289"},"headPics":"","catalogIds":null,"cuisineStyleId":0,"cuisineStyle":"美容/SPA","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":284500,"hostProductLabelIds":",60,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"高端局部瘦身\",\"环境舒适 按摩师手法专业\",\"使用高品质产品\"]","isSeatBook":1,"lastUTCTimestamp":"2017-01-12T14:27:51.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "364804",
"_score" : 0.31002828,
"_source":{"boost":1,"productId":64804,"productType":3,"subType":2,"title":"斗牛士(南京东路店)","viceTitle":"斗牛士(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482718008927949.png","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482718014,"updateTime":1486569730,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":64804,"contactNumber":"021-33317136","hostName":"斗牛士(南京东路店)","address":"南京东路353号悦荟广场(原353店)7F","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostSharePicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostLatitude":"31.24210523683","hostLongitude":"121.49020262932","location":{"lat":"31.24210523683","lon":"121.49020262932"},"hostLatitudeGD":"31.236339","hostLongitudeGD":"121.483623","locationGD":{"lat":"31.236339","lon":"121.483623"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":12200,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"精选进口澳洲安格斯牛排\",\"严控0度低温 保证牛肉鲜嫩\",\"进口原切牛排保证牛肉口感与外观\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-09T00:02:10.000+08:00"}
.....
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "353771",
"_score" : 0.7784657,
"_source":{"boost":1,"productId":53771,"productType":3,"subType":2,"title":"九储堂创意中国菜(外滩店)","viceTitle":"九储堂创意中国菜(外滩店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482744127546461.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482744132,"updateTime":1486738928,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"外滩","businessDistrictId":71,"hostId":53771,"contactNumber":"021-63308900","hostName":"九储堂创意中国菜(外滩店)","address":"北京东路398号新协通国际大酒店18楼","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostSharePicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostLatitude":"31.246247363994","hostLongitude":"121.48894308136","location":{"lat":"31.246247363994","lon":"121.48894308136"},"hostLatitudeGD":"31.240463","hostLongitudeGD":"121.48237","locationGD":{"lat":"31.240463","lon":"121.48237"},"headPics":"","catalogIds":null,"cuisineStyleId":25,"cuisineStyle":"创意菜","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":19100,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"新加坡同乐餐饮总厨胡于保先生主理\",\"大厅可容纳150人的宴会 包房5间\",\"靠窗座位亦可欣赏浦江两岸美景\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-10T23:02:08.000+08:00"}

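Yet the document that actually contains the full "北京东路" is ranked further down, which does not make sense. Why does this happen? Let's use explain to see how the score is computed:

curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty&explain'  ....
(The rest of the request is omitted. It is the same as the one above, with explain added and size limited to the first hit, because the output is long and we only analyze one document. Below we go straight to the scoring part:)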

      "_explanation" : {
"value" : 0.33371,
"description" : "product of:",
"details" : [ {
"value" : 0.66742,
"description" : "sum of:",
"details" : [ {
"value" : 0.28481156,
"description" : "product of:",
"details" : [ {
"value" : 0.5696231,
"description" : "sum of:",
"details" : [ {
"value" : 0.5696231,
"description" : "weight(title:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.5696231,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.25448462,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.23834,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.3125,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.067192085,
"description" : "product of:",
"details" : [ {
"value" : 0.13438417,
"description" : "sum of:",
"details" : [ {
"value" : 0.13438417,
"description" : "weight(address:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.13438417,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.1477382,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 0.90961015,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.21875,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.3154164,
"description" : "product of:",
"details" : [ {
"value" : 0.6308328,
"description" : "sum of:",
"details" : [ {
"value" : 0.6308328,
"description" : "weight(businessDistrict:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.6308328,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.22633977,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.7871053,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.4375,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(3/6)"
} ]
}
} ]
}
}

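The analysis above shows that the documents containing "南京东路" rank first not because they match better, but because more of their fields match. That is why they outscore the document below, in which only a single field contains "北京东路".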
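The arithmetic in the explain output can be reproduced by hand. The sketch below plugs in the idf, queryNorm, and fieldNorm values printed above and combines them the way the explain tree does (queryWeight times fieldWeight per clause, then the coord factors). The helper `clause_score` is a hypothetical name used only for this illustration.

```python
def clause_score(idf, query_norm, field_norm, tf=1.0):
    """Score of one term clause: queryWeight * fieldWeight, as in the explain tree."""
    query_weight = idf * query_norm        # "queryWeight, product of: idf, queryNorm"
    field_weight = tf * idf * field_norm   # "fieldWeight, product of: tf, idf, fieldNorm"
    return query_weight * field_weight


QUERY_NORM = 0.03552921  # queryNorm from the explain output

# idf and fieldNorm for the term "东路" in each matching field, copied from the output.
clauses = {
    "title": clause_score(7.1626873, QUERY_NORM, 0.3125),
    "address": clause_score(4.158218, QUERY_NORM, 0.21875),
    "businessDistrict": clause_score(6.3705263, QUERY_NORM, 0.4375),
}

# Each field matched only one of the two query terms: coord(1/2).
per_field = {field: score * 0.5 for field, score in clauses.items()}

# Three of the six searched fields matched: coord(3/6).
total = sum(per_field.values()) * (3 / 6)

for field, score in per_field.items():
    print(field, round(score, 6))
print("total", round(total, 5))
```

The total comes out close to the reported _score of 0.33371: three fields that each match only "东路" add up, even though none of them contains the full query.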

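Summary: most_fields suits searches where the fields carry fairly different information. In the example above, "东路" appears in the title, the business district, and the address alike, so the fields are highly redundant and most_fields is a poor fit.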
