顾名思义,most_field就是匹配词干的字段数越多,分数越高,也可设置权重boost。

下面是简易公式(详细评分算法请参考:http://m.blog.csdn.net/article/details?id=50623948):

score=match_field1_score*boost+match_field2_score*boost+...match_fieldN_score*boost。

在很多情况下,这种搜索很有效,但存在一个弱点,就是当文档中的字段冗余信息过多,将会影响那些文档比较精炼,而且意思较为全面的分值,

不能使用operator和minimum_should_match来减少相关性低的doc的长尾问题,简单的来说就是按term匹配的个数取胜

例下:

搜索关键字“北京东路”,先下面的分词结果,我们知道它的词干为“北京”与“东路”:

curl   'localhost:9200/fullbiz_index/_analyze?analyzer=ik_smart&pretty=true' -d '{"text":"北京东路"}'
{
"tokens" : [
{
"token" : "text",
"start_offset" : 2,
"end_offset" : 6,
"type" : "ENGLISH",
"position" : 1
},
{
"token" : "北京",
"start_offset" : 9,
"end_offset" : 11,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "东路",
"start_offset" : 11,
"end_offset" : 13,
"type" : "CN_WORD",
"position" : 3
}
]
}
curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty' -d '
{
"from" : 0,
"size" : 20,
"query" : {
"multi_match" : {
"query" : "北京东路",
"fields" : [ "title", "highlight", "tags", "address", "businessDistrict", "cuisineStyle" ],
"type" : "most_fields",
"minimum_should_match" : "70%",//这是指最少匹配词干占比,例如三个词干,只要配置了二个以上就算match,66.6%会啥入70%。二个词干或以下,只要匹配了一个就行。所以“北京东路”只要匹配了“北京”或“东路”都可得分
"analyzer" : "ik_smart" //ik有二种模式,一种是ik_max_word(最细词干法),ik_smart(最粗词干法),这里我们配置第二种,以更接近于业务结果。
}
},
"post_filter" : {
"bool" : {
"must" : [ {
"term" : {
"status" : 0
}
}, {
"term" : {
"hostDisplay" : 1
}
}, {
"term" : {
"cityId" : 2
}
}, {
"term" : {
"productType" : 3
}
} ]
}
}
}'
 
"hits" : [ {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "324239",
"_score" : 0.33371,
"_source":{"boost":1,"productId":24239,"productType":3,"subType":2,"title":"城市公牛(南京东路店)","viceTitle":"城市公牛(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"meal/2016/08/11/1470892987880.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":null,"status":0,"isFree":-1,"duration":"10:00:00-22:30:00","onlineTime":1470280723,"updateTime":1486951326,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":24239,"contactNumber":"13764741956","hostName":"城市公牛(南京东路店)","address":"南京东路300号L221-222室(河南中路口)","hostDisplay":1,"hostPicUrl":"meal/2016/08/11/1470892987880.jpg","hostSharePicUrl":"meal/2016/08/11/1470892987880.jpg","hostLatitude":"31.243455970586","hostLongitude":"121.49099099941","location":{"lat":"31.243455970586","lon":"121.49099099941"},"hostLatitudeGD":"31.237701","hostLongitudeGD":"121.484409","locationGD":{"lat":"31.237701","lon":"121.484409"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":0,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":1,"orderNums":3,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":16000,"hostProductLabelIds":",1,2,4,5,7,8,9,12,13,14,15,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"2010年世博会加拿大馆特约餐厅\",\"加拿大简约西部乡村风格小酒馆餐厅\",\"家庭式的用餐氛围 80%均是外国食客\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-13T10:02:06.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "392659",
"_score" : 0.31962717,
"_source":{"boost":1,"productId":92659,"productType":3,"subType":4,"title":"THAIBEAUTY美容连锁机构(南京东路店)","viceTitle":"THAIBEAUTY美容连锁机构(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2017/01/11/1484121279773528.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1484121281,"updateTime":1484202471,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":92659,"contactNumber":"021-63511876","hostName":"THAIBEAUTY美容连锁机构(南京东路店)","address":"南京东路580号6楼","hostDisplay":1,"hostPicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostSharePicUrl":"hostInfo/2017/01/11/1484121279773528.jpg","hostLatitude":"31.241721400027","hostLongitude":"121.48585125776","location":{"lat":"31.241721400027","lon":"121.48585125776"},"hostLatitudeGD":"31.235887","hostLongitudeGD":"121.479289","locationGD":{"lat":"31.235887","lon":"121.479289"},"headPics":"","catalogIds":null,"cuisineStyleId":0,"cuisineStyle":"美容/SPA","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":284500,"hostProductLabelIds":",60,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"高端局部瘦身\",\"环境舒适 按摩师手法专业\",\"使用高品质产品\"]","isSeatBook":1,"lastUTCTimestamp":"2017-01-12T14:27:51.000+08:00"}
}, {
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "364804",
"_score" : 0.31002828,
"_source":{"boost":1,"productId":64804,"productType":3,"subType":2,"title":"斗牛士(南京东路店)","viceTitle":"斗牛士(南京东路店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482718008927949.png","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482718014,"updateTime":1486569730,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"南京东路","businessDistrictId":73,"hostId":64804,"contactNumber":"021-33317136","hostName":"斗牛士(南京东路店)","address":"南京东路353号悦荟广场(原353店)7F","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostSharePicUrl":"hostInfo/2016/12/26/1482718008927949.png","hostLatitude":"31.24210523683","hostLongitude":"121.49020262932","location":{"lat":"31.24210523683","lon":"121.49020262932"},"hostLatitudeGD":"31.236339","hostLongitudeGD":"121.483623","locationGD":{"lat":"31.236339","lon":"121.483623"},"headPics":"","catalogIds":null,"cuisineStyleId":41,"cuisineStyle":"西餐","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":12200,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"精选进口澳洲安格斯牛排\",\"严控0度低温 保证牛肉鲜嫩\",\"进口原切牛排保证牛肉口感与外观\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-09T00:02:10.000+08:00"}
.....
"_index" : "fullbiz1",
"_type" : "fullbizinfo",
"_id" : "353771",
"_score" : 0.7784657,
"_source":{"boost":1,"productId":53771,"productType":3,"subType":2,"title":"九储堂创意中国菜(外滩店)","viceTitle":"九储堂创意中国菜(外滩店)","personMax":"-1","personMin":"-1","picUrl":"hostInfo/2016/12/26/1482744127546461.jpg","recommand":-1,"needReserveTime":-1,"priceStr":"-1","price":"-1","originalPrice":"-1","leadingMinutes":-1,"tags":"","status":0,"isFree":-1,"duration":null,"onlineTime":1482744132,"updateTime":1486738928,"applyExpiredTime":0,"beginTime":0,"endTime":0,"isCourse":-1,"isTour":-1,"supportParty":0,"interestedNum":0,"cityId":2,"cityName":"上海","categoryId":"0","categoryName":"","categoryIconUrl":"","businessDistrict":"外滩","businessDistrictId":71,"hostId":53771,"contactNumber":"021-63308900","hostName":"九储堂创意中国菜(外滩店)","address":"北京东路398号新协通国际大酒店18楼","hostDisplay":1,"hostPicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostSharePicUrl":"hostInfo/2016/12/26/1482744127546461.jpg","hostLatitude":"31.246247363994","hostLongitude":"121.48894308136","location":{"lat":"31.246247363994","lon":"121.48894308136"},"hostLatitudeGD":"31.240463","hostLongitudeGD":"121.48237","locationGD":{"lat":"31.240463","lon":"121.48237"},"headPics":"","catalogIds":null,"cuisineStyleId":25,"cuisineStyle":"创意菜","hideMask":-1,"referenceAgeMin":0,"referenceAgeMax":0,"userLimit":-1,"todayReservable":0,"orderNums":0,"pvConversionRate":"-1","interestNums":0,"hotPoints":0,"hostAvgPrice":19100,"hostProductLabelIds":",1,","shopPay":0,"hostVipEquities":"0","isHostSale":0,"highlight":"[\"新加坡同乐餐饮总厨胡于保先生主理\",\"大厅可容纳150人的宴会 包房5间\",\"靠窗座位亦可欣赏浦江两岸美景\"]","isSeatBook":1,"lastUTCTimestamp":"2017-02-10T23:02:08.000+08:00"}

而结果中有包含“北京东路”完整内容的文档却排在后面,这不科学,为什么会是这个结果,下面我们经过explain来看看评分计算:

curl  'localhost:9200/fullbiz1/fullbizinfo/_search?pretty&explain'  ....后面内容省略,和上面的请求是一样,只加了一个explain,以及size限制第一条,因为信息太多,只分析具体一个文档,下面我们直接看评分部分:

      "_explanation" : {
"value" : 0.33371,
"description" : "product of:",
"details" : [ {
"value" : 0.66742,
"description" : "sum of:",
"details" : [ {
"value" : 0.28481156,
"description" : "product of:",
"details" : [ {
"value" : 0.5696231,
"description" : "sum of:",
"details" : [ {
"value" : 0.5696231,
"description" : "weight(title:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.5696231,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.25448462,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.23834,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 7.1626873,
"description" : "idf(docFreq=244, maxDocs=116302)"
}, {
"value" : 0.3125,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.067192085,
"description" : "product of:",
"details" : [ {
"value" : 0.13438417,
"description" : "sum of:",
"details" : [ {
"value" : 0.13438417,
"description" : "weight(address:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.13438417,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.1477382,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 0.90961015,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 4.158218,
"description" : "idf(docFreq=4942, maxDocs=116302)"
}, {
"value" : 0.21875,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
}, {
"value" : 0.3154164,
"description" : "product of:",
"details" : [ {
"value" : 0.6308328,
"description" : "sum of:",
"details" : [ {
"value" : 0.6308328,
"description" : "weight(businessDistrict:东路 in 7321) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.6308328,
"description" : "score(doc=7321,freq=1.0), product of:",
"details" : [ {
"value" : 0.22633977,
"description" : "queryWeight, product of:",
"details" : [ {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.03552921,
"description" : "queryNorm"
} ]
}, {
"value" : 2.7871053,
"description" : "fieldWeight in 7321, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 6.3705263,
"description" : "idf(docFreq=540, maxDocs=116302)"
}, {
"value" : 0.4375,
"description" : "fieldNorm(doc=7321)"
} ]
} ]
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(1/2)"
} ]
} ]
}, {
"value" : 0.5,
"description" : "coord(3/6)"
} ]
}
} ]
}
}

从上面分析结果来看,排在前面的这些包含“南京东路”的文档,不是因为匹配度高,而是因为匹配的字段多,所以得分大于下面那个只包含一个“北京东路”字段的文档。

总结:most_field适应于那种字段之间信息差异较大的搜索匹配,像上面那种title中有“东路”,商圈、地址中也有“东路“,冗余信息较多。

Elasticsearch搜索之most_fields分析的更多相关文章

  1. Elasticsearch搜索之cross_fields分析

    cross_fields类型采用了一种以词条为中心(Term-centric)的方法,这种方法和best_fields及most_fields采用的以字段为中心(Field-centric)的方法有很 ...

  2. Elasticsearch搜索之best_fields分析

    顾名思义,best_field就是获取最佳匹配的field,另个可以通过tie_breaker来控制其他field的得分,boost可以设置权重(默认都为1). 下面从宏观上来讲的简单公式: scor ...

  3. 一次 ElasticSearch 搜索优化

    一次 ElasticSearch 搜索优化 1. 环境 ES6.3.2,索引名称 user_v1,5个主分片,每个分片一个副本.分片基本都在11GB左右,GET _cat/shards/user 一共 ...

  4. ElasticSearch搜索介绍四

    ElasticSearch搜索 最基础的搜索: curl -XGET http://localhost:9200/_search 返回的结果为: { "took": 2, &quo ...

  5. elasticsearch indices.recovery 流程分析(索引的_open操作也会触发recovery)——主分片recovery主要是从translog里恢复之前未写完的index,副分片recovery主要是从主分片copy segment和translog来进行恢复

    摘自:https://www.easyice.cn/archives/231 elasticsearch indices.recovery 流程分析与速度优化 目录 [隐藏] 主分片恢复流程 副本分片 ...

  6. ElasticSearch 线程池类型分析之 ExecutorScalingQueue

    ElasticSearch 线程池类型分析之 ExecutorScalingQueue 在ElasticSearch 线程池类型分析之SizeBlockingQueue这篇文章中分析了ES的fixed ...

  7. ElasticSearch 线程池类型分析之 ResizableBlockingQueue

    ElasticSearch 线程池类型分析之 ResizableBlockingQueue 在上一篇文章 ElasticSearch 线程池类型分析之 ExecutorScalingQueue的末尾, ...

  8. Elasticsearch搜索资料汇总

    Elasticsearch 简介 Elasticsearch(ES)是一个基于Lucene 构建的开源分布式搜索分析引擎,可以近实时的索引.检索数据.具备高可靠.易使用.社区活跃等特点,在全文检索.日 ...

  9. 看完这篇还不会 Elasticsearch 搜索,那我就哭了!

    本文主要介绍 ElasticSearch 搜索相关的知识,首先会介绍下 URI Search 和 Request Body Search,同时也会学习什么是搜索的相关性,如何衡量相关性. Search ...

随机推荐

  1. Array对象方法属性总结

    属性主要有三个:constructor;length;prototype; constructor(英文意思:构造器):返回对创建此对象的数组函数的引用.例如:var arr=new Array(); ...

  2. 黑苹果引导工具 Clover 配置详解及Clover Configurator使用

    黑苹果引导工具 Clover 配置详解及Clover Configurator使用  2017-03-11 14:01:40 by SemiconductorKING 转自:@三个表哥   简介: 可 ...

  3. Macaca 自动化框架 [Python 系列]

    介绍 Macaca是一套完整的自动化测试解决方案,基于node.js开发.由阿里巴巴公司开源: 地址:http://macacajs.github.io/macaca/ 特点: 同时支持PC端和移动端 ...

  4. node c++多线程插件 第二天 c++指针

    虽然取名叫node多线程插件,但是目前还是在学习c++的情况. 今天谈一谈c++指针. c++指针就像是c#中的引用变量,例如一个Person类的实例zs{Name="张三",Ag ...

  5. 非服务器的定期校正时间 Anacron

    与服务器不同,编程和办公用计算机不是连续24小时运行的.开关机的时间不固定,类似较时这样的任务无法保证运行. 对于这类机器,可以考虑使用 Anacron 进行设置. 在 Archlinux 中, An ...

  6. 学习ASP.NET MVC(十一)——分页

    在这一篇文章中,我们将学习如何在MVC页面中实现分页的方法.分页功能是一个非常实用,常用的功能,当数据量过多的时候,必然要使用分页.在今天这篇文章中,我们学习如果在MVC页面中使用PagedList. ...

  7. matplotlib根据Y轴数量伸缩画图的py脚本

    #coding:utf-8import numpy as npimport matplotlib.pyplot as plt #X,Y轴数据y = [20,59,11,12,16,20,15,12,1 ...

  8. webpack学习笔记(二)-- 初学者常见问题及解决方法

    这篇文章是webpack学习第二篇,主要罗列了本人在实际操作中遇到的一些问题及其解决方法,仅供参考,欢迎提出不同意见. 注:本文假设读者已有webpack方面相关知识,故文中涉及到的专有名词不做另外解 ...

  9. java学习笔记 --- 集合

    1.定义:集合是一种容器,专门用来存储对象 数组和集合的区别?   A:长度区别  数组的长度固定 集合长度可变 B:内容不同  数组存储的是同一种类型的元素  而集合可以存储不同类型的元素  C:元 ...

  10. [Openfire]使用WebSocket建立Openfire的客户端

    近日工作闲暇之余,对IM系统产生了兴趣,转而研究了IM的内容.找了半天,知道比较流行的是Openfire的系统,Openfire有许多平台实现,由于我是做Web的,所以当然是希望寻找Web的实现.Op ...