ElasticSearch Chinese Word Segmentation (IK)
1. ElasticSearch's built-in analyzer
{
"tokens": [
{
"token": "岁",
"start_offset": 0,
"end_offset": 1,
"type": "<IDEOGRAPHIC>",
"position": 0
},
{
"token": "月",
"start_offset": 1,
"end_offset": 2,
"type": "<IDEOGRAPHIC>",
"position": 1
},
{
"token": "如",
"start_offset": 2,
"end_offset": 3,
"type": "<IDEOGRAPHIC>",
"position": 2
},
{
"token": "梭",
"start_offset": 3,
"end_offset": 4,
"type": "<IDEOGRAPHIC>",
"position": 3
}
]
}
{
"tokens": [
{
"token": "i",
"start_offset": 0,
"end_offset": 1,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "am",
"start_offset": 2,
"end_offset": 4,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "an",
"start_offset": 5,
"end_offset": 7,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "enginner",
"start_offset": 8,
"end_offset": 16,
"type": "<ALPHANUM>",
"position": 3
}
]
}
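The two responses above have the shape returned by the `_analyze` API, with the built-in analyzer splitting Chinese text into single characters and English text into lowercased words. As a minimal sketch of how such a request could be issued, the Python below builds and sends one. The host `http://localhost:9200`, the helper names `build_analyze_request` and `analyze`, and the JSON request body (accepted by Elasticsearch 5 and later) are assumptions, not taken from the original post.

```python
import json
import urllib.request


def build_analyze_request(text, analyzer="standard", host="http://localhost:9200"):
    """Build (but do not send) a POST request for the _analyze API.

    The host and the analyzer default are assumptions for illustration.
    """
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        url=f"{host}/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, analyzer="standard", host="http://localhost:9200"):
    """Send the request to a running node and return the list of token dicts."""
    request = build_analyze_request(text, analyzer, host)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["tokens"]
```

Against a running node, `analyze("岁月如梭")` should return one token per character, as in the first response above. Once the IK plugin is installed, passing `analyzer="ik_max_word"` or `analyzer="ik_smart"` selects the IK analyzers instead.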
export PATH=$PATH:$MAVEN_HOME/bin
{
"tokens": [
{
"token": "岁月如梭",
"start_offset": 0,
"end_offset": 4,
"type": "CN_WORD",
"position": 0
},
{
"token": "岁月",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 1
},
{
"token": "如梭",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 2
},
{
"token": "梭",
"start_offset": 3,
"end_offset": 4,
"type": "CN_WORD",
"position": 3
}
]
}
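The IK response above contains overlapping tokens (岁月如梭, 岁月, 如梭, 梭), which is what a fine-grained mode such as `ik_max_word` would be expected to produce. As a small sanity check, the sketch below confirms that each token equals the slice of the input text given by its offsets. The input string `岁月如梭` is inferred from the offsets, and `check_offsets` is a hypothetical helper. Python indexes strings by code point, which matches Elasticsearch's offsets only for characters inside the Basic Multilingual Plane, as is the case here.

```python
def check_offsets(text, tokens):
    """Return the tokens whose text differs from text[start_offset:end_offset]."""
    return [
        t for t in tokens
        if text[t["start_offset"]:t["end_offset"]] != t["token"]
    ]


# Input text inferred from the offsets in the response above (an assumption).
text = "岁月如梭"

# Tokens copied from the IK response above, reduced to the fields we need.
tokens = [
    {"token": "岁月如梭", "start_offset": 0, "end_offset": 4},
    {"token": "岁月", "start_offset": 0, "end_offset": 2},
    {"token": "如梭", "start_offset": 2, "end_offset": 4},
    {"token": "梭", "start_offset": 3, "end_offset": 4},
]

mismatches = check_offsets(text, tokens)
print(mismatches)  # → []
```

An empty list means every token lines up with its offsets, which is what highlighting relies on when it marks matched spans in the original text.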
{
"tokens": [
{
"token": "elasticsearch",
"start_offset": 0,
"end_offset": 13,
"type": "CN_WORD",
"position": 0
},
{
"token": "elastic",
"start_offset": 0,
"end_offset": 7,
"type": "CN_WORD",
"position": 1
},
{
"token": "很受",
"start_offset": 13,
"end_offset": 15,
"type": "CN_WORD",
"position": 2
},
{
"token": "受欢迎",
"start_offset": 14,
"end_offset": 17,
"type": "CN_WORD",
"position": 3
},
{
"token": "欢迎",
"start_offset": 15,
"end_offset": 17,
"type": "CN_WORD",
"position": 4
},
{
"token": "一款",
"start_offset": 19,
"end_offset": 21,
"type": "CN_WORD",
"position": 5
},
{
"token": "一",
"start_offset": 19,
"end_offset": 20,
"type": "TYPE_CNUM",
"position": 6
},
{
"token": "款",
"start_offset": 20,
"end_offset": 21,
"type": "COUNT",
"position": 7
},
{
"token": "拥有",
"start_offset": 21,
"end_offset": 23,
"type": "CN_WORD",
"position": 8
},
{
"token": "拥",
"start_offset": 21,
"end_offset": 22,
"type": "CN_WORD",
"position": 9
},
{
"token": "有",
"start_offset": 22,
"end_offset": 23,
"type": "CN_CHAR",
"position": 10
},
{
"token": "活跃",
"start_offset": 23,
"end_offset": 25,
"type": "CN_WORD",
"position": 11
},
{
"token": "跃",
"start_offset": 24,
"end_offset": 25,
"type": "CN_WORD",
"position": 12
},
{
"token": "社区",
"start_offset": 25,
"end_offset": 27,
"type": "CN_WORD",
"position": 13
},
{
"token": "开源",
"start_offset": 27,
"end_offset": 29,
"type": "CN_WORD",
"position": 14
},
{
"token": "搜索",
"start_offset": 30,
"end_offset": 32,
"type": "CN_WORD",
"position": 15
},
{
"token": "索解",
"start_offset": 31,
"end_offset": 33,
"type": "CN_WORD",
"position": 16
},
{
"token": "索",
"start_offset": 31,
"end_offset": 32,
"type": "CN_WORD",
"position": 17
},
{
"token": "解决方案",
"start_offset": 32,
"end_offset": 36,
"type": "CN_WORD",
"position": 18
},
{
"token": "解决",
"start_offset": 32,
"end_offset": 34,
"type": "CN_WORD",
"position": 19
},
{
"token": "方案",
"start_offset": 34,
"end_offset": 36,
"type": "CN_WORD",
"position": 20
}
]
}