Performs the analysis process on a text and returns the token breakdown of the text

a-du 2024-09-26 09:12:44 (original post)

Analyze

Performs the analysis process on a text and returns the token breakdown of the text.

It can be used without specifying an index, against one of the many built-in analyzers:

GET _analyze
{
  "analyzer" : "standard",
  "text" : "this is a test"
}

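Outside the console, the same request can be issued from a script. The following is a minimal Python sketch using only the standard library. The helper name build_analyze_request and the base URL http://localhost:9200 are assumptions for illustration, not part of the Elasticsearch API. The sketch only builds the request object; sending it requires a running cluster.

```python
import json
import urllib.request


def build_analyze_request(body, index=None, base_url="http://localhost:9200"):
    """Build (but do not send) a GET request for the _analyze endpoint.

    The helper name and the default base_url are illustrative assumptions.
    """
    # With an index the path is /<index>/_analyze, otherwise just /_analyze.
    path = f"/{index}/_analyze" if index else "/_analyze"
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


request = build_analyze_request({"analyzer": "standard", "text": "this is a test"})
print(request.get_method(), request.full_url)
# To send it against a live cluster (not done here):
#   with urllib.request.urlopen(request) as response:
#       tokens = json.load(response)["tokens"]
```

Passing index="twitter" to the same helper would target twitter/_analyze instead.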

If the text parameter is provided as an array of strings, it is analyzed as a multi-valued field.

GET _analyze
{
  "analyzer" : "standard",
  "text" : ["this is a test", "the second text"]
}


It can also be used by building a custom transient analyzer out of tokenizers, token filters, and char filters. Token filters can use the shorter filter parameter name:

GET _analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "text" : "this is a test"
}


GET _analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["html_strip"],
  "text" : "this is a <b>test</b>"
}

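To make the order of the stages concrete, here is a rough local emulation in Python of the request above: char filters run on the raw text first, then the tokenizer splits it, then token filters transform the tokens. This is only a sketch. The function names are hypothetical, and the regex-based html_strip is a crude stand-in for the real char filter, not Elasticsearch's implementation.

```python
import re


def html_strip(text):
    # Crude stand-in for the html_strip char filter: drop anything tag-like.
    return re.sub(r"<[^>]+>", "", text)


def keyword_tokenizer(text):
    # The keyword tokenizer emits the entire input as a single token.
    return [text]


def lowercase(tokens):
    # Token filter: lowercase every token.
    return [token.lower() for token in tokens]


def analyze(text, char_filters, tokenizer, token_filters):
    """Apply char filters, then the tokenizer, then token filters, in that order."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


tokens = analyze("this is a <b>test</b>", [html_strip], keyword_tokenizer, [lowercase])
print(tokens)  # ['this is a test']
```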

Deprecated in 5.0.0: use filter/char_filter instead of filters/char_filters; token_filters has been removed.

Custom tokenizers, token filters, and character filters can be specified in the request body as follows:

GET _analyze
{
  "tokenizer" : "whitespace",
  "filter" : ["lowercase", {"type": "stop", "stopwords": ["a", "is", "this"]}],
  "text" : "this is a test"
}

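As an illustration of what this request asks for, the Python sketch below emulates a whitespace tokenizer followed by a lowercase filter and an inline stop filter with the same stopwords. It is an approximation under stated assumptions: the function names are hypothetical, only a subset of the token attributes is modeled, and it assumes that removed stopwords leave a positional gap so that surviving tokens keep their original position.

```python
import re


def whitespace_tokenize(text):
    """Split on whitespace, recording offsets and a position for each token."""
    return [
        {
            "token": match.group(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        }
        for position, match in enumerate(re.finditer(r"\S+", text))
    ]


def lowercase(tokens):
    # Token filter: lowercase the token text, keep offsets and positions.
    return [{**token, "token": token["token"].lower()} for token in tokens]


def stop(tokens, stopwords):
    # Assumption: dropped tokens leave a gap, so survivors keep their position.
    return [token for token in tokens if token["token"] not in stopwords]


tokens = stop(lowercase(whitespace_tokenize("this is a test")), {"a", "is", "this"})
print(tokens)
```

Only the token "test" survives, with the offsets and position it had in the original text.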

It can also run against a specific index:

GET twitter/_analyze
{
  "text" : "this is a test"
}


The above runs an analysis on the "this is a test" text, using the default index analyzer associated with the twitter index. An analyzer can also be specified explicitly to use a different one:

GET twitter/_analyze
{
  "analyzer" : "whitespace",
  "text" : "this is a test"
}


Also, the analyzer can be derived based on a field mapping, for example:

GET twitter/_analyze
{
  "field" : "obj1.field1",
  "text" : "this is a test"
}


This causes the analysis to be performed with the analyzer configured in the mapping for obj1.field1 (or, if none is configured, the default index analyzer).
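
The fallback described for index-level requests can be sketched as a small Python function. This is a hypothetical helper, not Elasticsearch code: the names resolve_analyzer and field_analyzers are invented for illustration, the mapping is modeled as a plain dict from field name to analyzer name, and "standard" is assumed as the index default. A request that carries both analyzer and field is deliberately not modeled, since the text above does not say which one wins.

```python
def resolve_analyzer(request, field_analyzers, index_default="standard"):
    """Pick the analyzer name for an index-level _analyze request (sketch).

    request         -- dict form of the JSON body
    field_analyzers -- hypothetical mapping of field name -> analyzer name
    index_default   -- assumed default analyzer of the index
    """
    if "analyzer" in request and "field" in request:
        # Precedence between the two is not described here, so it is not modeled.
        raise ValueError("analyzer and field together are not modeled in this sketch")
    # An explicitly named analyzer is used as given.
    if "analyzer" in request:
        return request["analyzer"]
    # Otherwise use the analyzer mapped for the requested field, if there is one.
    field = request.get("field")
    if field in field_analyzers:
        return field_analyzers[field]
    # Otherwise fall back to the default index analyzer.
    return index_default


mapping = {"obj1.field1": "english"}  # hypothetical mapping for illustration
print(resolve_analyzer({"field": "obj1.field1", "text": "this is a test"}, mapping))
print(resolve_analyzer({"text": "this is a test"}, mapping))
```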

Deprecated in 5.1.0: request parameters are deprecated and will be removed in the next major release. Use JSON parameters in the request body instead of request parameters.

All parameters can also be supplied as request parameters. For example:

GET /_analyze?tokenizer=keyword&filter=lowercase&text=this+is+a+test

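Because request parameters are deprecated, it can help to convert an existing query string into the equivalent JSON body. The Python sketch below does this with the standard library. The helper name query_to_body is hypothetical, and treating filter and char_filter as comma-separated lists is an assumption, not something this page states.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Parameters that take a list in the JSON body (assumed from the examples on this page).
LIST_PARAMS = {"filter", "char_filter"}


def query_to_body(url):
    """Convert deprecated _analyze request parameters into a JSON body dict."""
    params = parse_qs(urlsplit(url).query)  # also decodes '+' as a space
    body = {}
    for name, values in params.items():
        if name in LIST_PARAMS:
            # Assumption: a comma-separated value names several filters.
            body[name] = [item for value in values for item in value.split(",")]
        else:
            body[name] = values[0]
    return body


body = query_to_body("/_analyze?tokenizer=keyword&filter=lowercase&text=this+is+a+test")
print(json.dumps(body))
```

The resulting dict can be serialized with json.dumps and sent as the body of a GET _analyze request.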

For backwards compatibility, the text parameter is also accepted as the body of the request, provided it doesn't start with {:

curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filter=lowercase&char_filter=reverse' -d 'this is a test' -H 'Content-Type: text/plain'

Deprecated in 5.1.0: passing the text parameter as the body of the request is deprecated, and this feature will be removed in the next major release. Use the JSON text parameter instead.
