1.6.9 UIMA Integration
1. UIMA 集成
你可以使用solr集成Apache的非结构化信息管理架构(UIMA).UIMA可以让你定义自己的分析引擎通道,逐步添加元数据到文档的标注.
关于Solr UIMA的更多信息,参考https://wiki.apache.org/solr/SolrUIMA.
1.1 Configuring UIMA
solr UIMA的UpdateRequestProcessor是一个自定义的更新请求处理器.发送它们给UIMA管道,然后返回具有丰富元数据的文档.按照下面步骤配置UIMA:
1. solrconfig.xml,复制/solr-4.x.y/dist/solr-uima-4.x.y.jar包和它的contrib/uima/lib下面的类库到solr的类库目录下.
<lib dir="../../contrib/uima/lib" />
<lib dir="../../dist/" regex="solr-uima-\d.*\.jar" />
2.schema.xml中,添加元数据字段:
<field name="language" type="string" indexed="true" stored="true" required="false" />
<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false" />
<field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />
3.在solrconfig.xml中添加如下片段:
<updateRequestProcessorChain name="uima">
<processor
class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
<lst name="uimaConfig">
<lst name="runtimeParameters">
<str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
</lst>
<str name="analysisEngine">
/org/apache/uima/desc/OverridingParamsExtServicesAE.xml
</st
r>
<!-- Set to true if you want to continue indexing even if text processing
fails. Default is false. That is, Solr throws RuntimeException and never
indexed documents entirely in your session. -->
<bool name="ignoreErrors">true</bool>
<!-- This is optional. It is used for logging when text processing fails.
If logField is not specified, uniqueKey will be used as logField. <str name="logField">id</str> -->
<lst name="analyzeFields">
<bool name="merge">false</bool>
<arr name="fields">
<str>text</str>
</arr>
</lst>
<lst name="fieldMappings">
<lst name="type">
<str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
<lst name="mapping">
<str name="feature">text</str>
<str name="field">concept</str>
</lst>
</lst>
<lst name="type">
<str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
<lst name="mapping">
<str name="feature">language</str>
<str name="field">language</str>
</lst>
</lst>
<lst name="type">
<str name="name">org.apache.uima.SentenceAnnotation</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">sentence</str>
</lst>
</lst>
</lst>
</lst>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
4. 在solrconfig.xml中替换已经存在的UpdateRequestHandler或者创建新的UpdateRequestHandler.
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
<lst name="defaults">
<str name="update.processor">uima</str>
</lst>
</requestHandler>
1.6.9 UIMA Integration的更多相关文章
- 1.6 Indexing and Basic Data Operations--目录
1.6.1 什么是 Indexing 1.6.2 Uploading Data with Index Handlers 1.6.3 Uploading Data with Solr Cell usin ...
- 在 Laravel 中使用图片处理库 Integration/Image
系统需求 PHP >= 5.3 Fileinfo Extension GD Library (>=2.0) … or … Imagick PHP extension (>=6.5.7 ...
- 按照Enterprise Integration Pattern搭建服务系统
在前一篇文章中,我们已经对Enterprise Integration Pattern中所包含的各个组成进行了简单地介绍.限于篇幅(20页Word以内),我并没有深入地讨论各个组成.但是如果要真正地按 ...
- Enterprise Integration Pattern - 组成简介
近些年来,越来越多的Web应用正在逐渐向大型化的方向发展.它们通常都会包含一系列相互协作的子服务.在开发过程中,如何让这些子服务协同工作常常是软件开发人员所最为头疼的问题,如各个子服务之间的数据表示不 ...
- Spring 4 + Quartz 2.2.1 Scheduler Integration Example
In this post we will see how to schedule Jobs using Quartz Scheduler with Spring. Spring provides co ...
- OpenCASCADE Gauss Integration
OpenCASCADE Gauss Integration eryar@163.com Abstract. Numerical integration is the approximate compu ...
- MAGENTO - APACHE SOLR INTEGRATION - PART II (SETUP)
MAGENTO - APACHE SOLR INTEGRATION - PART II (SETUP) Tue, 03/01/2011 - 18:30 Tweet Development E-Comm ...
- POSTMAN as debugger for integration APPs
Chrome Menu: Window > Extensions > Postman - REST Client 0.8.4.10 起个标题,有空总结一下一个经验,关于Netsuite i ...
- [转](六)unity4.6Ugui中文教程文档-------概要-UGUI Animation Integration
5.Animation Integration(动画集成) 动画允许控件的所有状态之间相互转换,充分使用unity的动画系统.这是最强大的的转换模式的在处理很多属性的同时可以进行动画. 要使用动画转换 ...
随机推荐
- HDU 5858 Hard problem (数学推导)
Hard problem 题目链接: http://acm.split.hdu.edu.cn/showproblem.php?pid=5858 Description cjj is fun with ...
- mysql创建用户两次授权
mysql> GRANT ALL PRIVILEGES ON *.* TO 'monty'@'localhost' -> IDENTIFIED BY 'some_pass' ...
- 安装glibc2.7
参考链接http://fhqdddddd.blog.163.com/blog/static/18699154201401722649441/ /lib/libc.so.6: version `glib ...
- 使用paramiko来实现sftp
sftp是一个基于ssh的文件传输协议,是在Windows上往linux传送文件最常用的方式(例如SecureFX,Xftp). 在python下,paramiko实现了sftp,可以让大家方便地在代 ...
- 巧解Tomcat中JVM内存溢出问题
你对Tomcat 的JVM内存溢出问题的解决方法是否了解,这里和大家分享一下,相信本文介绍一定会让你有所收获. tomcat 的JVM内存溢出问题的解决 最近在熟悉一个开发了有几年的项目,需要把数据库 ...
- 安装Sass
最近要开始用 Sass 做一些东西.先来记录一下安装过程. 1.确认本机的 Ruby 版本 2.访问网址下载 Sass 最新版本 https://rubygems.org/gems/sass 3.下载 ...
- Client Dependency学习
Client Dependency Framework ---CDF CDF is a framework for managing CSS & JavaScript dependencies ...
- js 中使用工厂方法和构造器方法
1 直接创建对象 <!DOCTYPE html> <html> <head lang="en"> <meta charset=" ...
- Adding Multithreading Capability to Your Java Applications
src: http://www.informit.com/articles/article.aspx?p=26326&seqNum=3 Interrupting Threads A threa ...
- IDA Script: Remove empty auto labels
http://simeonpilgrim.com/blog/2010/03/25/ida-script-remove-empty-auto-labels/ #include <idc.idc&g ...