当Erlang遇到Solr
当Erlang遇到Solr
Solr
Solr (pronounced "solar") is an open source enterprise search platform from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is highly scalable. Solr is the most popular enterprise search engine. Solr 4 adds NoSQL features. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Apache Tomcat or Jetty. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages. Solr's powerful external configuration allows it to be tailored to many types of application without Java coding, and it has a plugin architecture to support more advanced customization.
esolr
|> Delete documents esolr:delete/2
|> Search esolr:search/3

%% 测试代码 -module(t). -compile(export_all). start()->
SearchUrl="http://192.168.0.160:8080/solr/hear_me/select",
UpdateUrl="http://192.168.0.160:8080/solr/hear_me/update",
MltUrl="http://192.168.0.160:8080/solr/hear_me/mlt",
{ok,Pid}=esolr:start([{select_url, SearchUrl}, {update_url, UpdateUrl}, {morelikethis_url, MltUrl}]),
Pid. search(SolrPid)->
esolr:search("10",[{fields,"*,*"}],SolrPid). add(SolrPid) ->
esolr:add([{doc,[{id,"ai234"}, {text,<<"Look me mom!, I'm searching now">>}]}],SolrPid),
esolr:add([{doc,[{id,"a3456"}, {text,<<"Look me mom!, I'm searching now">>}]}],SolrPid),
esolr:commit(SolrPid).

测试结果如下:

Eshell V5.9 (abort with ^G)
1> P=t:start().
<0.34.0>
2> t:add(P).
ok
3> esolr:search("searching",[{fields,"*,*"}],P).
{ok,[{"numFound",2},{"start",0}],
[{doc,[{"id",<<"ai234">>},
{"_version_",1440978100186775552}]},
{doc,[{"id",<<"a3456">>},
{"_version_",1440978100212989952}]}],
[]}
4> t:search(P).
{ok,[{"numFound",9},{"start",0}],
[{doc,[{"c_type",1},
{"c_tags",
[<<"女人">>,
<<230,148,190,229,188,131>>,
<<"å®¶åº">>,
<<229,165,179,229,143,139>>,
<<229,165,179,229,173,169,229,173,144>>,
<<229,176,143,229,173,169,229,173,144>>,
<<231,166,187,229,169,154>>,
<<229,135,186,230,137,139>>,
<<229,133,132,229,188,159>>]},
{"c_pub_date",<<"2013-07-12T16:29:11.593Z">>},
{"id",<<"97">>},
{"_version_",1440342611812417536}]},
{doc,[{"c_type",1},
{"c_tags",
[<<231,189,145,231,187,156>>,
<<229,165,179,229,143,139>>,
<<228,187,139,231,187,141>>,
<<233,171,152,228,184,173>>,
<<229,144,140,229,173,166>>,
<<230,156,139,229,143,139>>,
<<229,140,151,228,186,172>>,
..... ...

代码实现

make_post_request(Request,PendingInfo,
State=#esolr{update_url=URL,pending=P,auto_commit=AC,dirty=Dirty},
Timeout) ->
{ok,RequestId} = httpc:request(post,{URL,[{"connection", "close"}],"text/xml",Request},[{timeout,Timeout}],[{sync,false}]),
Pendings = gb_trees:insert(RequestId,PendingInfo,P),
if
(AC == always) and Dirty ->
CommitRequest = encode_commit(),
{ok,C_RequestId} = httpc:request(post,{URL,[{"connection", "close"}],"text/xml",CommitRequest},
[{timeout,State#esolr.commit_timeout}],[{sync,false}]),
Pendings2 = gb_trees:insert(C_RequestId,{auto,auto_commit},Pendings),
error_logger:info_report([{auto_commit,send}]),
{noreply,State#esolr{pending=Pendings2,dirty=false}}; true -> {noreply,State#esolr{pending=Pendings}}
end.


% @hidden
handle_info({http,{RequestId,HttpResponse}},State = #esolr{pending=P}) ->
case gb_trees:lookup(RequestId,P) of
{value,{Client,RequestOp}} -> handle_http_response(HttpResponse,RequestOp,Client),
{noreply,State#esolr{pending=gb_trees:delete(RequestId,P)}};
none -> {noreply,State}
%% the requestid isn't here, probably the request was deleted after a timeout
end; parse_search_response(Response,Client) ->
{value,{"response",{obj,SearchRespFields}},RestResponse} = lists:keytake("response",1, Response),
{value,{"docs",Docs},RespFields} = lists:keytake("docs",1,SearchRespFields),
gen_server:reply(Client,{ok,RespFields,[{doc,DocFields} || {obj,DocFields}<-Docs],RestResponse}).

Eshell V5.10.2 (abort with ^G)
1> xmerl:export_simple([{commit,[]}],xmerl_xml).
["<?xml version=\"1.0\"?>",[["<","commit","/>"]]]
2>
HTTPResponse解析还会用到xmerl_scan,xmerl_xpath

handle_http_response({{_HttpV,200,_Reason},_Headers,Data},Op,Client) ->
{Response,[]} = xmerl_scan:string(binary_to_list(Data)),
[Header] = xmerl_xpath:string("/response/lst[@name='responseHeader']",Response),
case parse_xml_response_header(Header) of
{ok,QTime} -> parse_xml_response(Op,Response,QTime,Client);
{error,Error} -> response_error(Op,Client,Error)
end;

除了XML之外,还要解析JSON,这里使用的是RFC4627.
扩展

当Erlang遇到Solr的更多相关文章
- [Erlang 0104] 当Erlang遇到Solr
Joe Armstrong的访谈中有一段关于"打开黑盒子"的阐述,给我留下很深的印象:Joe Armstrong在做XWindows开发时没有使用对应的类库,而是在了解XW ...
- Apache Solr vs Elasticsearch
http://solr-vs-elasticsearch.com/ Apache Solr vs Elasticsearch The Feature Smackdown API Feature Sol ...
- solr服务中集成IKAnalyzer中文分词器、集成dataimportHandler插件
昨天已经在Tomcat容器中成功的部署了solr全文检索引擎系统的服务:今天来分享一下solr服务在海量数据的网站中是如何实现数据的检索. 在solr服务中集成IKAnalyzer中文分词器的步骤: ...
- Solr 排除查询
前言 solr排除查询也就是我们在数据库和程序中经常处理的不等于,solr的语法是在定语前加[-].. StringBuilder sbHtml=new StringBuilder(); shBhtm ...
- Solr高级查询Facet
一.什么是facet solr种以导航为目的的查询结果成为facet,在用户查询的结果上根据分类增加了count信息,然后用户根据count信息做进一步搜索. facet主要用于导航实现渐进式精确搜索 ...
- [Solr] (源) Solr与MongoDB集成,实时增量索引
一. 概述 大量的数据存储在MongoDB上,需要快速搜索出目标内容,于是搭建Solr服务. 另外一点,用Solr索引数据后,可以把数据用在不同的项目当中,直接向Solr服务发送请求,返回xml.js ...
- sorl6.0+jetty+mysql搭建solr服务
1.下载solr 官网:http://lucene.apache.org/solr/ 2.目录结构如下 3.启动solr(默认使用jetty部署) 在path路径下将 bin文件夹对应的目录加入,然后 ...
- Solr Facet 默认值
前言 今天在用Solr Facet遇到了默认值的问题,我用Facet.field查询发现数据总共100条,刚开始没有注意,发现少个别数据,但是用这几个个别的id查询又能查出来数据.才发现是Facet默 ...
- solr添加多个core
在D:\solr\solr_web\solrhome文件夹下: 1)创建core0文件夹 2)复制D:\solr\solr_web\solrhome\configsets\basic_configs/ ...
随机推荐
- ActivityLifeCycle官方demo分解
1.左右Activity生命周期的若干条款: p=330">http://1.duoinfo.sinaapp.com/? p=330 http://1.duoinfo.sinaapp. ...
- mysql监控、性能调优及三范式理解
原文:mysql监控.性能调优及三范式理解 1监控 工具:sp on mysql sp系列可监控各种数据库 2调优 2.1 DB层操作与调优 2.1.1.开启慢查询 在My.cnf文件中添加如 ...
- mysql分表分库
单库单表 单库单表是最常见的数据库设计,例如,有一张用户(user)表放在数据库db中,所有的用户都可以在db库中的user表中查到. 单库多表 随着用户数量的增加,user表的数据量会越来越大,当数 ...
- 关于winlogo.exe中了“落雪”病毒的解决方法
Windows Logon Process,Windows NT 用户登陆程序,管理用户登录和退出.该进程的正常路径应是 C:\Windows\System32 且是以 SYSTEM 用户运行,若不是 ...
- PDFBox 介绍
根据官网的介绍可知,PDFBox是一个用来处理PDF文档的开源的Java工具包.这个项目运行创建PDF文档.对已有文档进行操作并且能够从文档中提取内容.它也包含了几个命令行工具.还有一点很重要,它是开 ...
- 2.1 LINQ的查询表达式
在进行LINQ查询的编写之前,首先需要了解查询表达式.查询表达式是LINQ查询的基础,也是最常用的编写LINQ查询的方法. 查询表达式由查询关键字和对应的操作数组成的表达式整体.其中,查询关键字是常用 ...
- QT最简单的程序执行过程分析(内含C++基础知识)
打开QT Creator,新建一个“应用程序-Qt Widgets Application”项目: 输入名称scdc之后点击下一步. 在“构建套件”这个页面中直接点出下一步,然后再输入自己的类名Dat ...
- 六大利器助Java程序开发事半功倍
实用的开发工具对于Java程序开发者来说,工作起来事半功倍.本文中小编将为大家列举包括开发环境.分析测试.代码保护等实用工具. 开发环境 Sonarqube Sonarqube是一个开源平台,是一款代 ...
- SDUTOJ 2054 双向链表
#include<iostream> #include<stdlib.h> using namespace std; typedef int ElemType; typedef ...
- 【转载】深度解析Android中字体设置
原文:http://mobile.51cto.com/android-265238.htm 1.在Android XML文件中设置字体 可以采用Android:typeface,例如android:t ...

