When Erlang Meets Solr
Solr
Solr (pronounced "solar") is an open source enterprise search platform from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Because it provides distributed search and index replication, Solr scales well. At the time of writing it was one of the most widely used enterprise search engines, and Solr 4 added NoSQL features. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Apache Tomcat or Jetty. It uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages. Solr's external configuration allows it to be tailored to many types of application without Java coding, and it has a plugin architecture to support more advanced customization.
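Because the HTTP API is plain URLs plus query parameters, a client in any language mostly needs to build a query string. The following is a minimal sketch in Erlang, not part of esolr: the module name `solr_url`, the function `select_url/2`, and the host and core in the usage note are hypothetical, and `uri_string:compose_query/1` requires OTP 21 or later.

```erlang
-module(solr_url).
-export([select_url/2]).

%% Build a Solr select URL from a base URL and a list of
%% {Key, Value} string pairs (hypothetical helper, not from esolr).
%% uri_string:compose_query/1 percent-encodes the pairs and joins
%% them with "&".
-spec select_url(string(), [{string(), string()}]) -> string().
select_url(BaseUrl, Params) ->
    BaseUrl ++ "?" ++ uri_string:compose_query(Params).
```

For example, `solr_url:select_url("http://localhost:8983/solr/demo/select", [{"q", "searching"}, {"wt", "json"}])` yields a URL that asks the (assumed) `demo` core for JSON results matching `searching`.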
esolr
|> Delete documents esolr:delete/2
|> Search esolr:search/3

%% Test code
-module(t).
-compile(export_all).

start() ->
    SearchUrl = "http://192.168.0.160:8080/solr/hear_me/select",
    UpdateUrl = "http://192.168.0.160:8080/solr/hear_me/update",
    MltUrl = "http://192.168.0.160:8080/solr/hear_me/mlt",
    {ok, Pid} = esolr:start([{select_url, SearchUrl}, {update_url, UpdateUrl}, {morelikethis_url, MltUrl}]),
    Pid.

search(SolrPid) ->
    esolr:search("10", [{fields, "*,*"}], SolrPid).

add(SolrPid) ->
    esolr:add([{doc, [{id, "ai234"}, {text, <<"Look me mom!, I'm searching now">>}]}], SolrPid),
    esolr:add([{doc, [{id, "a3456"}, {text, <<"Look me mom!, I'm searching now">>}]}], SolrPid),
    esolr:commit(SolrPid).

The test results are as follows:

Eshell V5.9 (abort with ^G)
1> P=t:start().
<0.34.0>
2> t:add(P).
ok
3> esolr:search("searching",[{fields,"*,*"}],P).
{ok,[{"numFound",2},{"start",0}],
[{doc,[{"id",<<"ai234">>},
{"_version_",1440978100186775552}]},
{doc,[{"id",<<"a3456">>},
{"_version_",1440978100212989952}]}],
[]}
4> t:search(P).
{ok,[{"numFound",9},{"start",0}],
[{doc,[{"c_type",1},
{"c_tags",
[<<"女人">>,
<<230,148,190,229,188,131>>,
<<"家庭">>,
<<229,165,179,229,143,139>>,
<<229,165,179,229,173,169,229,173,144>>,
<<229,176,143,229,173,169,229,173,144>>,
<<231,166,187,229,169,154>>,
<<229,135,186,230,137,139>>,
<<229,133,132,229,188,159>>]},
{"c_pub_date",<<"2013-07-12T16:29:11.593Z">>},
{"id",<<"97">>},
{"_version_",1440342611812417536}]},
{doc,[{"c_type",1},
{"c_tags",
[<<231,189,145,231,187,156>>,
<<229,165,179,229,143,139>>,
<<228,187,139,231,187,141>>,
<<233,171,152,228,184,173>>,
<<229,144,140,229,173,166>>,
<<230,156,139,229,143,139>>,
<<229,140,151,228,186,172>>,
..... ...
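The search result is an ordinary Erlang term, so it can be taken apart with pattern matching. The sketch below assumes the `{ok, RespFields, Docs, Rest}` shape seen in the session above; the module `solr_result` and its functions are hypothetical helpers, not part of esolr.

```erlang
-module(solr_result).
-export([ids/1, num_found/1]).

%% Collect the "id" field of every document in a search result.
%% Documents without an "id" field are skipped by the generator pattern.
ids({ok, _RespFields, Docs, _Rest}) ->
    [Id || {doc, Fields} <- Docs, {"id", Id} <- Fields].

%% Read the total hit count from the response fields.
num_found({ok, RespFields, _Docs, _Rest}) ->
    proplists:get_value("numFound", RespFields).
```

With the result of the `"searching"` query above, `solr_result:ids/1` would return the two ids as binaries and `solr_result:num_found/1` would return the hit count.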

Implementation

make_post_request(Request, PendingInfo,
                  State = #esolr{update_url = URL, pending = P, auto_commit = AC, dirty = Dirty},
                  Timeout) ->
    {ok, RequestId} = httpc:request(post, {URL, [{"connection", "close"}], "text/xml", Request},
                                    [{timeout, Timeout}], [{sync, false}]),
    Pendings = gb_trees:insert(RequestId, PendingInfo, P),
    if
        (AC == always) and Dirty ->
            CommitRequest = encode_commit(),
            {ok, C_RequestId} = httpc:request(post, {URL, [{"connection", "close"}], "text/xml", CommitRequest},
                                              [{timeout, State#esolr.commit_timeout}], [{sync, false}]),
            Pendings2 = gb_trees:insert(C_RequestId, {auto, auto_commit}, Pendings),
            error_logger:info_report([{auto_commit, send}]),
            {noreply, State#esolr{pending = Pendings2, dirty = false}};
        true ->
            {noreply, State#esolr{pending = Pendings}}
    end.
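The key idea in `make_post_request` is bookkeeping: each request is sent with `{sync, false}`, so `httpc` returns a request id immediately and later delivers the result as a message, and the id is stored in a `gb_trees` tree together with the information about who is waiting. A minimal sketch of that pattern in isolation follows; the module `pending_sketch` and its functions are hypothetical and only illustrate the technique.

```erlang
-module(pending_sketch).
-export([track/3, resolve/2]).

%% Remember what is waiting on an asynchronous request.
%% gb_trees:insert/3 crashes if the key is already present,
%% which is fine here because request ids are unique.
track(RequestId, Info, Pending) ->
    gb_trees:insert(RequestId, Info, Pending).

%% Look up a finished request and drop it from the tree.
%% Returns {Info, NewPending}, or {unknown, Pending} when the id is
%% absent (for example because it was already removed).
resolve(RequestId, Pending) ->
    case gb_trees:lookup(RequestId, Pending) of
        {value, Info} -> {Info, gb_trees:delete(RequestId, Pending)};
        none -> {unknown, Pending}
    end.
```

Keeping the tree inside the server state means a single process can have many requests in flight without blocking on any of them.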


% @hidden
handle_info({http, {RequestId, HttpResponse}}, State = #esolr{pending = P}) ->
    case gb_trees:lookup(RequestId, P) of
        {value, {Client, RequestOp}} ->
            handle_http_response(HttpResponse, RequestOp, Client),
            {noreply, State#esolr{pending = gb_trees:delete(RequestId, P)}};
        none ->
            %% the request id isn't here, probably the request was deleted after a timeout
            {noreply, State}
    end;
%% (further handle_info clauses are not shown in this excerpt)

parse_search_response(Response, Client) ->
    {value, {"response", {obj, SearchRespFields}}, RestResponse} = lists:keytake("response", 1, Response),
    {value, {"docs", Docs}, RespFields} = lists:keytake("docs", 1, SearchRespFields),
    gen_server:reply(Client, {ok, RespFields, [{doc, DocFields} || {obj, DocFields} <- Docs], RestResponse}).
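To see what `parse_search_response` does to the data, it helps to run the same `lists:keytake/3` steps on a small hand-written term. The sketch below assumes the decoded JSON uses `{obj, Fields}` tuples with string keys, as the pattern matches in the excerpt suggest; the module `search_resp_sketch` is hypothetical, and it returns the result instead of replying to a client.

```erlang
-module(search_resp_sketch).
-export([parse/1]).

%% Input: the top-level field list of a decoded Solr JSON reply
%% (assumed shape: [{Key, Value}], nested objects as {obj, Fields}).
parse(Response) ->
    %% Split off the "response" object; Rest keeps everything else,
    %% such as the response header.
    {value, {"response", {obj, RespFields0}}, Rest} =
        lists:keytake("response", 1, Response),
    %% Split the document list from numFound, start, and so on.
    {value, {"docs", Docs}, RespFields} =
        lists:keytake("docs", 1, RespFields0),
    {ok, RespFields, [{doc, Fields} || {obj, Fields} <- Docs], Rest}.
```

The returned tuple has the same four-element shape as the search results shown in the shell session earlier.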

Eshell V5.10.2 (abort with ^G)
1> xmerl:export_simple([{commit,[]}],xmerl_xml).
["<?xml version=\"1.0\"?>",[["<","commit","/>"]]]
2>
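The same `xmerl:export_simple/2` call can build larger update messages from nested tuples. As a rough sketch, assuming Solr's XML update format of `<add><doc><field name="...">...</field></doc></add>`, a document could be encoded as below; the module `xml_sketch` and `encode_add/1` are hypothetical and are not claimed to match esolr's own encoder.

```erlang
-module(xml_sketch).
-export([encode_add/1]).

%% Encode a list of {Name, Value} string pairs as a Solr <add> message
%% holding one document, using xmerl's "simple form" tuples:
%% {Tag, Attributes, Content} or {Tag, Content}.
encode_add(Fields) ->
    Doc = {doc, [{field, [{name, Name}], [Value]} || {Name, Value} <- Fields]},
    %% export_simple/2 returns a deep list; flatten it into a string.
    lists:flatten(xmerl:export_simple([{add, [Doc]}], xmerl_xml)).
```

Calling `xml_sketch:encode_add([{"id", "ai234"}])` produces an XML string in which the id appears as a `field` element with `name="id"`.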
Parsing the HTTP response also uses xmerl_scan and xmerl_xpath:

handle_http_response({{_HttpV, 200, _Reason}, _Headers, Data}, Op, Client) ->
    {Response, []} = xmerl_scan:string(binary_to_list(Data)),
    [Header] = xmerl_xpath:string("/response/lst[@name='responseHeader']", Response),
    case parse_xml_response_header(Header) of
        {ok, QTime} -> parse_xml_response(Op, Response, QTime, Client);
        {error, Error} -> response_error(Op, Client, Error)
    end;
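The body of `parse_xml_response_header` is not shown here, but the combination of `xmerl_scan` and `xmerl_xpath` can be tried on its own. The sketch below pulls the query time out of a response, assuming the header stores it as `<int name="QTime">` inside the `responseHeader` list; the module `header_sketch` and `qtime/1` are hypothetical.

```erlang
-module(header_sketch).
-export([qtime/1]).

%% Record definitions such as #xmlText{} live in xmerl.hrl.
-include_lib("xmerl/include/xmerl.hrl").

%% Parse an XML string and extract QTime from the response header.
qtime(Xml) ->
    {Doc, _Rest} = xmerl_scan:string(Xml),
    Path = "/response/lst[@name='responseHeader']/int[@name='QTime']/text()",
    case xmerl_xpath:string(Path, Doc) of
        [#xmlText{value = Value}] -> {ok, list_to_integer(Value)};
        _ -> {error, no_qtime}
    end.
```

The XPath predicate `[@name='...']` selects elements by attribute, which is the same mechanism the excerpt uses to find the `responseHeader` element.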

Besides XML, JSON also has to be parsed; the rfc4627 library is used for this.
Extensions


