Elasticlunr.js

Project site: http://elasticlunr.com/

Source code: https://github.com/weixsong/elasticlunr.js

Documentation: http://elasticlunr.com/docs/index.html

Elasticlunr.js is a lightweight full-text search engine written in JavaScript for in-browser and offline search.

Elasticlunr.js is based on Lunr.js but is more flexible: it adds query-time boosting and field search.

Elasticlunr.js is a bit like Solr, but much smaller and not as bright, while still offering flexible configuration and query-time boosting.

Key Features Compared with Lunr.js

  • Query-time boosting: you don’t need to set boosting weights while building the index, so you can try different boosting schemes without rebuilding it.
  • More rational scoring: Elasticlunr.js uses nearly the same scoring mechanism as Elasticsearch, which is also the one used by Lucene.
  • Field search: you can choose which fields to index and which fields to search.
  • Boolean model: you can set which fields to search and the Boolean logic applied to the query tokens, such as “OR” or “AND”.
  • Combined Boolean model, TF/IDF model and vector space model, which makes the result ranking more reliable.
  • Fast: Elasticlunr.js removes TokenCorpus and Vector from lunr.js. With the combined model there is no need to compute document and query vectors to measure their similarity, which improves search speed significantly.
  • Small index file: because Elasticlunr.js does not store TokenCorpus (no query or document vectors need to be computed), the index file is very small, which is especially helpful when elasticlunr.js is used for offline search.
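
The TF/IDF part of such a scoring model can be illustrated in a few lines of JavaScript. This is a minimal sketch in the style of Lucene's classic TF/IDF similarity, not elasticlunr.js's actual code; the function names and the exact formulas (square-root term frequency, 1 + ln for the inverse document frequency, inverse square-root length norm) are assumptions chosen for illustration.

```javascript
// Assumed, Lucene-style TF/IDF components (illustrative only).

// A term that occurs more often in a field contributes more, with damping.
function termFrequency(countInField) {
  return Math.sqrt(countInField);
}

// A term that occurs in fewer documents is more informative.
function inverseDocumentFrequency(totalDocs, docsWithTerm) {
  return 1 + Math.log(totalDocs / (docsWithTerm + 1));
}

// A match in a short field counts more than a match in a long field.
function fieldLengthNorm(tokensInField) {
  return 1 / Math.sqrt(tokensInField);
}

// Score of one query term against one field of one document.
function termScore(countInField, tokensInField, totalDocs, docsWithTerm) {
  return termFrequency(countInField) *
    inverseDocumentFrequency(totalDocs, docsWithTerm) *
    fieldLengthNorm(tokensInField);
}

// Same field statistics, but one term is rare and the other is common.
const rare = termScore(1, 4, 100, 1);
const common = termScore(1, 4, 100, 50);
console.log(rare > common); // true
```

Because every component is computed from simple counts, a score like this needs no stored document or query vectors, which is in line with the speed and index-size points above.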

Example

A very simple search index can be created using the following script:

var index = elasticlunr(function () {
  this.addField('title');
  this.addField('body');
  this.setRef('id');
});

Adding documents to the index is as simple as:

var doc1 = {
  "id": 1,
  "title": "Oracle released its latest database Oracle 12g",
  "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
  "id": 2,
  "title": "Oracle released its profit report of 2015",
  "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
};

index.addDoc(doc1);
index.addDoc(doc2);

Searching is just as simple:

index.search("Oracle database profit");

You can also do query-time boosting by passing in a configuration:

index.search("Oracle database profit", {
  fields: {
    title: {boost: 2},
    body: {boost: 1}
  }
});

This returns a list of matching documents with a score of how closely they match the search query:

[{
  "ref": 1,
  "score": 0.5376053707962494
},
{
  "ref": 2,
  "score": 0.5237481076838757
}]
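
Since each result carries only a ref and a score, the caller usually has to map the ref back to the original document. A minimal sketch, assuming the application keeps its own lookup table of documents; `docsById` and `resolveResults` are hypothetical names used for illustration and are not part of elasticlunr.js:

```javascript
// Hypothetical application-side store of the original documents.
// Keys are strings so the lookup works whether refs are numbers or strings.
const docsById = new Map([
  ["1", { id: 1, title: "Oracle released its latest database Oracle 12g" }],
  ["2", { id: 2, title: "Oracle released its profit report of 2015" }]
]);

// Results in the shape shown above (values copied from the sample output).
const results = [
  { ref: 1, score: 0.5376053707962494 },
  { ref: 2, score: 0.5237481076838757 }
];

// Attach the stored document to each hit; hits with an unknown ref are dropped.
function resolveResults(hits, store) {
  return hits
    .filter(function (hit) { return store.has(String(hit.ref)); })
    .map(function (hit) {
      return { score: hit.score, doc: store.get(String(hit.ref)) };
    });
}

const resolved = resolveResults(results, docsById);
console.log(resolved.map(function (r) { return r.doc.title; }));
```

The order of the result list is preserved, so the resolved documents stay ranked by score.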

API documentation is available, as well as a full working example.

Why

  1. In some systems you don’t want to deploy a web server (such as Apache or Nginx); you only serve static web pages and provide search on the client side. In that case you can build the index in advance and load it on the client side.
  2. To provide offline search. Users often download documentation; you can build the index, ship it inside the documentation package, and offer offline search.
  3. On a limited or restricted network, such as a WAN or LAN, offline search is a better choice.
  4. On mobile devices such as an iPhone or Android phone, network traffic may be very expensive, so offline search is a good choice.

Installation

Simply include the elasticlunr.js source file in the page where you want to use it. Elasticlunr.js is supported in all modern browsers.

Browsers that do not support ES5 will require a JavaScript shim for Elasticlunr.js to work. You can either use Augment.js, ES5-Shim or any library that patches old browsers to provide an ES5 compatible JavaScript environment.

Documentation

This part covers only the important aspects of elasticlunr.js; for the complete documentation, please see the API documentation.

1. Build Index

When you first create an index instance, you need to specify which fields you want to index. If you do not specify any fields, no field of your documents will be searchable.

You can specify fields like this:

var index = elasticlunr(function () {
  this.addField('title');
  this.addField('body');
  this.setRef('id');
});

You can also set the document reference with this.setRef('id'). If you do not set a document ref, elasticlunr.js uses ‘id’ by default.

The same index setup can also be written as follows:

var index = elasticlunr();
index.addField('title');
index.addField('body');
index.setRef('id');

The default supported language of elasticlunr.js is English. To index documents in other languages, use elasticlunr.js together with lunr-languages.

Assuming you’re using lunr-languages in a Node.js environment, you can import it as follows:

var lunr = require('./lib/lunr.js');
require('./lunr.stemmer.support.js')(lunr);
require('./lunr.de.js')(lunr);

var idx = lunr(function () {
  // use the language (de)
  this.use(lunr.de);
  // then, the normal lunr index initialization
  this.field('title');
  this.field('body');
});

For more details, please go to lunr-languages.

2. Add document to index

Adding a document to the index is very simple: prepare your document in JSON format, then add it to the index.

var doc1 = {
  "id": 1,
  "title": "Oracle released its latest database Oracle 12g",
  "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

var doc2 = {
  "id": 2,
  "title": "Oracle released its profit report of 2015",
  "body": "As expected, Oracle released its profit report of 2015, during the good sales of database and hardware, Oracle's profit of 2015 reached 12.5 Billion."
};

index.addDoc(doc1);
index.addDoc(doc2);

If your JSON document contains a field that is not configured in the index, that field will not be indexed, which means it is not searchable.
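
The effect of configuring fields can be pictured with a toy model that keeps one inverted index per configured field. This is only a sketch under simplifying assumptions (naive lowercase tokenization, no stemming or stop words); `buildToyIndex` is a hypothetical helper and does not reflect elasticlunr.js internals:

```javascript
// Toy model of field-restricted indexing (not elasticlunr.js internals).
function buildToyIndex(fields, docs) {
  // One inverted index per configured field: token -> Set of document ids.
  const index = {};
  fields.forEach(function (field) { index[field] = new Map(); });

  docs.forEach(function (doc) {
    fields.forEach(function (field) {
      if (typeof doc[field] !== "string") return; // field missing in this doc
      const tokens = doc[field].toLowerCase().split(/\W+/).filter(Boolean);
      tokens.forEach(function (token) {
        if (!index[field].has(token)) index[field].set(token, new Set());
        index[field].get(token).add(doc.id);
      });
    });
  });
  return index;
}

// "author" is present in the document but was never configured as a field.
const toyIndex = buildToyIndex(["title", "body"], [
  { id: 1, title: "Oracle database", body: "profit report", author: "Alice" }
]);

console.log(Object.keys(toyIndex));        // [ 'title', 'body' ]
console.log(toyIndex.title.has("oracle")); // true
```

Nothing from the `author` value ever reaches an inverted index, so no query could match it.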

3. Remove document from index

Elasticlunr.js supports removing a document from the index: just pass the JSON document to the elasticlunr.Index.prototype.removeDoc() function.

For example:

var doc = {
  "id": 1,
  "title": "Oracle released its latest database Oracle 12g",
  "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

index.removeDoc(doc);

Removing a document removes every token of each of that document’s fields from the corresponding field-specific inverted index.
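
The removal step can be sketched on a toy field-specific inverted index. The helpers below (`tokenize`, `addToToyIndex`, `removeFromToyIndex`) are hypothetical and use naive tokenization; they illustrate the idea only and are not how elasticlunr.js stores its data:

```javascript
// Toy field-specific inverted index: field -> token -> Set of document ids.
function tokenize(text) {
  return String(text).toLowerCase().split(/\W+/).filter(Boolean);
}

function addToToyIndex(index, fields, doc) {
  fields.forEach(function (field) {
    tokenize(doc[field] || "").forEach(function (token) {
      if (!index[field].has(token)) index[field].set(token, new Set());
      index[field].get(token).add(doc.id);
    });
  });
}

function removeFromToyIndex(index, fields, doc) {
  fields.forEach(function (field) {
    tokenize(doc[field] || "").forEach(function (token) {
      const postings = index[field].get(token);
      if (!postings) return;
      postings.delete(doc.id);
      // Drop the token entirely once no document contains it in this field.
      if (postings.size === 0) index[field].delete(token);
    });
  });
}

const fields = ["title", "body"];
const index = { title: new Map(), body: new Map() };
const docA = { id: 1, title: "Oracle database", body: "new release" };
const docB = { id: 2, title: "Oracle profit", body: "annual report" };

addToToyIndex(index, fields, docA);
addToToyIndex(index, fields, docB);
removeFromToyIndex(index, fields, docA);

console.log(index.title.has("database"));           // false
console.log(Array.from(index.title.get("oracle"))); // [ 2 ]
```

In this toy model the tokens to delete are derived from the document's field contents, which is one plausible reason the whole document, rather than only its id, is passed to the removal function.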

4. Update a document in index

Elasticlunr.js supports updating a document in the index: just pass the JSON document to the elasticlunr.Index.prototype.update() function.

For example:

var doc = {
  "id": 1,
  "title": "Oracle released its latest database Oracle 12g",
  "body": "Yesterday Oracle has released its new database Oracle 12g, this would make more money for this company and lead to a nice profit report of annual year."
};

index.update(doc);

5. Query from Index

Elasticlunr.js provides flexible query configuration and supports query-time boosting and Boolean logic settings.

You can pass a configuration that tells elasticlunr.js how to do query-time boosting, which fields to search, and which Boolean logic to apply.

Or you can simply provide a query string; this also works well because the scoring mechanism is effective on its own.

5.1 Simple Query

Thanks to the scoring mechanism of elasticlunr.js, a simple search is enough for most requirements.

index.search("Oracle database profit");

The output is a results array; each element is an object containing a ref field and a score field.

ref is the document reference.

score is the similarity measurement.

The results array is sorted by score in descending order.

5.2 Configuration Query

5.2.1 Query-Time Boosting

Specify which fields to search by passing in a JSON configuration, and set a boost for each search field.

If you provide this configuration, elasticlunr.js searches the query string only in the specified fields, using the given boosting weights.

The scoring mechanism used in elasticlunr.js is fairly complex; please see the details for more information.

index.search("Oracle database profit", {
  fields: {
    title: {boost: 2},
    body: {boost: 1}
  }
});
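
To see why boosting can be deferred to query time, consider a toy model in which each document already has a per-field score for the query and the configuration only decides how those scores are weighted and combined. The numbers and the `rank` helper below are made up for illustration; the real scoring in elasticlunr.js is more involved:

```javascript
// Toy illustration of query-time boosting (not elasticlunr.js's real formula).
// Per-field scores are assumed to have been computed already for one query.
const fieldScores = {
  1: { title: 0.2, body: 0.5 },
  2: { title: 0.4, body: 0.1 }
};

// Combine per-field scores with boosts supplied at query time.
// Fields that are absent from `config.fields` are ignored entirely.
function rank(scoresByDoc, config) {
  return Object.keys(scoresByDoc)
    .map(function (ref) {
      let score = 0;
      Object.keys(config.fields).forEach(function (field) {
        const boost = config.fields[field].boost;
        score += boost * (scoresByDoc[ref][field] || 0);
      });
      return { ref: ref, score: score };
    })
    .sort(function (a, b) { return b.score - a.score; });
}

// Two different boosting schemes against the same precomputed data.
const titleHeavy = rank(fieldScores, {
  fields: { title: { boost: 3 }, body: { boost: 1 } }
});
const bodyOnly = rank(fieldScores, { fields: { body: { boost: 1 } } });

console.log(titleHeavy[0].ref, bodyOnly[0].ref); // 2 1
```

The same data yields a different top hit under each scheme, and nothing has to be rebuilt between the two queries.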

5.2.2 Boolean Model

Elasticlunr.js also supports setting the Boolean logic. If no Boolean logic is set, elasticlunr.js uses “OR” by default. With the default “OR” logic, elasticlunr.js can reach a high recall.

index.search("Oracle database profit", {
  fields: {
    title: {boost: 2},
    body: {boost: 1}
  },
  boolean: "OR"
});

The Boolean operation is performed per field. This means that if you choose “AND” logic, documents that contain all the query tokens in the queried field are returned as that field’s results. If you query multiple fields, the results of the different fields are merged to give the final query results.
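
The per-field behaviour can be sketched with a small toy matcher. It ignores scoring, stemming and boosting, and `matchField` and `search` are hypothetical helpers rather than elasticlunr.js functions; the point is only that each field is evaluated on its own before the field results are merged:

```javascript
// Toy illustration of per-field Boolean matching (not elasticlunr.js internals).
function tokenize(text) {
  return String(text).toLowerCase().split(/\W+/).filter(Boolean);
}

// Return the ids of documents whose `field` satisfies the Boolean mode.
function matchField(docs, field, queryTokens, mode) {
  return docs
    .filter(function (doc) {
      const tokens = new Set(tokenize(doc[field] || ""));
      const hit = function (token) { return tokens.has(token); };
      return mode === "AND" ? queryTokens.every(hit) : queryTokens.some(hit);
    })
    .map(function (doc) { return doc.id; });
}

// Evaluate each field on its own, then merge the per-field results.
function search(docs, query, fields, mode) {
  const queryTokens = tokenize(query);
  const merged = new Set();
  fields.forEach(function (field) {
    matchField(docs, field, queryTokens, mode).forEach(function (id) {
      merged.add(id);
    });
  });
  return Array.from(merged).sort(function (a, b) { return a - b; });
}

const docs = [
  { id: 1, title: "Oracle database", body: "profit grows" },
  { id: 2, title: "Oracle profit report", body: "database sales" },
  { id: 3, title: "Weather", body: "sunny" }
];

console.log(search(docs, "oracle profit", ["title", "body"], "AND")); // [ 2 ]
console.log(search(docs, "oracle profit", ["title", "body"], "OR"));  // [ 1, 2 ]
```

Document 1 contains both query tokens, but spread across two fields, so it fails the “AND” test in each individual field and only shows up under “OR”.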
