Sphinx delta (incremental) indexing

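First, create a counter table that stores the highest record ID in the data table at the time the main index was last built.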

CREATE TABLE `sph_counter` (
`id` int(11) unsigned NOT NULL,
`max_id` int(11) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='max record id for sphinx delta indexing';

# Main index source

source test
{
    type                    = mysql
    sql_host                = localhost
    sql_user                = root
    sql_pass                = 8888
    sql_db                  = test
    sql_port                = 3306
    sql_query_pre           = SET NAMES utf8
    # Store the current maximum id in the counter table before the main index is built
    sql_query_pre           = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM test where status=1

    # Select the same columns as the delta source: indexes can only be merged when their fields match
    sql_query               = select * from test where id<(select max_id from sph_counter where id=1) and status = 1

    # Without the id< condition here, merging the indexes reports a field count mismatch:
    # FATAL: failed to merge index 'test_delta' into index 'test': fulltext fields count mismatch (me=/usr/local/sphinx/var/data/test, in=/usr/local/sphinx/var/data/test_delta, myfields=4, infields=5)
    sql_query_info          = select * from test where id = $id
}

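# Delta index source
source test_delta : test
{
    # Overriding sql_query_pre drops the inherited values, so the counter is not updated when the delta is built
    sql_query_pre           = SET NAMES utf8
    sql_query               = select * from test where id>=(select max_id from sph_counter where id=1) and status = 1
    sql_query_info          = select * from test where id = $id
}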
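
To make the split between the two sources concrete, here is a minimal sketch that replays their queries against an in-memory SQLite database. SQLite stands in for MySQL only so that the example is self-contained; the sample rows and the helper names (build_demo_db, snapshot_counter, main_ids, delta_ids) are assumptions made for illustration and are not part of the Sphinx setup.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the `test` table and `sph_counter` (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE test (id INTEGER PRIMARY KEY, title TEXT, status INTEGER);
        CREATE TABLE sph_counter (id INTEGER PRIMARY KEY, max_id INTEGER NOT NULL);
        """
    )
    conn.executemany(
        "INSERT INTO test (id, title, status) VALUES (?, ?, ?)",
        [(1, "a", 1), (2, "b", 1), (3, "c", 0), (4, "d", 1)],
    )
    return conn


def snapshot_counter(conn):
    """Same statement as the main source's second sql_query_pre."""
    conn.execute("REPLACE INTO sph_counter SELECT 1, MAX(id) FROM test WHERE status = 1")


def main_ids(conn):
    """Ids the main source's sql_query would pick up."""
    rows = conn.execute(
        "SELECT id FROM test "
        "WHERE id < (SELECT max_id FROM sph_counter WHERE id = 1) AND status = 1 "
        "ORDER BY id"
    )
    return [row[0] for row in rows]


def delta_ids(conn):
    """Ids the delta source's sql_query would pick up."""
    rows = conn.execute(
        "SELECT id FROM test "
        "WHERE id >= (SELECT max_id FROM sph_counter WHERE id = 1) AND status = 1 "
        "ORDER BY id"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    snapshot_counter(conn)  # what a main index build does first
    print("main:", main_ids(conn), "delta:", delta_ids(conn))
    # New rows arrive while the counter stays where the last main build left it
    conn.executemany(
        "INSERT INTO test (id, title, status) VALUES (?, ?, ?)",
        [(5, "e", 1), (6, "f", 1)],
    )
    print("main:", main_ids(conn), "delta:", delta_ids(conn))
```

Rows added after the snapshot fall only into the delta query until the main source runs again and moves the counter forward.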

# Main index

index test
{
    source              = test            # name of the corresponding source
    path                = /usr/local/sphinx/var/data/test # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    docinfo             = extern
    mlock               = 0
    morphology          = none
    min_word_len        = 2
    html_strip          = 1

    # Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath    = /usr/local/mmseg/etc/ # for BSD and Linux; must end with /
    #charset_dictpath   = etc/                  # for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
    charset_type        = zh_cn.utf-8
}
# Delta index
index test_delta:test
{
    source              = test_delta      # name of the corresponding source
    path                = /usr/local/sphinx/var/data/test_delta # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    docinfo             = extern
    mlock               = 0
    morphology          = none
    min_word_len        = 2
    html_strip          = 1

    # Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath    = /usr/local/mmseg/etc/ # for BSD and Linux; must end with /
    #charset_dictpath   = etc/                  # for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
    charset_type        = zh_cn.utf-8
}

# Global indexer settings
indexer
{
    mem_limit           = 128M
}

# searchd service settings
searchd
{
    listen              = 9312
    read_timeout        = 5
    max_children        = 30
    max_matches         = 1000
    seamless_rotate     = 0
    preopen_indexes     = 0
    unlink_old          = 1
    pid_file            = /usr/local/sphinx/var/log/searchd_mysql.pid # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    log                 = /usr/local/sphinx/var/log/searchd_mysql.log # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    query_log           = /usr/local/sphinx/var/log/query_mysql.log   # change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
    binlog_path         =                                             # an empty value disables the binlog
}

After saving the configuration file, stop the searchd process, start it again, and then rebuild the indexes.

Stop the process
/usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/csft.conf --stop

Start the process
/usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/csft.conf

Rebuild all indexes
/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --all --rotate
Build the delta index
/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf test_delta --rotate
Merge the delta index into the main index
/usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --merge test test_delta --rotate
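
The delta build and the merge are usually run back to back. The following is a rough sketch of a wrapper that does so and skips the merge when the delta build fails. The function names and the run parameter are hypothetical; the indexer path, configuration path and flags are the ones used in the commands above.

```python
import subprocess

INDEXER = "/usr/local/sphinx/bin/indexer"
CONFIG = "/usr/local/sphinx/etc/csft.conf"


def delta_cmd(delta="test_delta"):
    """Argument list for rebuilding the delta index."""
    return [INDEXER, "-c", CONFIG, delta, "--rotate"]


def merge_cmd(main="test", delta="test_delta"):
    """Argument list for merging the delta index into the main index."""
    return [INDEXER, "-c", CONFIG, "--merge", main, delta, "--rotate"]


def update_and_merge(run=subprocess.run):
    """Run the delta build, then the merge; stop at the first non-zero exit code.

    `run` is injectable (an assumption of this sketch) so the sequence can be
    exercised without a Sphinx installation.
    """
    for cmd in (delta_cmd(), merge_cmd()):
        result = run(cmd)
        if result.returncode != 0:
            return False
    return True


if __name__ == "__main__":
    raise SystemExit(0 if update_and_merge() else 1)
```

Returning a non-zero exit status lets whatever schedules the script notice that an update did not complete.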

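If merging the indexes fails with the following error:

FATAL: failed to merge index 'test_delta' into index 'test': source index preload failed: failed to open /usr/local/sphinx/var/data/test_delta.sph: No such file or directory

stop the searchd process and then start it again.

The delta index can be rebuilt from crontab every few minutes as needed, followed by an index merge. The full rebuild of the main index can be scheduled for a low-traffic period or the middle of the night.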

## Rebuild the delta index every 5 minutes

*/5 * * * * /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf test_delta --rotate > /dev/null 2>&1

## Merge the delta index into the main index every 10 minutes

*/10 * * * * /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --merge test test_delta --rotate

## Rebuild the main index at 00:05

5 0 * * * /usr/local/sphinx/bin/indexer -c /usr/local/sphinx/etc/csft.conf --all --rotate > /dev/null 2>&1
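
With a five-minute delta job and a ten-minute merge job, both start together on every tenth minute, so indexer runs may overlap. One possible way to avoid that, sketched here under assumptions, is to have each job take a lock file first and skip the round if the lock is already held. The helper names and any lock path you pass in are hypothetical, and fcntl.flock is available on Unix only.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Yield True if the lock file at `path` was obtained, False if another holder has it."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking: fail immediately instead of waiting for the other job
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def run_job(job, lock_path):
    """Call `job()` while holding the lock; return None if another job is still running."""
    with exclusive_lock(lock_path) as acquired:
        if not acquired:
            return None
        return job()
```

Each cron entry would then call a small script that wraps its indexer command in run_job with a shared lock path, so a merge never starts while a delta build is still in progress.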
