Solr7.1---数据库导入并建立中文分词器

这里只是告诉你如何导入，生产环境不要这样部署你的solr服务。

首先修改solrConfig.xml文件

备份_default文件夹

修改solrconfig.xml

加入如下内容

官方示例：
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">

  <lst name="defaults">

    <str name="config">/path/to/my/DIHconfigfile.xml</str>

  </lst>

</requestHandler>

效果：

在conf目录建立一个db-data-config.xml文件

<dataConfig>

    <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/demo" user="root" password="" />

    <document>

        <entity name="bless" query="select * from bless"

                deltaQuery="select bless_id from bless where bless_time > '${dataimporter.last_index_time}'">

            <field column="BLESS_ID" name="blessId" />

            <field column="BLESS_CONTENT" name="blessContent" />

            <field column="BLESS_TIME" name="blessTime" />

        </entity>

    </document>

</dataConfig>

我的数据库

复制jar

找到这个：

连同mysql驱动包一起复制到

找到自带的中文分词器

复制到webapp的lib目录

修改managed-shchema

在最后加入如下中文配置

    <!-- ChineseAnalyzer -->

    <fieldType name="solr_cnAnalyzer" class="solr.TextField" positionIncrementGap="100">

      <analyzer type="index">

        <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>

      </analyzer>

      <analyzer type="query">

        <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>

      </analyzer>

    </fieldType>

下面以cloud模式启动

整个过程只需要输入索引集合的名称，其他都是一路回车。

D:\>cd solr-7.1.

D:\solr-7.1.>bin\solr start -e cloud

Welcome to the SolrCloud example!

This interactive session will help you launch a SolrCloud cluster on your local

workstation.

To begin, how many Solr nodes would you like to run in your local cluster? (spec

ify - nodes) []:

【回车】

Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.

Please enter the port for node1 []:

【回车】

Please enter the port for node2 []:

【回车】

Solr home directory D:\solr-7.1.\example\cloud\node1\solr already exists.

D:\solr-7.1.\example\cloud\node2 already exists.

Starting up Solr on port  using command:

"D:\solr-7.1.0\bin\solr.cmd" start -cloud -p  -s "D:\solr-7.1.0\example\clou

d\node1\solr"

Waiting up to  to see Solr running on port 

Starting up Solr on port  using command:

"D:\solr-7.1.0\bin\solr.cmd" start -cloud -p  -s "D:\solr-7.1.0\example\clou

d\node2\solr" -z localhost:9983

Started Solr server on port . Happy searching!

Waiting up to  to see Solr running on port

INFO  - -- ::02.823; org.apache.solr.client.solrj.impl.ZkClientClust

erStateProvider; Cluster at localhost: ready

Now let's create a new collection for indexing documents in your 2-node cluster.

Please provide a name for your new collection: [gettingstarted]

Started Solr server on port . Happy searching!

bless【输入名称并回车】

How many shards would you like to split bless into? []

【回车】

How many replicas per shard would you like to create? []

【回车】

Please choose a configuration for the bless collection, available options are:

_default or sample_techproducts_configs [_default]

【回车】

Created collection 'bless' with  shard(s),  replica(s) with config-set 'bless'

Enabling auto soft-commits with maxTime  secs using the Config API

POSTing request to Config API: http://localhost:8983/solr/bless/config

{"set-property":{"updateHandler.autoSoftCommit.maxTime":""}}

Successfully set-property updateHandler.autoSoftCommit.maxTime to 

SolrCloud example running, please visit: http://localhost:8983/solr

D:\solr-7.1.>

下面访问

选择bless

然后选择Schema，来配置字段【注意：这里的名字要与数据库中的字段名一模一样！！！】

bless_id

bless_content

bless_time

点击DataImport

要注意勾选Auto-Refresh Status

现在点击Query。可以看到，数据库中的数据都导入了。

下面看一下中文分词

看起来还不错。查询试试看。

发现0条数据，至少也得有一条啊！然而如果我指定默认搜索字段。会发现出来了。

试试搜索【心】

关于数据库的文件，大家如果想要用来测试可以GitHub

Solr7.1---数据库导入并建立中文分词器的更多相关文章

Solr7.2.1环境搭建和配置ik中文分词器
solr7.2.1环境搭建和配置ik中文分词器安装环境:Jdk 1.8. windows 10 安装包准备: solr 各种版本集合下载:http://archive.apache.org/dist ...
solr 7+tomcat 8 + mysql实现solr 7基本使用(安装、集成中文分词器、定时同步数据库数据以及项目集成)
基本说明 Solr是一个开源项目,基于Lucene的搜索服务器,一般用于高级的搜索功能: solr还支持各种插件(如中文分词器等),便于做多样化功能的集成: 提供页面操作,查看日志和配置信息,功能全面 ...
真分布式SolrCloud+Zookeeper+tomcat搭建、索引Mysql数据库、IK中文分词器配置以及web项目中solr的应用(1)
版权声明:本文为博主原创文章,转载请注明本文地址.http://www.cnblogs.com/o0Iris0o/p/5813856.html 内容介绍: 真分布式SolrCloud+Zookeepe ...
solr7.2安装实例，中文分词器
一.安装实例 1.创建实例目录 [root@node004]# mkdir -p /usr/local/solr/home/jonychen 2.复制实例相关配置文件 [root@node004]# ...
Solr7.3.0入门教程，部署Solr到Tomcat，配置Solr中文分词器
solr 基本介绍 Apache Solr (读音: SOLer) 是一个开源的搜索服务器.Solr 使用 Java 语言开发,主要基于 HTTP 和 Apache Lucene 实现.Apache ...
Lucene全文检索_分词_复杂搜索_中文分词器
1 Lucene简介 Lucene是apache下的一个开源的全文检索引擎工具包. 1.1 全文检索(Full-text Search) 1.1.1 定义全文检索就是先分词创建索引,再执行搜索的过 ...
11大Java开源中文分词器的使用方法和分词效果对比，当前几个主要的Lucene中文分词器的比较
本文的目标有两个: 1.学会使用11大Java开源中文分词器 2.对比分析11大Java开源中文分词器的分词效果本文给出了11大Java开源中文分词的使用方法以及分词结果对比代码,至于效果哪个好,那 ...
Elasticsearch系列---使用中文分词器
前言前面的案例使用standard.english分词器,是英文原生的分词器,对中文分词支持不太好.中文作为全球最优美.最复杂的语言,目前中文分词器较多,ik-analyzer.结巴中文分词.THU ...
ElasticSearch7.3学习(十五)----中文分词器(IK Analyzer)及自定义词库
1. 中文分词器 1.1 默认分词器先来看看ElasticSearch中默认的standard 分词器,对英文比较友好,但是对于中文来说就是按照字符拆分,不是那么友好. GET /_analyze ...

随机推荐

Xilinx ISE 14.1利用Verilog产生clock
<一>建立如下的Verilog Module module myClock( input clock ); endmodule <二>建立 Verilog Test Fixtu ...
JsDoc脚本注释文档生成
使用jsDoc可使用特定注释,将注释的内容生成文档,可用于生成脚本库的API文档 jsdoc 文档: http://usejsdoc.org/
单元测试框架 unittest 的运行方法if __name__ == '__main__': unittest.main()
1. if __name__ == '__main__': unittest.main()2. 测试用例实例根据测试的特点分组在一起. unittest为此提供了一个机制:测试套件由unittest' ...
python实现查有道词典
因为要考英语四级,所以我今天一大早就起来被英语单词,但是作为英语渣渣的我,只能是在网页上挨个查单词的意思.查的多了,心生厌倦,便想着如何才能在终端下查单词,那样速度不就很快了? NOW,我仔细观察每次 ...
一脚踏进Memcached的大门
Memcached 是一个高性能的分布式内存对象缓存系统,用于动态Web应用以减轻数据库负载.它通过在内存中缓存数据和对象来减少读取数据库的次数,从而提高动态.数据库驱动网站的速度.Memcached ...
Java微信公众平台开发_07_JSSDK图片上传
一.本节要点 1.获取jsapi_ticket //2.获取getJsapiTicket的接口地址,有效期为7200秒 private static final String GET_JSAPITIC ...
linux下mysql启动出错
1.刚安装完就启动出错,是因为没有开msql服务,开启即可,service mysql start 2.MySQL: mysql is not running but lock exists rm / ...
Node.js初探之POST方式传输
小知识:POST比GET传输的数据量大很多 POST发数据--"分段" 实例: 准备一个form.html文件: <!DOCTYPE html> <html> ...
C# 可空引用类型
可空引用类型是C#8.0计划新增的一个功能,不过已经发布了预览版本,今天我们来体验一下可空引用类型. 安装您必须下载Visual Studio 2017 15.5预览版(目前最新发布版本是15.4) ...
初生牛犊不怕虎 golang入坑系列
读前必读,下面所有内容都是来自这里. 放到这里的目的,就是为了比对一下,哪里的读者多.平心而论,同样的Markdown,博客园排版真心X看,怎么瞅怎么X看.(X := '难' || X :='耐' | ...