Django之站内搜索-Solr,Haystack
java -version 不多说 solr 是java 开发的
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
Download Solr:
http://archive.apache.org/dist/lucene/solr/
下载--》解压--》solr-4.10.4/example
SimilarFacedeMacBook-Pro:example similarface$ java -jar start.jar
长长的log我也是醉了

SimilarFacedeMacBook-Pro:solr similarface$ pwd
/Users/similarface/Downloads/solr-4.10.4/example/solr
SimilarFacedeMacBook-Pro:solr similarface$ tree newblog/
newblog/
├── conf
│ ├── _rest_managed.json
│ ├── lang
│ │ └── stopwords_en.txt
│ ├── protwords.txt
│ ├── schema.xml
│ ├── solrconfig.xml
│ ├── stopwords.txt
│ └── synonyms.txt
├── core.properties
└── data SimilarFacedeMacBook-Pro:solr similarface$ cat newblog/conf/solrconfig.xml
<?xml version="1.0" encoding="utf-8" ?>
<config>
<luceneMatchVersion>LUCENE_36</luceneMatchVersion>
<requestHandler name="/select" class="solr.StandardRequestHandler"
default="true"/>
<requestHandler name="/update" class="solr.UpdateRequestHandler"/>
<requestHandler name="/admin" class="solr.admin.AdminHandlers"/>
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
<lst name="invariants">
<str name="qt">search</str>
<str name="q">*:*</str>
</lst>
</requestHandler>
</config>
schema.xml默认写下面的:
<?xml version="1.0" ?>
<schema name="default" version="1.5">
</schema>
上面怎么做,bitch ,mkdir and vim


pip install django-haystack==2.4.0
pip install pysolr==3.3.2
settings.py
[haystack ,HAYSTACK_CONNECTIONS]
INSTALLED_APPS = [
...
'haystack',
...
] HAYSTACK_CONNECTIONS = {
'default': {
'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
'URL': 'http://127.0.0.1:8983/solr/myblog'
},
}
#coding:utf-8
__author__ = 'similarface' from haystack import indexes
from .models import Post
class PostIndex(indexes.SearchIndex,indexes.Indexable):
'''
给文章Post 建立索引
'''
text = indexes.CharField(document=True, use_template=True)
publish = indexes.DateTimeField(model_attr='publish')
def get_model(self):
return Post
def index_queryset(self, using=None):
return self.get_model().published.all()
SimilarFacedeMacBook-Pro:templates similarface$ tree search/
search/
└── indexes
└── myblog
└── post_text.txt
SimilarFacedeMacBook-Pro:templates similarface$ pwd
/Users/similarface/PycharmProjects/StudyDjango/myblog/templates
SimilarFacedeMacBook-Pro:templates similarface$ cat search/indexes/myblog/post_text.txt
{{ object.title }}
{{ object.tags.all|join:", " }}
{{ object.content }}
SimilarFacedeMacBook-Pro:StudyDjango similarface$ python manage.py build_solr_schema
/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pytz/__init__.py:29: UserWarning: Module email was already imported from /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/email/__init__.pyc, but /Library/Python/2.7/site-packages is being added to sys.path
from pkg_resources import resource_stream
/Library/Python/2.7/site-packages/django/core/management/base.py:265: RemovedInDjango110Warning: OptionParser usage for Django management commands is deprecated, use ArgumentParser instead
RemovedInDjango110Warning) /Library/Python/2.7/site-packages/haystack/management/commands/build_solr_schema.py:56: RemovedInDjango110Warning: render() must be called with a dict, not a Context.
return t.render(c) Save the following output to 'schema.xml' and place it in your Solr configuration directory.
-------------------------------------------------------------------------------------------- <?xml version="1.0" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
--> <schema name="default" version="1.5">
<types>
<fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
<fieldtype name="binary" class="solr.BinaryField"/> <!-- Numeric field types that manipulate the value into
a string value that isn't human-readable in its internal form,
but with a lexicographic ordering the same as the numeric ordering,
so that range queries work correctly. -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/> <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/> <fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>
<!-- A Trie based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/> <fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/>
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<fieldtype name="geohash" class="solr.GeoHashField"/> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType> <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
<filter class="solr.EnglishMinimalStemFilterFactory"/>
-->
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="lang/stopwords_en.txt"
enablePositionIncrements="true"
/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
<filter class="solr.EnglishMinimalStemFilterFactory"/>
-->
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType> <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType> <fieldType name="ngram" class="solr.TextField" >
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType> <fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
</analyzer>
</fieldType>
</types> <fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="django_ct" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="django_id" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="_version_" type="long" indexed="true" stored ="true"/> <dynamicField name="*_i" type="int" indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_l" type="long" indexed="true" stored="true"/>
<dynamicField name="*_t" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
<dynamicField name="*_f" type="float" indexed="true" stored="true"/>
<dynamicField name="*_d" type="double" indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
<dynamicField name="*_p" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/> <field name="text" type="text_en" indexed="true" stored="true" multiValued="false" /> <field name="publish" type="date" indexed="true" stored="true" multiValued="false" /> </fields> <!-- field to use to determine and enforce document uniqueness. -->
<uniqueKey>id</uniqueKey> <!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>text</defaultSearchField> <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND"/>
</schema> #Save the following output to 'schema.xml' and place it in your Solr configuration directory. vi ~/Downloads/solr-4.10.4/example/solr/myblog/conf/schema.xml

SimilarFacedeMacBook-Pro:StudyDjango similarface$ python manage.py rebuild_index
from pkg_resources import resource_stream
/Library/Python/2.7/site-packages/django/core/management/base.py:265: RemovedInDjango110Warning: OptionParser usage for Django management commands is deprecated, use ArgumentParser instead
RemovedInDjango110Warning) WARNING: This will irreparably remove EVERYTHING from your search index in connection 'default'.
Your choices after this are to restore from backups or rebuild via the `rebuild_index` command.
Are you sure you wish to continue? [y/N] y
Removing all documents from your index because you said so.
Failed to clear Solr index: [Reason: Error 404 Not Found]
All documents removed.
Indexing 8 posts
/Library/Python/2.7/site-packages/haystack/fields.py:137: RemovedInDjango110Warning: render() must be called with a dict, not a Context.
return t.render(Context({'object': obj})) Failed to add documents to Solr: [Reason: Error 404 Not Found]


Django之站内搜索-Solr,Haystack的更多相关文章
- 利用Solr服务建立的站内搜索雏形---solr1
最近看完nutch后总感觉像好好捯饬下solr,上次看到老大给我展现了下站内搜索我便久久不能忘怀.总觉着之前搭建的nutch配上solr还是有点呆板,在nutch爬取的时候就建立索引到solr服务下, ...
- 利用Solr服务建立的站内搜索雏形
最近看完nutch后总感觉像好好捯饬下solr,上次看到老大给我展现了下站内搜索我便久久不能忘怀.总觉着之前搭建的nutch配上solr还是有点呆板,在nutch爬取的时候就建立索引到solr服务下, ...
- 一步步开发自己的博客 .NET版(5、Lucenne.Net 和 必应站内搜索)
前言 这次开发的博客主要功能或特点: 第一:可以兼容各终端,特别是手机端. 第二:到时会用到大量html5,炫啊. 第三:导入博客园的精华文章,并做分类.(不要封我) 第四:做 ...
- Lucene.net站内搜索—6、站内搜索第二版
目录 Lucene.net站内搜索—1.SEO优化 Lucene.net站内搜索—2.Lucene.Net简介和分词Lucene.net站内搜索—3.最简单搜索引擎代码Lucene.net站内搜索—4 ...
- Lucene.net站内搜索—5、搜索引擎第一版实现
目录 Lucene.net站内搜索—1.SEO优化 Lucene.net站内搜索—2.Lucene.Net简介和分词Lucene.net站内搜索—3.最简单搜索引擎代码Lucene.net站内搜索—4 ...
- Lucene.net站内搜索—4、搜索引擎第一版技术储备(简单介绍Log4Net、生产者消费者模式)
目录 Lucene.net站内搜索—1.SEO优化 Lucene.net站内搜索—2.Lucene.Net简介和分词Lucene.net站内搜索—3.最简单搜索引擎代码Lucene.net站内搜索—4 ...
- Lucene.net站内搜索—3、最简单搜索引擎代码
目录 Lucene.net站内搜索—1.SEO优化 Lucene.net站内搜索—2.Lucene.Net简介和分词Lucene.net站内搜索—3.最简单搜索引擎代码Lucene.net站内搜索—4 ...
- Lucene.net站内搜索—2、Lucene.Net简介和分词
目录 Lucene.net站内搜索—1.SEO优化 Lucene.net站内搜索—2.Lucene.Net简介和分词Lucene.net站内搜索—3.最简单搜索引擎代码Lucene.net站内搜索—4 ...
- Lucene.net站内搜索—1、SEO优化
目录 Lucene.net站内搜索—1.SEO优化 Lucene.net站内搜索—2.Lucene.Net简介和分词Lucene.net站内搜索—3.最简单搜索引擎代码Lucene.net站内搜索—4 ...
随机推荐
- FOJ Problem 2273 Triangles
Problem 2273 Triangles Accept: 201 Submit: 661Time Limit: 1000 mSec Memory Limit : 262144 KB P ...
- TextBox取不到值及其TextBox取不到js赋的值
原文发布时间为:2009-10-22 -- 来源于本人的百度文章 [由搬家工具导入] 原因:使用了一个只读的TextBox控件 曾经遇到过这样的问题:使用了一个只读的TextBox控件,但是在后台代码 ...
- bugs view:
Expecially those business bugs! I should check better especially when data changes! This place requi ...
- usb 2.0 支援的速度
from http://www.usb.org/developers/docs/usb20_docs/ high speed : 480 Mb/s full speed : 12 Mb/s low s ...
- git的使用学习(九)搭建git服务器
搭建Git服务器 在远程仓库一节中,我们讲了远程仓库实际上和本地仓库没啥不同,纯粹为了7x24小时开机并交换大家的修改. GitHub就是一个免费托管开源代码的远程仓库.但是对于某些视源代码如生命的商 ...
- Codeforces Gym100971 L.Chess Match (IX Samara Regional Intercollegiate Programming Contest Russia, Samara, March 13)
这个题就是两个队,看最多能赢的个数,然后比较一下,看两个队是都能赢彼此,还是只有一个队赢的可能性最大.表达能力不好,意思差不多... 和田忌赛马有点像,emnnn,嗯. 代码: 1 #include& ...
- LINUX CP命令直接覆盖不提示按Y/N的方法
refer to: https://blog.csdn.net/qq_36741436/article/details/78732201 cp覆盖时,无论加什么参数-f之类的还是提示是否覆盖,当文件比 ...
- 首次尝试Flink的一些感受
最近打算研究研究 Flink,根据官方文档写个 Hello,World.入门还是比较容易的,不需要复杂的安装环境.配置.这篇文章简单介绍 Flink 的使用感受以及入门. 感受 搭建环境方便:Flin ...
- Graphs (Cakewalk) 1 B - medium
Discription Bear Limak examines a social network. Its main functionality is that two members can bec ...
- HDU4004
题目大意,有一条长度为L和河流,中间穿插n个石凳,青蛙跳m次经过石凳后到达对岸,求青蛙每次跳跃的最大距离的最小值 本题数据量大n<500000,显然简单的o(n*n)算法是通过不了的,在输入大量 ...