2017-04-21

Django: how to improve Haystack search speed

I want to build a search engine in my Django project for a simple data structure:

| id   | company name | 
|:-----------|-----------------:| 
| 12345678 | company A's name | 
| 12345687 | peoples pizza a/s| 
| 87654321 | sub's for pugs | 

The table will hold the companies, and I only want to search them by name. When a name is found, its ID is returned to my Django app.

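I tried Haystack, Whoosh and so on in various configurations, but searches keep getting very slow as I grow my test dataset from 500 to roughly 800,000 rows. A search sometimes takes nearly an hour.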

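I am on Heroku (PaaS), so I thought I would try an integrated paid service (Searchly's Elasticsearch implementation). That helped, but once I reach about 80,000 companies it becomes very slow again.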

Installed apps

INSTALLED_APPS = [ 
    'django.contrib.admin', 
    'django.contrib.auth', 
    'django.contrib.contenttypes', 
    'django.contrib.sessions', 
    'django.contrib.sites', 

    # Added. 
    'haystack', 

    # Then your usual apps... 
] 

More of settings.py

import os 
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

es = urlparse(os.environ.get('SEARCHBOX_URL') or 'http://127.0.0.1:9200/') 

port = es.port or 80 

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': es.scheme + '://' + es.hostname + ':' + str(port),
        'INDEX_NAME': 'documents',
    },
}

if es.username: 
    HAYSTACK_CONNECTIONS['default']['KWARGS'] = {"http_auth": es.username + ':' + es.password} 
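
To make it easier to check what the settings above derive from the environment variable, here is a minimal standalone sketch of the same parsing. The URL and credentials are made up, and `build_connection` is a hypothetical helper for illustration, not part of Haystack.

```python
from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse


def build_connection(url, index_name='documents'):
    """Build a HAYSTACK_CONNECTIONS['default'] entry from one URL (hypothetical helper)."""
    es = urlparse(url)
    port = es.port or 80  # same fallback as in the settings above
    connection = {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        # Rebuild the URL without the credentials that were embedded in it
        'URL': es.scheme + '://' + es.hostname + ':' + str(port),
        'INDEX_NAME': index_name,
    }
    if es.username:
        # Credentials are handed over separately through KWARGS
        connection['KWARGS'] = {'http_auth': es.username + ':' + es.password}
    return connection


# Made-up URL, for illustration only
example = build_connection('http://user:secret@search.example.com:9200/')
print(example['URL'])
print(example['KWARGS'])
```

Note that the fallback port of 80 only matters when the URL carries no explicit port; for an `https` URL without a port, 443 would probably be the better default.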

search_indexes.py

from haystack import indexes 

from hello.models import Article 


class ArticleIndex(indexes.SearchIndex, indexes.Indexable):
    '''
    Defines the model for the search engine.
    '''
    text = indexes.CharField(document=True, use_template=True)
    pub_date = indexes.DateTimeField(model_attr='pub_date')
    # pub_date line was commented out previously
    # edge n-grams of the title, used for autocomplete
    content_auto = indexes.EdgeNgramField(model_attr='title')

    def get_model(self):
        return Article

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.all()
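
For readers unfamiliar with the `EdgeNgramField` used above, the idea can be sketched in plain Python: every word is indexed under each of its prefixes, so an autocomplete query becomes a direct lookup instead of a scan. This is a simplified model, not how Elasticsearch analyses text internally; the gram sizes (2 and 15) and all function names are assumptions made for illustration. The sample rows come from the table in the question.

```python
from collections import defaultdict

# Sample rows taken from the table in the question
COMPANIES = {
    12345678: "company A's name",
    12345687: "peoples pizza a/s",
    87654321: "sub's for pugs",
}


def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the prefixes of token that are min_gram to max_gram characters long."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def build_index(companies):
    """Map every edge n-gram of every word in a name to the set of matching IDs."""
    index = defaultdict(set)
    for company_id, name in companies.items():
        for token in name.lower().split():
            for gram in edge_ngrams(token):
                index[gram].add(company_id)
    return index


def autocomplete(index, text):
    """Return IDs whose name has a word starting with each word of the query."""
    words = text.lower().split()
    if not words:
        return set()
    matches = [index.get(word, set()) for word in words]
    return set.intersection(*matches)


index = build_index(COMPANIES)
print(autocomplete(index, "piz"))
print(autocomplete(index, "peop pizza"))
```

The trade-off is a larger index in exchange for cheap lookups, which is why the index size and the gram limits are worth watching as the dataset grows.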

article_text.txt

{{ object.title }} 
{{ object.user.get_full_name }} 
{{ object.body }} 

urls.py

url(r'^search/$', views.search_titles, name='search'), 

views.py

from django.shortcuts import render_to_response
from haystack.query import SearchQuerySet


def search_titles(request):
    txt = request.POST.get('search_text', '')
    if txt and len(txt) >= 4:
        # only query the index once at least 4 characters have been typed
        articles = SearchQuerySet().autocomplete(content_auto=txt)
    else:
        # if the POST request is empty, return nothing;
        # this prevents an internal server error with jQuery
        articles = []
    return render_to_response('scripts/ajax_search.html',
                              {'articles': articles})

search.html

{% if articles.count > 0 %} 
    <!-- simply prints the links to the cvr numbers--> 
    <!-- for article in articles --> 
    {% for article in "x"|rjust:"15" %} 
     <li><a href="{{ article.object.get_absolute_url }}">{{ article.object.title }}</a></li> 
    {% endfor %} 

{% else %} 

    <li>Try again, or try CVR + &#x23ce;</li> 

{% endif %} 

index.html (where I call the search engine)

{% csrf_token %} 
<input type="text" id="search" name="search" /> 

<!-- All company names end up in this <ul> -->
<ul id ="search-results"></ul> 

Answer


I changed the search method in my views.py to:

txt = request.POST.get('search_text', '')
articles = []
suggestedSearchTerm = ""
if txt and len(txt) >= 4:
    sqs = SearchQuerySet()
    # limit the query to the first 8 results
    sqs.query.set_limits(low=0, high=8)
    sqs = sqs.filter(content=txt)
    articles = sqs.query.get_results()
    # spelling_suggestion() can return None, so fall back to an empty string
    suggestedSearchTerm = SearchQuerySet().spelling_suggestion(txt) or ''
    if suggestedSearchTerm == txt:
        suggestedSearchTerm = ''
    else:
        suggestedSearchTerm = suggestedSearchTerm.lower()
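
Why capping the result window helps can be illustrated with a toy model that does not need Haystack at all. `FakeBackend` and `first_hits` below are hypothetical stand-ins written for this sketch; they only show that asking for the first 8 hits lets the search stop early instead of loading every match, and they make no claim about Haystack's internals.

```python
from itertools import islice


class FakeBackend:
    """Stand-in for a search backend that counts how many hits it loads (hypothetical)."""

    def __init__(self, names):
        self.names = names
        self.loaded = 0

    def search(self, text):
        """Lazily yield matching names, one hit at a time."""
        for name in self.names:
            if text in name:
                self.loaded += 1
                yield name


def first_hits(backend, text, high=8):
    """Fetch at most `high` hits, in the spirit of set_limits(low=0, high=8)."""
    return list(islice(backend.search(text), high))


backend = FakeBackend(["company %d" % i for i in range(100000)])
hits = first_hits(backend, "company")
print(len(hits), backend.loaded)
```

With the cap in place the toy backend loads only as many hits as are displayed, whereas iterating over the full result set would touch every matching row.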