2015-06-23 25 views

Scrapy does not export data to Elasticsearch

I want to index my items in Elasticsearch, and I found this pipeline (scrapyelasticsearch).

However, when I try to crawl a website, I get the following error:

  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapyelasticsearch/scrapyelasticsearch.py", line 70, in process_item
    self.index_item(item)
  File "/usr/local/lib/python2.7/dist-packages/scrapyelasticsearch/scrapyelasticsearch.py", line 52, in index_item
    local_id = hashlib.sha1(item[uniq_key]).hexdigest()
  File "/home/javed/.local/lib/python2.7/site-packages/scrapy/item.py", line 50, in __getitem__
    return self._values[key]
exceptions.KeyError: 'url'

Answers


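Since you did not post your spider's code, I can only guess. My guess is that you are not setting a required field on your item. The pipeline needs the field named in ELASTICSEARCH_UNIQ_KEY to be present on every item, and its value has to be unique. The simplest option is probably to use the url: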

# somewhere deep in your callback, 
# where you create and yield your item 
... 
myitem['url'] = response.url 
return myitem 

and make sure that settings.py contains:

ELASTICSEARCH_UNIQ_KEY = 'url' 
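
To see why a missing field produces the KeyError in the traceback, here is a minimal sketch of the lookup shown on line 52 of the pipeline. It makes a few assumptions: a plain dict stands in for the Scrapy item, `make_local_id` is a hypothetical helper name that is not part of scrapyelasticsearch, and the `.encode()` call is added only so the sketch also runs on Python 3.

```python
import hashlib


def make_local_id(item, uniq_key):
    """Hash the unique-key field, mirroring the expression in the traceback."""
    # item[uniq_key] raises KeyError when the spider never set that field.
    return hashlib.sha1(item[uniq_key].encode("utf-8")).hexdigest()


# Stand-ins for scraped items (hypothetical data).
item_without_url = {"title": "Example"}
item_with_url = {"title": "Example", "url": "http://example.com/page"}

try:
    make_local_id(item_without_url, "url")
    missing_field = None
except KeyError as exc:
    missing_field = exc.args[0]  # name of the field that was not set

local_id = make_local_id(item_with_url, "url")
```

Once the spider sets `url` on every item, the lookup succeeds and each distinct URL yields a distinct SHA-1 hex digest.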

Thanx a Lot Lawrence..You're awesome .. – Javed


I just commented out this setting in my settings.py file (according to the official documentation, the setting is optional):

#ELASTICSEARCH_UNIQ_KEY = 'url' # Custom unique key
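
The effect of leaving the setting out can be sketched as follows. This is an assumption about the behaviour, not the pipeline's actual code: `document_id` is a hypothetical helper, a plain dict stands in for the Scrapy settings object, and returning `None` is meant to represent "no custom id is computed".

```python
import hashlib


def document_id(item, settings):
    """Return a hashed id only when a unique key is configured (assumed behaviour)."""
    uniq_key = settings.get("ELASTICSEARCH_UNIQ_KEY")  # None when commented out
    if uniq_key is None:
        # Nothing is looked up on the item, so a missing 'url' cannot fail here.
        return None
    return hashlib.sha1(item[uniq_key].encode("utf-8")).hexdigest()


item = {"title": "Example"}  # no 'url' field, as in the question

id_without_setting = document_id(item, {})

try:
    document_id(item, {"ELASTICSEARCH_UNIQ_KEY": "url"})
    raised = False
except KeyError:
    raised = True
```

The trade-off, under the same assumption, is that without a unique key nothing ties a re-crawled page to the document indexed earlier, so setting `url` on the item is the safer fix if duplicates matter.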