Python Elasticsearch: using the response from search_exists

I want to take a list of URLs from a text file and check whether each one is already stored in Elasticsearch. Here is the code:
import fileinput
import sys
import urllib2
import os
from urlparse import urlparse
from elasticsearch import Elasticsearch

es = Elasticsearch()

# rewrite the file in place, keeping only lines longer than 4 characters
for line_number, line in enumerate(fileinput.input('bangersandmash_items.csv', inplace=1)):
    if len(line) > 4:
        sys.stdout.write(line)

# open file to load URLs
with open('bangersandmash_items.csv') as urls:
    for line in urls:
        # strip out http:// as this seems to cause elasticsearch to return no results
        url = line.rstrip()
        prefix = 'http://'
        if url.startswith(prefix):
            url = url[len(prefix):]
        # query elasticsearch to see if url already exists in library's 'link' field
        response = es.search_exists(index="websearch", doc_type="site", body={"query": {"match_phrase": {"link": url}}}, ignore=[400, 404])
        print url
        print response
        # Is url in library?
        if response == "{u'exists': true}":
            print url
            print "bingo!"
        else:
            print url
            print "nuthin."
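It prints the URL in the form produced by lines 19-22 (the block that strips the http:// prefix), but it doesn't seem to handle the error codes. Lines 25 and 26 print the URL and the Elasticsearch response. Lines 28-33 (the if/else check) don't seem to handle that information correctly. Any ideas what I'm doing wrong here?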
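
One thing that may be worth checking: the if/else compares the response to the string "{u'exists': true}", but what gets printed looks like a Python dict, and a dict never compares equal to a string. Below is a minimal sketch of testing the value instead of its printed form. The helper name url_exists is hypothetical, and the response shapes it handles (a dict with an 'exists' key, a plain bool, or an error dict returned because of ignore=[400, 404]) are assumptions, not confirmed client behaviour. It does not call Elasticsearch itself.

```python
def url_exists(response):
    """Interpret a search_exists-style response as True/False.

    Hypothetical helper: the shapes handled here are assumptions
    about what the client hands back, not documented behaviour.
    """
    # Assumption: some client versions may return a plain bool.
    if isinstance(response, bool):
        return response
    # Assumption: otherwise a dict such as {u'exists': True}.
    # A dict without an 'exists' key (for example an error body passed
    # through by ignore=[400, 404]) is treated as "not found".
    if isinstance(response, dict):
        return bool(response.get('exists', False))
    # Anything else (None, a string, ...) is treated as "not found".
    return False


# A dict is never equal to a string, even if it prints similarly,
# so the original comparison always falls through to the else branch.
looks_equal = ({u'exists': True} == "{u'exists': true}")
```

In the loop, the check would then read `if url_exists(response):` in place of the string comparison.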