2015-08-21 114 views
1

Elasticsearch: create or update a document using Python

I am using elasticsearch-py for Elasticsearch operations.

I am trying to use elasticsearch.helpers.bulk to create or update multiple records.

from elasticsearch import Elasticsearch 
from elasticsearch import helpers 
es = Elasticsearch() 

data = [ 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_op_type": "create", 
     "_id": 3, 
     "doc" : {"name": "test"} 
    }, 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_op_type": "create", 
     "_id": 4, 
     "doc" : {"name": "test"} 
    }, 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_op_type": "create", 
     "_id": 5, 
     "doc" : {"name": "test"} 
    }, 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_op_type": "create", 
     "_id": 6, 
     "doc" : {"name": "test"} 
    }, 
] 


print helpers.bulk(es, data) 

Is there any way to do this?

Right now we can only set _op_type to create or update. If we use update and the record does not exist, it raises an error.

Traceback (most recent call last): 
    File "/tmp/test.py", line 37, in <module> 
    print helpers.bulk(es, data) 
    File "/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 182, in bulk 
    for ok, item in streaming_bulk(client, actions, **kwargs): 
    File "/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 155, in streaming_bulk 
    raise BulkIndexError('%i document(s) failed to index.' % len(errors), errors) 
elasticsearch.helpers.BulkIndexError: ('4 document(s) failed to index.', [{u'update': {u'status': 404, u'_type': u'external', u'_id': u'3', u'error': u'DocumentMissingException[[customer][-1] [external][3]: document missing]', u'_index': u'customer'}}, {u'update': {u'status': 404, u'_type': u'external', u'_id': u'4', u'error': u'DocumentMissingException[[customer][-1] [external][4]: document missing]', u'_index': u'customer'}}, {u'update': {u'status': 404, u'_type': u'external', u'_id': u'5', u'error': u'DocumentMissingException[[customer][-1] [external][5]: document missing]', u'_index': u'customer'}}, {u'update': {u'status': 404, u'_type': u'external', u'_id': u'6', u'error': u'DocumentMissingException[[customer][-1] [external][6]: document missing]', u'_index': u'customer'}}]) 
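The error list carried by BulkIndexError has one entry per failed action, keyed by the operation type. As a rough sketch, assuming the item layout shown in the traceback above, the helper below pulls out the ids that failed with a 404 so they could be retried with a different action. The name `missing_ids` is made up for illustration and is not part of elasticsearch-py.

```python
def missing_ids(errors):
    """Return the _id of every bulk item that failed with HTTP 404.

    `errors` is assumed to look like the list in the traceback above:
    each item is {op_type: {"status": ..., "_id": ..., ...}}.
    """
    ids = []
    for item in errors:
        # Each item has a single key naming the operation ("update", "index", ...).
        for result in item.values():
            if result.get("status") == 404:
                ids.append(result["_id"])
    return ids


# Two of the items from the traceback, trimmed to the fields used here.
sample_errors = [
    {"update": {"status": 404, "_type": "external", "_id": "3", "_index": "customer"}},
    {"update": {"status": 404, "_type": "external", "_id": "4", "_index": "customer"}},
]
print(missing_ids(sample_errors))
```

In a real run the list would come from the exception itself. Given the `raise BulkIndexError(message, errors)` line in the traceback, it should be available as `e.args[1]` inside an `except helpers.BulkIndexError as e:` block.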
+1

Have you tried using 'index' as the op_type instead of 'create' and 'update'? – Val

+0

@Val, according to the 'helpers.bulk' documentation we have to give 'index'. I also tried your solution, and it gives a 'ValidationError': elasticsearch.exceptions.TransportError: TransportError(500, u'ActionRequestValidationException[Validation Failed: 1: no requests added;]') – Nilesh

+0

That's weird... are you sure you have '"_op_type": "index"'? – Val

Answers

2

According to the _bulk endpoint documentation, you can and should use the index action for this, provided your documents always have the same identifier.

create is useful when creating a document for the first time, while update is better suited to partial and/or scripted updates.
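If a partial update that also creates missing documents is what is needed, one option not mentioned in this thread is the `doc_as_upsert` flag of the Elasticsearch update API. The following is only a sketch: it assumes the bulk helper passes every non-metadata key of an `update` action through as the request body, and `upsert_actions` is a made-up helper name. The index and type names are the ones from the question.

```python
def upsert_actions(records, index="customer", doc_type="external"):
    """Yield bulk `update` actions that create the document when it is missing."""
    for doc_id, fields in records:
        yield {
            "_op_type": "update",
            "_index": index,
            # Mapping types as used in this thread; later Elasticsearch versions removed them.
            "_type": doc_type,
            "_id": doc_id,
            "doc": fields,          # partial document to merge into an existing one
            "doc_as_upsert": True,  # on a miss, insert `doc` as a new document
        }


records = [(3, {"name": "test"}), (4, {"name": "test"})]
actions = list(upsert_actions(records))
print(actions[0])
```

The actions would then be sent the same way as in the question, with `helpers.bulk(es, upsert_actions(records))`. Unlike index, this keeps fields of an existing document that are not mentioned in `doc`.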

You can also omit _op_type entirely, in which case index is used by default.

2

I tried the solution suggested by @Val, and it works like a charm.

from elasticsearch import Elasticsearch 
from elasticsearch import helpers 
es = Elasticsearch() 

data = [ 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_id": 3, 
     "doc" : {"name": "test"} 
    }, 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_id": 4, 
     "doc" : {"name": "test"} 
    }, 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_id": 5, 
     "doc" : {"name": "test"} 
    }, 
    { 
     "_index": "customer", 
     "_type": "external", 
     "_id": 6, 
     "doc" : {"name": "test"} 
    }, 
] 


print helpers.bulk(es, data) 
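One caveat, as far as I understand the helper: with the index action, everything that is not metadata is stored as the document, so the records above would end up with a top-level `doc` field (`{"doc": {"name": "test"}}`). The `doc` wrapper is what the update action expects. If the intent is to store `{"name": "test"}` directly, the body can go under `_source` instead. A small sketch of that conversion, where `doc_to_source` is a made-up helper name:

```python
def doc_to_source(action):
    """Return a copy of a bulk action with its `doc` body moved under `_source`."""
    converted = dict(action)  # shallow copy so the caller's dict is left untouched
    if "doc" in converted and "_source" not in converted:
        converted["_source"] = converted.pop("doc")
    return converted


action = {"_index": "customer", "_type": "external", "_id": 3, "doc": {"name": "test"}}
converted = doc_to_source(action)
print(converted)
```

The converted actions could then be passed on with something like `helpers.bulk(es, (doc_to_source(a) for a in data))`.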