elasticsearch Python bulk API (elasticsearch-py)

I am confused by the elasticsearch-py bulk API. @Diolor's solution works (https://stackoverflow.com/questions/20288770/how-to-use-bulk-api-to-store-the-keywords-in-es-by-using-python), but I would like to use plain es.bulk().

My code:

from elasticsearch import Elasticsearch 
es = Elasticsearch() 
doc = '''\n {"host":"logsqa","path":"/logs","message":"test test","@timestamp":"2014-10-02T10:11:25.980256","tags":["multiline","mydate_0.005"]} \n''' 
result = es.bulk(index="logstash-test", doc_type="test", body=doc)

The error is:

No handlers could be found for logger "elasticsearch" 
Traceback (most recent call last): 
    File "./log-parser-perf.py", line 55, in <module> 
    insertToES() 
    File "./log-parser-perf.py", line 46, in insertToES 
    res = es.bulk(index="logstash-test", doc_type="test", body=doc) 
    File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.0.0-py2.7.egg/elasticsearch/client/utils.py", line 70, in _wrapped 
    return func(*args, params=params, **kwargs) 
    File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.0.0-py2.7.egg/elasticsearch/client/__init__.py", line 570, in bulk 
    params=params, body=self._bulk_body(body)) 
    File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.0.0-py2.7.egg/elasticsearch/transport.py", line 274, in perform_request 
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore) 
    File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.0.0-py2.7.egg/elasticsearch/connection/http_urllib3.py", line 57, in perform_request 
    self._raise_error(response.status, raw_data) 
    File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.0.0-py2.7.egg/elasticsearch/connection/base.py", line 83, in _raise_error 
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info) 
elasticsearch.exceptions.TransportError: TransportError(500, u'ActionRequestValidationException[Validation Failed: 1: no requests added;]')

The generated URL for the POST call is:

/logstash-test/test/_bulk

and the POST body is:

{"host":"logsqa","path":"/logs","message":"test test","@timestamp":"2014-10-02T10:11:25.980256","tags":["multiline","mydate_0.005"]}

So I checked by hand with curl. This curl call does not work either:

> curl -XPUT http://localhost:9200/logstash-test/test2/_bulk -d 
> '{"host":"logsqa","path":"/logs","message":"test 
> test","@timestamp":"2014-10-02T10:11:25.980256","tags":["multiline","mydate_0.005"]} 
> ' 
> 
> {"error":"ActionRequestValidationException[Validation Failed: 1: no requests added;]","status":500}

So the error itself is partly justified, but I did expect elasticsearch.bulk() to handle the input arguments properly.

The Python function is documented as:

bulk(*args, **kwargs) 
    :arg body: The operation definition and data (action-data pairs), as 
     either a newline separated string, or a sequence of dicts to 
     serialize (one per row). 
    :arg index: Default index for items which don't provide one 
    :arg doc_type: Default document type for items which don't provide one 
    :arg consistency: Explicit write consistency setting for the operation 
    :arg refresh: Refresh the index after performing the operation 
    :arg routing: Specific routing value 
    :arg replication: Explicitly set the replication type (default: sync) 
    :arg timeout: Explicit operation timeout

Source

2014-10-02 sirkubax

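I would suggest you use `helpers.bulk()` unless you want to do something more complex. You can read the source code of helpers.bulk [here](https://github.com/elasticsearch/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py) and implement your own as you wish. helpers.bulk wraps helpers.streaming_bulk, which in turn wraps es.bulk. – Diolor 2014-10-06 18:09:56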

From @HonzaKral on GitHub:

https://github.com/elasticsearch/elasticsearch-py/issues/135

Hi sirkubax,

The bulk API (like all the others) follows the bulk API format of Elasticsearch itself very closely, so the body would have to be:

doc = '''{"index": {}}\n{"host":"logsqa","path":"/logs","message":"test test","@timestamp":"2014-10-02T10:11:25.980256","tags":["multiline","mydate_0.005"]}\n'''

Alternatively, it could be a list of those two dicts.
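To make the two accepted forms concrete, here is a minimal sketch that builds the same action/source pair both as a list of dicts and as a newline-separated string. It only constructs the body and does not contact a cluster; the commented-out es.bulk() call reuses the index and doc_type values from the question and assumes a client version that still accepts doc_type.

```python
import json

# Action line: an empty "index" action, so the defaults passed to
# es.bulk() (index, doc_type) would apply to this item.
action = {"index": {}}

# Source line: the document from the question.
source = {
    "host": "logsqa",
    "path": "/logs",
    "message": "test test",
    "@timestamp": "2014-10-02T10:11:25.980256",
    "tags": ["multiline", "mydate_0.005"],
}

# Form 1: a sequence of dicts, one per row.
body_as_list = [action, source]

# Form 2: the same rows as one JSON object per line,
# each line terminated by a newline.
body_as_string = "".join(json.dumps(row) + "\n" for row in body_as_list)

# Either form could then be passed as the body, for example:
# es.bulk(index="logstash-test", doc_type="test", body=body_as_list)
```

The body in the question contained only the source line, so there was no action line for Elasticsearch to act on.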

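This is a complicated and clumsy format to work with from Python, which is why I tried to create a more convenient way to work with bulk in elasticsearch.helpers.bulk (0). It just accepts an iterator of documents, extracts any optional metadata from them (like _id, _type, etc.), and constructs (and executes) the bulk request for you. For more information on the accepted formats, see the docs for streaming_bulk, just above it, which is a helper that processes a stream iteratively (one document at a time from the user's point of view, batched in chunks in the background).

Hope this helps.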

0 - http://elasticsearch-py.readthedocs.org/en/master/helpers.html#elasticsearch.helpers.bulk
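A minimal sketch of the helper-based approach follows. The function names generate_actions and index_documents are made up for illustration, the default index name is taken from the question, and the use of the enumeration index as _id is an assumption; only helpers.bulk(client, actions) itself comes from the library.

```python
def generate_actions(documents, index_name):
    """Yield one action dict per document, with metadata next to the source."""
    for position, document in enumerate(documents):
        yield {
            "_index": index_name,
            # Optional metadata; leave it out to let Elasticsearch assign ids.
            "_id": str(position),
            "_source": document,
        }


def index_documents(client, documents, index_name="logstash-test"):
    """Send the documents through elasticsearch.helpers.bulk."""
    # Imported here so generate_actions can be used without the client installed.
    from elasticsearch import helpers

    # helpers.bulk builds and executes the bulk requests and returns a
    # (number_of_successes, errors) tuple.
    return helpers.bulk(client, generate_actions(documents, index_name))
```

With a running cluster, this could be called as index_documents(Elasticsearch(), [{"host": "logsqa", "message": "test test"}]).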

Source

2014-10-03 05:41:37 sirkubax

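Could you perhaps add a working example? The exact syntax of the bulk query is still a bit unclear to me. – egpbos 2015-05-16 13:09:52

In case someone is trying to use the bulk API and is wondering what the format should be, here is what worked for me: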

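import json

# client and index_name are assumed to be defined already;
# <some_id>, <doc_type>, and <value> are placeholders to fill in.
doc = [
    # Action line: what to do and where.
    {
        'index': {
            '_index': index_name,
            '_id': <some_id>,
            '_type': <doc_type>
        }
    },
    # Source line: the document itself.
    {
        'field_1': <value>,
        'field_2': <value>
    }
]

# The bulk body is one JSON object per line, each terminated by a newline.
docs_as_string = json.dumps(doc[0]) + '\n' + json.dumps(doc[1]) + '\n'
client.bulk(body=docs_as_string)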
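The same pattern extends to more than one document by repeating the action/source pair. The sketch below is one possible way to do that; build_bulk_body is a hypothetical helper, not part of elasticsearch-py, and _type is left out on the assumption of a recent Elasticsearch version without mapping types (add it back to the action dict for the versions discussed above).

```python
import json


def build_bulk_body(documents, index_name):
    """Build a newline-delimited bulk body from (doc_id, fields) pairs."""
    lines = []
    for doc_id, fields in documents:
        # Action line: index this document under the given id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document fields.
        lines.append(json.dumps(fields))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


body = build_bulk_body(
    [("1", {"field_1": "a"}), ("2", {"field_1": "b"})],
    "my-index",
)
# The result could then be sent with client.bulk(body=body).
```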

Source

2016-05-05 20:26:08
