2015-04-06 26 views
6

Partial update in Elasticsearch with Python

I have Elasticsearch documents in the following format. I need to partially update the "x" field and add a Python dictionary to it.

{
    "_index": "gdata34",
    "_type": "gdat",
    "_id": "328091-72341-118",
    "_version": 1,
    "_score": 1,
    "_source": {
        "d": {
            "Thursday": {
                "s": ""
            },
            "Email": {
                "s": ""
            },
            "Country": {
                "s": "US"
            }
        },
        "x": {
            "Geo": {
                "s": "45.335428,-118.057133",
                "g": [
                    -118.057133,
                    45.335428
                ]
            }
        }
    }
}

I tried the following code to update it:

from elasticsearch import Elasticsearch, exceptions 
import pprint 


elasticsearch = Elasticsearch() 
doc = elasticsearch.get(index='gdata34', doc_type='gdat', id='328091-72341-7') 

elasticsearch.update(index='gdata34', doc_type='gdat', id='328091-72341-7',
                     body={"script": "ctx._source.x += y",
                           "params": {"y": "z"}})
elasticsearch.indices.refresh(index='gdata34') 
new_doc = elasticsearch.get(index='gdata34', doc_type='gdat', id='328091-72341-7') 

I get this error:

elasticsearch.exceptions.RequestError: TransportError(400, u'ElasticsearchIllegalArgumentException[failed to execute script]; nested: ScriptException[dynamic scripting for [groovy] disabled]; ') 

What is the correct way to do a partial update in Elasticsearch using Python?

What version of ES are you using? – 2015-04-06 10:42:05

@LukasGraf 1.4.4 – Anish 2015-04-06 10:43:59

Answers

9

For future reference, the following approach to a partial update worked.

elasticsearch.update(index='gdata34', doc_type='gdat', id='328091-72341-7',
                     body={'doc': {'x': {'y': 'z'}}})
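
To illustrate what the `doc` form does: per the Elasticsearch update documentation, a partial document is merged recursively into the existing `_source` (objects are merged key by key, scalar values and arrays are replaced). The snippet below is only a rough client-side model of that merge, not Elasticsearch code; the helper name `merge_partial` is made up for illustration.

```python
import copy


def merge_partial(source, partial):
    """Return a copy of `source` with `partial` merged in recursively.

    Hypothetical helper that only models the server-side merge:
    nested dicts are merged key by key, everything else is replaced.
    """
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


source = {
    "d": {"Country": {"s": "US"}},
    "x": {"Geo": {"s": "45.335428,-118.057133",
                  "g": [-118.057133, 45.335428]}},
}
updated = merge_partial(source, {"x": {"y": "z"}})
print(sorted(updated["x"]))  # ['Geo', 'y']
```

Under this model the existing "Geo" entry survives and "y" is added next to it, which is the behaviour the question asks for.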
1

ElasticSearch docs on scripting

We recommend running Elasticsearch behind an application or proxy, which protects Elasticsearch from the outside world. If users are allowed to run dynamic scripts (even in a search request), then they have the same access to your box as the user that Elasticsearch is running as. For this reason dynamic scripting is allowed only for sandboxed languages by default.

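Recent ES versions had a vulnerability in the Groovy scripting engine that allowed scripts to escape the sandbox and execute shell commands as the user running the Elasticsearch Java VM. That is why the Groovy sandbox is disabled by default in recent versions, and with it the execution of Groovy scripts passed in the request body or stored in the .scripts index. With this default configuration, the only way to execute Groovy scripts is to place them in the config/scripts/ directory on your node.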

So you have two options:

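  • If your ES instance is not directly accessible and is secured behind a proxy, you can turn the Groovy sandbox back on by setting script.groovy.sandbox.enabled: true in config/elasticsearch.yml on your node(s). If your ES instance is accessible via your [...]
  • You can prepare your script and place it on the file system in the config/scripts directory of your node(s), then call it by name. For details, see Running Groovy Scripts without Dynamic Scripting.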
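
From Python, the second option might look roughly like this. It is a sketch under assumptions: the script file name `add_to_x.groovy` and its contents are hypothetical, and it relies on ES 1.x resolving a `script` value that matches a file in config/scripts to that file. The function takes the client as a parameter, so it can be used with an `Elasticsearch()` instance like the one in the question.

```python
# Hypothetical file on each node: config/scripts/add_to_x.groovy
# containing, for example:
#     ctx._source.x.putAll(y)


def update_with_file_script(client, index, doc_type, doc_id, values):
    """Run the pre-installed script `add_to_x` against one document.

    `client` is expected to be an elasticsearch.Elasticsearch instance;
    `values` is the dict handed to the script as parameter `y`.
    """
    body = {
        "script": "add_to_x",  # file name without the .groovy extension
        "params": {"y": values},
    }
    return client.update(index=index, doc_type=doc_type, id=doc_id, body=body)
```

It would be called as `update_with_file_script(elasticsearch, 'gdata34', 'gdat', '328091-72341-7', {'y': 'z'})`.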

Can we use the following instead? elasticsearch.update(index='gdata34', doc_type='gdat', id='328091-72341-7', body={'doc': {'x': {'y': 'z'}}}) – Anish 2015-04-06 11:02:21

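Yes, but be sure to read the [documentation](http://www.elastic.co/guide/en/elasticsearch/reference/1.4/docs-update.html) - unless you specify '"detect_noop": true', this will always cause the document to be updated, even if the merge process detects no changes. – 2015-04-06 11:10:14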

At least in ElasticSearch 2.3 (currently the latest version), "detect_noop" is enabled by default. – 2016-06-12 20:14:05
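
A sketch of the "detect_noop" variant discussed in the comments above, assuming the same index, type and id as in the question. The flag sits next to `doc` in the request body; the wrapper `partial_update` is a hypothetical helper, not part of the client library.

```python
def partial_update(client, index, doc_type, doc_id, partial, detect_noop=True):
    """Send a partial-document update, optionally skipping no-op writes.

    `client` is expected to be an elasticsearch.Elasticsearch instance.
    """
    body = {
        "doc": partial,              # merged into the existing _source
        "detect_noop": detect_noop,  # skip the write if nothing would change
    }
    return client.update(index=index, doc_type=doc_type, id=doc_id, body=body)
```

For example: `partial_update(elasticsearch, 'gdata34', 'gdat', '328091-72341-7', {'x': {'y': 'z'}})`. Setting the flag explicitly keeps the behaviour the same on versions where it is not the default.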
