2016-04-21 33 views
14

Validation Failed: 1: no requests added in Elasticsearch bulk indexing

I have a JSON file that I need to index on an Elasticsearch server.

The JSON file looks like this:

{ 
    "sku": "1", 
    "vbid": "1", 
    "created": "Sun, 05 Oct 2014 03:35:58 +0000", 
    "updated": "Sun, 06 Mar 2016 12:44:48 +0000", 
    "type": "Single", 
    "downloadable-duration": "perpetual", 
    "online-duration": "365 days", 
    "book-format": "ePub", 
    "build-status": "In Inventory", 
    "description": "On 7 August 1914, a week before the Battle of Tannenburg and two weeks before the Battle of the Marne, the French army attacked the Germans at Mulhouse in Alsace. Their objective was to recapture territory which had been lost after the Franco-Prussian War of 1870-71, which made it a matter of pride for the French. However, after initial success in capturing Mulhouse, the Germans were able to reinforce more quickly, and drove them back within three days. After forty-three years of peace, this was the first test of strength between France and Germany. In 1929 Karl Deuringer wrote the official history of the battle for the Bavarian Army, an immensely detailed work of 890 pages; First World War expert and former army officer Terence Zuber has translated this study and edited it down to more accessible length, to produce the first account in English of the first major battle of the First World War.", 
    "publication-date": "07/2014", 
    "author": "Deuringer, Karl", 
    "title": "The First Battle of the First World War: Alsace-Lorraine", 
    "sort-title": "First Battle of the First World War: Alsace-Lorraine", 
    "edition": "0", 
    "sampleable": "false", 
    "page-count": "0", 
    "print-drm-text": "This title will only allow printing of 2 consecutive pages at a time.", 
    "copy-drm-text": "This title will only allow copying of 2 consecutive pages at a time.", 
    "kind": "book", 
    "fro": "false", 
    "distributable": "true", 
    "subjects": { 
     "subject": [ 
     { 
      "-schema": "bisac", 
      "-code": "HIS027090", 
      "#text": "World War I" 
     }, 
     { 
      "-schema": "coursesmart", 
      "-code": "cs.soc_sci.hist.milit_hist", 
      "#text": "Social Sciences -> History -> Military History" 
     } 
     ] 
    }, 
    "pricelist": { 
     "publisher-list-price": "0.0", 
     "digital-list-price": "7.28" 
    }, 
    "publisher": { 
     "publisher-name": "The History Press", 
     "imprint-name": "The History Press Ireland" 
    }, 
    "aliases": { 
     "eisbn-canonical": "1", 
     "isbn-canonical": "1", 
     "print-isbn-canonical": "9780752460864", 
     "isbn13": "1", 
     "isbn10": "0750951796", 
     "additional-isbns": { 
     "isbn": [ 
      { 
      "-type": "print-isbn-10", 
      "#text": "0752460862" 
      }, 
      { 
      "-type": "print-isbn-13", 
      "#text": "97807524608" 
      } 
     ] 
     } 
    }, 
    "owner": { 
     "company": { 
     "id": "1893", 
     "name": "The History Press" 
     } 
    }, 
    "distributor": { 
     "company": { 
     "id": "3658", 
     "name": "asc" 
     } 
    } 
    } 

However, when I try to index this JSON file with the command

curl -XPOST 'http://localhost:9200/_bulk' -d @1.json 

I get this error:

{"error":{"root_cause":[{"type":"action_request_validation_exception","reason":"Validation Failed: 1: no requests added;"}],"type":"action_request_validation_exception","reason":"Validation Failed: 1: no requests added;"},"status":400} 

I don't know what mistake I'm making.

Answers

23

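Elasticsearch's bulk API uses a special syntax that is made up of JSON documents, each written on a single line. Take a look at the documentation.

The syntax is pretty simple. To index, create, or update, you need two single-line JSON documents: the first line names the action and the second line supplies the document to index, create, or update. To delete a document, only the action line is needed. For example (from the documentation):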

{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } } 
{ "field1" : "value1" } 
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } } 
{ "field1" : "value3" } 
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1"} } 
{ "doc" : {"field2" : "value2"} } 
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } } 
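
If you would rather generate that body than write it by hand, here is a minimal sketch in Python. The helper name `build_bulk_body` is made up for illustration and is not part of any Elasticsearch client; it only relies on `json.dumps` producing single-line output when no `indent` is given.

```python
import json

def build_bulk_body(operations):
    """Serialize (action, source) pairs into a newline-delimited bulk body.

    `source` is None for actions such as delete that carry no document line.
    (Hypothetical helper, not an Elasticsearch API.)
    """
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))      # action/metadata line
        if source is not None:
            lines.append(json.dumps(source))  # document line
    # Every line, including the last one, is terminated with "\n".
    return "".join(line + "\n" for line in lines)

body = build_bulk_body([
    ({"index": {"_index": "test", "_type": "type1", "_id": "1"}}, {"field1": "value1"}),
    ({"delete": {"_index": "test", "_type": "type1", "_id": "2"}}, None),
])
print(body, end="")
```

The index action yields two lines and the delete action yields one, so this body has three lines in total.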

Don't forget to end your file with a new line. Then, to call the bulk API, use the command:

curl -s -XPOST localhost:9200/_bulk --data-binary "@requests" 
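
For a pretty-printed file like the one in the question, the document first has to be collapsed onto one line and preceded by an action line. A rough sketch of that conversion follows; the function names and the `id_field` parameter are assumptions made for this example, not something the bulk API defines.

```python
import json

def to_bulk_lines(document, index, doc_type, doc_id=None):
    """Return the two bulk lines (action + document) for a single document."""
    meta = {"_index": index, "_type": doc_type}
    if doc_id is not None:
        meta["_id"] = doc_id
    action = json.dumps({"index": meta})
    source = json.dumps(document)  # no indent, so the document stays on one line
    return action + "\n" + source + "\n"  # note the trailing newline

def convert_file(src_path, dst_path, index, doc_type, id_field=None):
    """Read one pretty-printed JSON document and write it out in bulk format."""
    with open(src_path, encoding="utf-8") as src:
        document = json.load(src)
    doc_id = document.get(id_field) if id_field else None
    with open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        dst.write(to_bulk_lines(document, index, doc_type, doc_id))
```

For the question's file, a call such as `convert_file("1.json", "requests", "books", "book", id_field="sku")` would produce a two-line `requests` file that the curl command above can post. `books` and `book` are made-up index and type names.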

From the documentation:

If you’re providing text file input to curl, you must use the --data-binary flag instead of plain -d
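
To see why the flag matters: the curl manual says that `-d` strips carriage returns and newlines when it reads data from a file. The snippet below only simulates that stripping in Python (it does not call curl), and `strip_like_curl_d` is a made-up name.

```python
def strip_like_curl_d(data: bytes) -> bytes:
    """Approximate what `curl -d @file` does to file input: drop CR and LF bytes."""
    return data.replace(b"\r", b"").replace(b"\n", b"")

bulk = (
    b'{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }\n'
    b'{ "field1" : "value1" }\n'
)

sent_with_d = strip_like_curl_d(bulk)  # roughly what -d @requests would post
sent_with_binary = bulk                # --data-binary @requests posts the bytes unchanged

print(sent_with_d.count(b"\n"))       # 0
print(sent_with_binary.count(b"\n"))  # 2
```

With every newline gone, the body no longer contains the line breaks the bulk syntax depends on, which is consistent with the "no requests added" error in the question.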

+1

"Don't forget to end your file with a new line." Thanks! That saved me an hour.

+6

**Don't forget to end your file with a new line** .. I was swearing at my laptop all morning, you saved my life lol ..

+0

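Also, a silly thing (not that I would ever do this): the documentation says `--data-binary "@requests"`. The `@` must come right before your file name, and the request will also fail if you forget it.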

0

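I had a similar problem: I wanted to delete specific documents of a specific type, and thanks to the answer above I finally got my simple bash script to work!

I have a file with one document_id per line (document_id.txt), and with the bash script below I can delete the documents of a given type that have those document_ids.

This is what the file looks like: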

c476ce18803d7ed3708f6340fdfa34525b20ee90 
5131a30a6316f221fe420d2d3c0017a76643bccd 
08ebca52025ad1c81581a018febbe57b1e3ca3cd 
496ff829c736aa311e2e749cec0df49b5a37f796 
87c4101cb10d3404028f83af1ce470a58744b75c 
37f0daf7be27cf081e491dd445558719e4dedba1 

The bash script looks like this:

#!/bin/bash 

es_cluster="http://localhost:9200" 
index="some-index" 
doc_type="some-document-type" 

for doc_id in `cat document_id.txt` 
do 
    request_string="{\"delete\" : { \"_type\" : \"${doc_type}\", \"_id\" : \"${doc_id}\" } }" 
    echo -e "${request_string}\r\n\r\n" | curl -s -XPOST "${es_cluster}/${index}/${doc_type}/_bulk" --data-binary @- 
    echo 
done 

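The trick, after a lot of frustration, was to use the -e option on echo and append \n\n to its output before piping it into curl.

Then, on the curl side, I set the --data-binary option to stop it from stripping out the \n\n that the _bulk endpoint needs, followed by the @- option to make it read from standard input!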
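
As a variation on the script above (not what it does, since it sends one request per id), the delete actions can also be collected into a single bulk body. A minimal Python sketch, where `delete_actions` is a made-up helper name and the ids are taken from the sample file:

```python
import json

def delete_actions(doc_ids, doc_type):
    """Build a bulk body with one delete action line per non-empty id."""
    lines = []
    for doc_id in doc_ids:
        doc_id = doc_id.strip()  # drop the line ending read from the file
        if not doc_id:
            continue             # skip blank lines
        lines.append(json.dumps({"delete": {"_type": doc_type, "_id": doc_id}}))
    # Each action line, including the last, ends with "\n".
    return "".join(line + "\n" for line in lines)

ids = [
    "c476ce18803d7ed3708f6340fdfa34525b20ee90\n",
    "5131a30a6316f221fe420d2d3c0017a76643bccd\n",
    "\n",
]
body = delete_actions(ids, "some-document-type")
print(body, end="")
```

The printed body could then be piped into the same curl invocation the script uses, with --data-binary @- reading it from standard input.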
