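I have run into a problem using the Elasticsearch Python client. I have a file named test.json containing (valid!) JSON, and I now want to index that JSON in Elasticsearch. I tried this little tutorial to check whether I can connect to my local Elasticsearch instance, and it worked, so I believe the problem is not my connection to Elasticsearch.
When I run my small piece of code here: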
from elasticsearch import Elasticsearch
import json
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
with open('test.json') as json_data:
    es.index(index='testdata', doc_type='generated', id=1, body=json.load(json_data))
I get this exception on my command line (mapper_parsing_exception?):
Traceback (most recent call last):
File "app.py", line 13, in <module>
es.index(index='testdata', doc_type='generated', id=1, body=json.load(json_data))
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
return func(*args, params=params, **kwargs)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 300, in index
_make_path(index, doc_type, id), params=params, body=body)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 318, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
self._raise_error(response.status, raw_data)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 124, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'mapper_parsing_exception', u'failed to parse')
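Can you point me in the right direction as to what the problem might be?
Oh, and I printed json.load(json_data) and that worked perfectly, so loading the JSON from the file is not the problem.
Thanks for your help! Greez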
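A successful print only shows that the file parses; it says nothing about the shape of the parsed value. The index call sends one document per request, and a document is a single JSON object, so one plausible cause of "failed to parse" is a file whose top level is an array. The following is a minimal sketch of that check, not a confirmed diagnosis. The helper name as_documents and the inline sample string are made up for illustration.

```python
import json


def as_documents(parsed):
    """Return a list of JSON objects (dicts) from a parsed JSON value.

    Hypothetical helper: accepts either one object or an array of objects.
    """
    if isinstance(parsed, dict):
        return [parsed]
    if isinstance(parsed, list) and all(isinstance(item, dict) for item in parsed):
        return parsed
    raise ValueError("expected a JSON object or an array of objects")


# Stand-in for json.load(open('test.json')); the real file may differ.
sample = json.loads('[{"name": "Liliana Odom", "age": 31}]')
docs = as_documents(sample)

# Shows the top-level type and how many separate documents it holds.
print(type(sample).__name__, len(docs))
```

If the top-level type turns out to be list, each element would need its own index call rather than the whole list being passed as body.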
Update:
with open('test.json') as json_data:
    #d = json.load(json_data)
    print(json_data)
    es.index(index='testdata', doc_type='generated', id=1, body=json_data)
This code does not work either; I cannot even print the JSON to the command line. The error is now:
<open file 'test.json', mode 'r' at 0x7f8329340c00>
Traceback (most recent call last):
File "app.py", line 14, in <module>
es.index(index='testdata', doc_type='generated', id=1, body=json_data)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
return func(*args, params=params, **kwargs)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 300, in index
_make_path(index, doc_type, id), params=params, body=body)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 284, in perform_request
body = self.serializer.dumps(body)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/serializer.py", line 50, in dumps
raise SerializationError(data, e)
elasticsearch.exceptions.SerializationError: (<closed file 'test.json', mode 'r' at 0x7f8329340c00>, TypeError("Unable to serialize <open file 'test.json', mode 'r' at 0x7f8329340c00> (type: <type 'file'>)",))
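The first line of that output is the repr of the file handle, not the file's contents, and the TypeError in the last line says the handle itself cannot be turned into JSON. The sketch below reproduces the same situation with only the standard library. It assumes io.StringIO is an acceptable stand-in for the open file, and that the client's serializer behaves roughly like json.dumps here, which the traceback suggests but does not prove.

```python
import io
import json

# Stand-in for open('test.json'); the content is illustrative only.
handle = io.StringIO(u'{"name": "Liliana Odom"}')

try:
    json.dumps(handle)  # a file-like object is not a JSON-serializable value
    serializable = True
except TypeError:
    serializable = False

# Parsing the handle first yields a plain dict, which can be serialized.
handle.seek(0)
document = json.load(handle)

print(serializable, document)
```

Under that assumption, body needs the parsed value (the result of json.load), not the handle returned by open.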
This is the content of the test.json file (just some randomly generated JSON):
[
{
"_id": "58ee19e75ffc814d4dff17da",
"index": 0,
"guid": "45476739-80b3-49de-8f00-9923f84f56ce",
"isActive": true,
"balance": "$2,882.08",
"picture": "http://placehold.it/32x32",
"age": 31,
"eyeColor": "blue",
"name": "Liliana Odom",
"gender": "female",
"company": "PLASTO",
"email": "[email protected]",
"phone": "+1 (983) 474-3785",
"address": "121 Sedgwick Place, Farmington, Marshall Islands, 2593",
"about": "Adipisicing veniam ex nulla irure minim incididunt et irure est nostrud ex ut. Occaecat eu proident eu pariatur deserunt aliquip. Commodo ullamco incididunt consequat quis commodo irure elit quis. Aute et reprehenderit ad ipsum magna cupidatat magna minim sunt labore mollit occaecat. Dolore sint veniam deserunt excepteur.",
"registered": "2015-05-07T05:40:28 -02:00",
"latitude": -46.141522,
"longitude": -157.943368,
"tags": [
"labore",
"quis"
],
"friends": [
{
"id": 0,
"name": "Earline Bass"
}
],
"greeting": "Hello, Liliana Odom! You have 5 unread messages.",
"favoriteFruit": "apple"
}
]
Update 2:
I am now trying this:
id = 1
with open('test.json') as json_data:
    data = json.load(json_data)
    for dat in data:
        print(json.dumps(dat))
        es.index(index='testdata', doc_type='generated', id=id, body=json.dumps(dat))
        id += 1
print(json.dumps(dat)) works, but now I get an IllegalArgumentException:
Traceback (most recent call last):
File "app.py", line 15, in <module>
es.index(index='testdata', doc_type='generated', id=id, body=json.dumps(dat))
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
return func(*args, params=params, **kwargs)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 300, in index
_make_path(index, doc_type, id), params=params, body=body)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 318, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
self._raise_error(response.status, raw_data)
File "/home/elk/Documents/pythonelastic/venv/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 124, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'illegal_argument_exception', u'[Bloodstorm][127.0.0.1:9300][indices:data/write/index[p]]')
Update 3: Here is the ES log; it looks like the _id field is defined twice for this index.
[2017-04-12 17:43:07,847][DEBUG][action.index ] [Bloodstorm] failed to execute [index {[testdata][generated][AVti1SY7fn4azWzi8gyQ], source[{"guid": "45476739-80b3-49de-8f00-9923f84f56ce", "index": 0, "favoriteFruit": "apple", "latitude": -46.141522, "company": "PLASTO", "email": "[email protected]", "picture": "http://placehold.it/32x32", "tags": ["labore", "quis"], "registered": "2015-05-07T05:40:28 -02:00", "eyeColor": "blue", "phone": "+1 (983) 474-3785", "address": "121 Sedgwick Place, Farmington, Marshall Islands, 2593", "friends": [{"id": 0, "name": "Earline Bass"}], "isActive": true, "about": "Adipisicing veniam ex nulla irure minim incididunt et irure est nostrud ex ut. Occaecat eu proident eu pariatur deserunt aliquip. Commodo ullamco incididunt consequat quis commodo irure elit quis. Aute et reprehenderit ad ipsum magna cupidatat magna minim sunt labore mollit occaecat. Dolore sint veniam deserunt excepteur.", "balance": "$2,882.08", "name": "Liliana Odom", "gender": "female", "age": 31, "greeting": "Hello, Liliana Odom! You have 5 unread messages.", "longitude": -157.943368, "_id": "58ee19e75ffc814d4dff17da"}]}] on [[testdata][3]]
java.lang.IllegalArgumentException: Field [_id] is defined twice in [generated]
at org.elasticsearch.index.mapper.MapperService.checkFieldUniqueness(MapperService.java:496)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:376)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:320)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:306)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:230)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:480)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:784)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
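In Elasticsearch, _id is a metadata field, so a document whose source carries its own "_id" key can collide with it, which matches the "defined twice" message in the log. One possible workaround, sketched below under that assumption, is to take the key out of the body and pass its value as the document id instead. The helper name split_id and the shortened sample record are hypothetical; the es.index call is left as a comment because it needs a running cluster.

```python
import json


def split_id(document, id_field="_id"):
    """Return (doc_id, body), where body is a copy of document without id_field.

    Hypothetical helper; the input dict is left unmodified.
    """
    body = dict(document)
    doc_id = body.pop(id_field, None)
    return doc_id, body


# Shortened stand-in for the records loaded from test.json.
records = json.loads('[{"_id": "58ee19e75ffc814d4dff17da", "name": "Liliana Odom"}]')

for record in records:
    doc_id, body = split_id(record)
    # With a live client this would be the same call as in the question:
    # es.index(index='testdata', doc_type='generated', id=doc_id, body=body)
    print(doc_id, json.dumps(body))
```

Passing the dict directly as body, rather than a json.dumps string, leaves serialization to the client. Renaming the key (for example to a non-reserved name) would be an alternative if the original value has to stay inside the document.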
It seems that when I do: 'with open('test.json') as json_data: #d = json.load(json_data) print(json_data) es.index(index='testdata', doc_type='generated', id=1, body=json_data)' I get this new error: 'elasticsearch.exceptions.SerializationError: ((type: ))'. It seems backticks do not work for marking inline code. – PouletFreak
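You should update your question with that error so it is clearer. Could you also share the content of your 'test.json' file? – Val
Sorry, I am fairly new here ;-), I have updated my question. – PouletFreak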