2017-05-25 160 views
-2
import json 
import csv 
from watson_developer_cloud import NaturalLanguageUnderstandingV1 
import watson_developer_cloud.natural_language_understanding.features.v1 as \ 
    features 


natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2017-02-27', 
    username='b6dd1781-02e4-4dca-a706-05597d574221', 
    password='c3ked6Ttmmc1') 

response = natural_language_understanding.analyze(
    text='Bruce Banner is the Hulk and Bruce Wayne is BATMAN! ' 
     'Superman fears not Banner, but Wayne.', 
    features=[features.Entities()]) 

response1 = natural_language_understanding.analyze(
    text='Bruce Banner is the Hulk and Bruce Wayne is BATMAN! ' 
     'Superman fears not Banner, but Wayne.', 
    features=[features.Keywords()]) 

#print response.items()[0][1][1] 
make= json.dumps(response, indent=2) 
make1= json.dumps(response1, indent=2) 
print make 
print make1 

x = json.loads(make) 

f = csv.writer(open("Entities.csv", "wb+")) 


f.writerow(["relevance", "text", "type", "count"]) 

for x1 in x: 
    f.writerow([x1['relevance'], 
       x1['text'], 
       x1['type'], 
       x1['count']]) 

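The make variable above holds JSON that must be converted to CSV, and when I try to do so I get a TypeError: string indices must be integers. The actual problem is that I cannot iterate over the entities and get the key-value pairs. Can someone tell me what I can do here?

Converting JSON to CSV (tag: json)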

{ 
    "entities": [ 
    { 
     "relevance": 0.931351, 
     "text": "Bruce Banner", 
     "type": "Person", 
     "count": 3 
    }, 
    { 
     "relevance": 0.288696, 
     "text": "Wayne", 
     "type": "Person", 
     "count": 1 
    } 
    ], 
    "language": "en" 
} 
+1

Please include a short program that produces the error you describe. Please include your actual and expected program output. –

+0

You could put the data in Excel and record a macro that parses the data into a .csv file, then convert that script to Python, and so on... – DeerSpotter

Answers

0

If you dump the JSON structure and data to a file, you can use this script to write the key: value pairs to a CSV file:

# -*- coding: utf-8 -*- 
""" 
Created on Fri May 26 01:24:44 2017 

@author: ITZIK CHAIMOV 
""" 
import csv 


labels = []  # keys collected from each entity object
values = []  # matching values, in the same order

# Assumes the data was dumped to a JSON file laid out as in the question's example
with open('dataFile.json', 'r') as fin:
    buffer = fin.readline()  # skip the opening '{'
    buffer = fin.readline()  # skip the '"entities": [' line
    # Stop at the '"language": "en"' line or at end of file
    while buffer != '' and '"en"' not in buffer:
        if '{' in buffer:
            buffer = fin.readline()
            # Every line inside an object is one '"key": value' pair
            while buffer != '' and '}' not in buffer:
                labels.append(buffer.split(':')[0].strip())
                values.append(buffer.split(':')[1].strip())
                buffer = fin.readline()
        buffer = fin.readline()

n = len(labels)  # total number of collected labels
firstLabel = labels[0]
i = 0  # number of labels per entity
for lbl in labels:
    # The first label showing up again marks the start of the next entity
    if firstLabel == lbl and i != 0:
        break
    i += 1

tbl = []
tbl.append(labels[0:i])  # header row
for j in range(n // i):
    tbl.append(values[j*i:(j+1)*i])  # one row per entity

with open('testfile.csv', 'w') as fout:
    csv_write = csv.writer(fout)
    csv_write.writerows(tbl)

The CSV file as shown in Excel; the ' and " characters can be removed.
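
One way to remove those leftover characters is to clean each token before it is written. The sketch below is only an illustration: clean_token is a hypothetical helper name, not part of the script above, and the sample tokens are what the line-based split produces for the question's data (the split also keeps a trailing comma on most values).

```python
def clean_token(token):
    """Strip whitespace, a trailing comma, and surrounding double quotes."""
    return token.strip().rstrip(',').strip().strip('"')


# Tokens as the line-by-line split would collect them
raw_labels = ['"relevance"', '"text"']
raw_values = ['0.931351,', '"Bruce Banner",']

clean_labels = [clean_token(t) for t in raw_labels]
clean_values = [clean_token(t) for t in raw_values]
print(clean_labels, clean_values)
```

Applying the helper inside the two append calls would give a CSV without the extra quoting.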

-1

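In the loop, x1 holds the keys of the structure x, because iterating over a dictionary yields its keys. To access the value associated with each key you need x[x1]; otherwise you are looking up an index named 'relevance' in x1, which is itself a key of type string.

x contains the entire JSON structure. You are only interested in the list (made up of individual dictionaries) indexed by the 'entities' key. So first access just that list, then go through each key-value pair.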
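
The error can be reproduced in isolation. This is a minimal sketch that assumes Python 3 and uses a trimmed copy of the JSON from the question; the names make, x and x1 mirror the question's code, while keys and error_message are added only for illustration.

```python
import json

# Trimmed copy of the response shown in the question
make = ('{"entities": [{"relevance": 0.931351, "text": "Bruce Banner", '
        '"type": "Person", "count": 3}], "language": "en"}')
x = json.loads(make)

# Iterating over a dict yields its keys, not its values
keys = [x1 for x1 in x]

error_message = None
try:
    for x1 in x:
        x1['relevance']  # x1 is a string key here, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)

print(keys)
print(error_message)
```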

x1 = x['entities'][0] 
f.writerow([x1['relevance'], 
       x1['text'], 
       x1['type'], 
       x1['count']]) 

The second key is 'language', which returns the string 'en', not a dictionary.
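
To write every entity rather than only the first one, the indexing above can be replaced by a loop over x['entities']. The following is a minimal sketch, assuming Python 3 and using a literal copy of the question's JSON in place of the Watson response; the file name Entities.csv and the column order come from the question, and the columns variable is introduced here only for illustration.

```python
import csv

# Literal copy of the response from the question; in the real script
# this would be the parsed result of analyze()
x = {
    "entities": [
        {"relevance": 0.931351, "text": "Bruce Banner", "type": "Person", "count": 3},
        {"relevance": 0.288696, "text": "Wayne", "type": "Person", "count": 1},
    ],
    "language": "en",
}

columns = ["relevance", "text", "type", "count"]

# newline="" lets the csv module control line endings (Python 3)
with open("Entities.csv", "w", newline="") as csv_file:
    f = csv.writer(csv_file)
    f.writerow(columns)
    # Each x1 is now one entity dictionary, not a top-level key
    for x1 in x["entities"]:
        f.writerow([x1[name] for name in columns])
```

The 'language' key is never touched, so the string 'en' no longer gets indexed by mistake.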

+0

Can you show me how to do this by editing the code? I understand your logic, but I can't work out how to code it. –

+0

Take a look at my edited answer. – Antimony