2017-04-24 83 views

Importing dictionary-like data into pandas

I have some data files written in a dictionary-like format:

{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"} 
{"score": [0.9997120499610901, 0.00028794570243917406], "key": "AmG8StB8hM2k", "label": "0"} 
{"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"} 
{"score": [0.9999467134475708, 5.334055458661169e-05], "key": "AmGdF7cY4X22", "label": "0"} 

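What I want to do is import them into pandas, with 'key', 'label' and 'score' as columns, and the two numeric values in 'score' must go into separate columns. I have tried importing the file as a dictionary, but I get: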

ValueError: too many values to unpack 

Any suggestions on how to solve this?


This error occurs because your file probably contains some entries that do not conform to the dictionary format.

Answers

0

I think you need the parameter lines=True in read_json, which reads the file as one JSON object per line:

df = pd.read_json('file.json', lines=True) 
print (df) 
      key label           score 
0 Am2mVTMbhd0y  0 [0.999580323696136, 0.00041968212462900004] 
1 AmG8StB8hM2k  0 [0.9997120499610901, 0.00028794570243900004] 
2 Alt137zv2nY6  0  [0.8841496109962461, 0.11585044860839801] 
3 AmGdF7cY4X22  0 [0.99994671344757, 5.3340554586611695e-05] 

print (type(df['score'].iat[0])) 
<class 'list'> 
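
If you want to try this without a file on disk, here is a minimal sketch that feeds two of the sample lines through io.StringIO. The names raw and df_demo are made up for illustration, and the column order may differ between pandas versions.

```python
import io

import pandas as pd

# Two of the sample records, one JSON object per line (hypothetical in-memory stand-in for the file)
raw = (
    '{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"}\n'
    '{"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"}\n'
)

# lines=True tells read_json to parse each line as a separate JSON object
df_demo = pd.read_json(io.StringIO(raw), lines=True)

# Each cell in 'score' is still a Python list at this point
print(type(df_demo["score"].iat[0]))
```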

To convert the lists to columns, use the DataFrame constructor with concat:

# drop(columns=...) replaces the positional axis argument, which newer pandas versions reject
df = pd.concat([df.drop(columns='score'), 
       pd.DataFrame(df['score'].values.tolist()).add_prefix('score')], axis=1) 
print (df) 
      key label score0 score1 
0 Am2mVTMbhd0y  0 0.999580 0.000420 
1 AmG8StB8hM2k  0 0.999712 0.000288 
2 Alt137zv2nY6  0 0.884150 0.115850 
3 AmGdF7cY4X22  0 0.999947 0.000053 
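
As an alternative, here is a rough sketch that parses each line with the standard json module and unpacks the two scores while building the rows. It assumes every record has exactly two values in 'score'. The names lines, records and df_alt are hypothetical, and in practice lines would come from iterating over the open file.

```python
import json

import pandas as pd

# Hypothetical stand-in for the lines of the data file
lines = [
    '{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"}',
    '{"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"}',
]

records = []
for line in lines:
    obj = json.loads(line)          # each line is a standalone JSON object
    score0, score1 = obj["score"]   # assumes exactly two scores per record
    records.append(
        {"key": obj["key"], "label": obj["label"], "score0": score0, "score1": score1}
    )

df_alt = pd.DataFrame(records, columns=["key", "label", "score0", "score1"])
print(df_alt)
```

Unlike read_json, this keeps 'label' as the original string, since no dtype inference is applied.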

Perfect! Thank you!

0
import pandas as pd 

#add your data in a list 
data = [{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"}, 
{"score": [0.9997120499610901, 0.00028794570243917406], "key": "AmG8StB8hM2k", "label": "0"}, 
{"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"}, 
{"score": [0.9999467134475708, 5.334055458661169e-05], "key": "AmGdF7cY4X22", "label": "0"}] 
#create dataframe 
df = pd.DataFrame(data)
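
Built this way, the 'score' column still holds lists, so the question's requirement of separate columns is not met yet. One possible way to finish, sketched on a shortened copy of the data (the names scores and result are made up here):

```python
import pandas as pd

# Shortened copy of the records above
data = [
    {"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"},
    {"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"},
]
df = pd.DataFrame(data)

# Expand each list into its own columns (0, 1), then prefix them to get score0, score1
scores = pd.DataFrame(df["score"].tolist(), index=df.index).add_prefix("score")

# Replace the original list column with the expanded columns
result = df.drop(columns="score").join(scores)
print(result)
```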