2017-08-30 39 views
0

Python pandas - TypeError when parsing JSON: string indices must be integers

A record in the JSON file looks like this (note what "nutrients" looks like):

{ 
"id": 21441, 
"description": "KENTUCKY FRIED CHICKEN, Fried Chicken, EXTRA CRISPY, 
Wing, meat and skin with breading", 
"tags": ["KFC"], 
"manufacturer": "Kentucky Fried Chicken", 
"group": "Fast Foods", 
"portions": [ 
{ 
"amount": 1, 
"unit": "wing, with skin", 
"grams": 68.0 
}, 
... 
], 
"nutrients": [ 
{ 
"value": 20.8, 
"units": "g", 
"description": "Protein", 
"group": "Composition" 
}, 
{'description': 'Total lipid (fat)', 
'group': 'Composition', 
'units': 'g', 
'value': 29.2} 
... 
] 
} 

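Below is the code from the book exercise*. It does some data wrangling and assembles the nutrients for each food into a single large table: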

import pandas as pd 
import json 

db = pd.read_json("foods-2011-10-03.json") 

nutrients = [] 

for rec in db: 
    fnuts = pd.DataFrame(rec["nutrients"]) 
    fnuts["id"] = rec["id"] 
    nutrients.append(fnuts) 

However, I get the following error and cannot figure out why:


TypeError         Traceback (most recent call last) 
<ipython-input-23-ac63a09efd73> in <module>() 
     1 for rec in db: 
----> 2  fnuts = pd.DataFrame(rec["nutrients"]) 
     3  fnuts["id"] = rec["id"] 
     4  nutrients.append(fnuts) 
     5 

TypeError: string indices must be integers 

*This is an example from the book.

+0

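Your JSON is invalid (and even after correcting the quotes and removing the dots, it cannot be loaded by 'pd.read_json'). Please post data with which we can actually reproduce your problem. – Amadan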

+0

@Amadan, here is a link to the data: https://github.com/wesm/pydata-book/blob/master/ch07/foods-2011-10-03.json –

Answers

1

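for rec in db iterates over the column names, so each rec is a string such as "id", and rec["nutrients"] tries to index that string with another string. That is what raises the TypeError. To iterate over the rows instead, use: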

for id, rec in db.iterrows(): 
    fnuts = pd.DataFrame(rec["nutrients"]) 
    fnuts["id"] = rec["id"] 
    nutrients.append(fnuts) 

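This is a bit slow, though, because a Series has to be constructed for every row. itertuples is faster; but since you only care about two columns, iterating over those two Series directly is probably the fastest: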

for id, value in zip(db['id'], db['nutrients']): 
    fnuts = pd.DataFrame(value) 
    fnuts["id"] = id 
    nutrients.append(fnuts) 
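
As a small self-contained sketch of the point above: iterating over a DataFrame yields column labels, while zipping two columns yields per-row values. The tiny `db` frame below is a stand-in with made-up records (it is not the real food file), and the final `pd.concat` step is an assumption about how the pieces would be combined into the single large table the question mentions.

```python
import pandas as pd

# Stand-in for the food database; ids and values are made up for illustration.
db = pd.DataFrame({
    "id": [1, 2],
    "nutrients": [
        [{"value": 20.8, "units": "g", "description": "Protein", "group": "Composition"},
         {"value": 29.2, "units": "g", "description": "Total lipid (fat)", "group": "Composition"}],
        [{"value": 5.0, "units": "g", "description": "Protein", "group": "Composition"}],
    ],
})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = [rec for rec in db]

# Indexing a label (a plain str) with another str raises the TypeError.
try:
    labels[0]["nutrients"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Row-wise loop over the two relevant columns.
pieces = []
for food_id, value in zip(db["id"], db["nutrients"]):
    fnuts = pd.DataFrame(value)
    fnuts["id"] = food_id
    pieces.append(fnuts)

# Combine the per-food frames into one table with a fresh 0..n-1 index.
nutrients = pd.concat(pieces, ignore_index=True)

print(labels)
print(error_message)
print(nutrients)
```

The loop body is the same as in the answer; only the data source and the final concatenation are added for the sake of a runnable example.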
+0

Thanks, this works fine! Has the way this iteration works changed since the book was written, or should this be added to the book's errata? –

+0

Sorry, I don't know the history of pandas very well, and I haven't read the book. – Amadan

0

The code works fine, but the JSON has to look like this for it to work:

[{ 
"id": 21441, 
"description": "KENTUCKY FRIED CHICKEN, Fried Chicken, EXTRA CRISPY,Wing, meat and skin with breading", 
"tags": ["KFC"], 
"manufacturer": "Kentucky Fried Chicken", 
"group": "Fast Foods", 
"portions": [ 
{"amount": 1, 
"unit": "wing, with skin", 
"grams": 68.0}], 
"nutrients": [{ 
"value": 20.8, 
"units": "g", 
"description": "Protein", 
"group": "Composition" 
}, 
{"description": "Total lipid (fat)", 
"group": "Composition", 
"units": "g", 
"value": 29.2}]}] 

This is an example with only one record.
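
One way to read this answer, sketched under assumptions: if a top-level list of records like the one above is parsed with the standard `json` module instead of `pd.read_json`, the result is a plain list of dicts, and the loop from the question runs unchanged because each `rec` is a dict. The record below is trimmed to the fields the loop uses. `pd.json_normalize` (a pandas function, available as `pd.json_normalize` in pandas 1.0 and later) is shown as an alternative that skips the explicit loop; it is not what the answer proposes.

```python
import json
import pandas as pd

# One record as valid JSON: double quotes, wrapped in a top-level list.
raw = """
[{"id": 21441,
  "nutrients": [
    {"value": 20.8, "units": "g", "description": "Protein", "group": "Composition"},
    {"value": 29.2, "units": "g", "description": "Total lipid (fat)", "group": "Composition"}
  ]}]
"""

# json.loads returns a list of dicts, so iterating yields whole records.
db = json.loads(raw)

nutrients = []
for rec in db:
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)

looped = pd.concat(nutrients, ignore_index=True)

# Alternative: flatten every "nutrients" list and carry "id" along as metadata.
flattened = pd.json_normalize(db, record_path="nutrients", meta=["id"])

print(looped)
print(flattened)
```

Both tables hold the same rows; the `id` column produced by `json_normalize` has object dtype, so it may need a cast before numeric use.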

0

Amadan answered the question, but before I saw his answer I managed to solve it like this:

for i in range(len(db)): 
    rec = db.loc[i] 
    fnuts = pd.DataFrame(rec["nutrients"]) 
    fnuts["id"] = rec["id"] 
    nutrients.append(fnuts) 
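
A caveat worth sketching: `db.loc[i]` looks rows up by index label, so the loop above relies on the default 0..n-1 index. If the frame has been filtered or re-indexed, positional lookup with `db.iloc[i]` is the safer choice. The frame below is made up purely to show the difference.

```python
import pandas as pd

# Made-up frame whose index labels are not 0..n-1 (for example after filtering).
db = pd.DataFrame(
    {
        "id": [10, 20, 30],
        "nutrients": [[{"value": 1.0}], [{"value": 2.0}], [{"value": 3.0}]],
    },
    index=[5, 6, 7],
)

# Label-based lookup fails here because there is no row labelled 0.
try:
    db.loc[0]
    loc_failed = False
except KeyError:
    loc_failed = True

nutrients = []
for i in range(len(db)):
    rec = db.iloc[i]  # positional lookup, independent of the index labels
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)

result = pd.concat(nutrients, ignore_index=True)
print(result)
```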