2015-11-06 172 views
8

I am trying to read a JSON file using pandas, but I get an error while reading it:

import pandas as pd 
df = pd.read_json('https://data.gov.in/node/305681/datastore/export/json') 

I get a ValueError:

ValueError: arrays must all be same length 

Some other JSON pages produce this error instead:

ValueError: Mixing dicts with non-Series may lead to ambiguous ordering. 

How can I read the values anyway? I don't care about the validity of the data.

Answers

7

Looking at the JSON, it is valid, but it is nested, with separate data and fields keys:

import json 
import requests 

In [11]: d = json.loads(requests.get('https://data.gov.in/node/305681/datastore/export/json').text) 

In [12]: list(d.keys()) 
Out[12]: ['data', 'fields'] 
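One plausible reading of the first error, sketched offline: when the two top-level keys are treated as columns, their lists have different lengths. The payload below is a made-up miniature of the real export (an assumption, not the actual data), and it builds the frame with pd.DataFrame directly rather than calling read_json.

```python
import pandas as pd

# Hypothetical payload with the same two top-level keys as the real export.
payload = {
    "fields": [{"label": "S. No."}, {"label": "States/UTs"}],
    "data": [["1", "Andhra Pradesh"], ["2", "Arunachal Pradesh"], ["3", "Assam"]],
}

error = None
try:
    # Each top-level key becomes a column: 2 field entries vs 3 data rows.
    pd.DataFrame(payload)
except ValueError as exc:
    error = exc

# The exact message wording depends on the pandas version.
print(type(error).__name__, error)
```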

You want data as the content and the labels in fields as the column names:

In [13]: pd.DataFrame(d["data"], columns=[x["label"] for x in d["fields"]]) 
Out[13]: 
    S. No.     States/UTs 2008-09 2009-10 2010-11 2011-12 2012-13 
0  1    Andhra Pradesh 183446.36 193958.45 201277.09 212103.27 222973.83 
1  2   Arunachal Pradesh  360.5  380.15  407.42  419  438.69 
2  3      Assam 4658.93 4671.22 4707.31  4705 4709.58 
3  4      Bihar 10740.43 11001.77 7446.08  7552 8371.86 
4  5     Chhattisgarh 9737.92 10520.01 12454.34 12984.44 13704.06 
5  6       Goa  148.61  148  149  149.45  457.87 
6  7      Gujarat 12675.35 12761.98 13269.23 14269.19 14558.39 
7  8      Haryana 38149.81 38453.06 39644.17 41141.91 42342.66 
8  9    Himachal Pradesh  977.3 1000.26 1020.62 1049.66 1069.39 
9  10   Jammu and Kashmir 7208.26 7242.01 7725.19  6519.8 6715.41 
10  11     Jharkhand 3994.77 3924.73 4153.16 4313.22 4238.95 
11  12     Karnataka 23687.61 29094.3 30674.18 34698.77 36773.33 
12  13      Kerala 15094.54 16329.52 16856.02 17048.89 22375.28 
13  14    Madhya Pradesh  6712.6 7075.48 7577.23 7971.53 8710.78 
14  15     Maharashtra 35502.28 38640.12 42245.1 43860.99 45661.07 
15  16      Manipur 1105.25  1119 1137.05 1149.17 1162.19 
16  17     Meghalaya  994.52  999.47 1010.77 1021.14 1028.18 
17  18      Mizoram  411.14  370.92  387.32  349.33  352.02 
18  19      Nagaland  831.92  833.5  802.03  703.65  617.98 
19  20      Odisha 19940.15 23193.01 23570.78 23006.87 23229.84 
20  21      Punjab 36789.7 32828.13 35449.01  36030 37911.01 
21  22     Rajasthan 6449.17 6713.38 6696.92 9605.43 10334.9 
22  23      Sikkim  136.51  136.07  139.83  146.24  146 
23  24     Tamil Nadu 88097.59 108475.73 115137.14 118518.45 119333.55 
24  25      Tripura 1388.41 1442.39 1569.45  1650 1565.17 
25  26    Uttar Pradesh 10139.8 10596.17 10990.72 16075.42 17073.67 
26  27     Uttarakhand 1961.81 2535.77 2613.81 2711.96 3079.14 
27  28     West Bengal 33055.7 36977.96 39939.32 43432.71 47114.91 
28  29 Andaman and Nicobar Islands  617.58  657.44  671.78  780  741.32 
29  30     Chandigarh  272.88  248.53  180.06  180.56  170.27 
30  31  Dadra and Nagar Haveli  70.66  70.71  70.28   73   73 
31  32    Daman and Diu  18.83  18.9  18.81  19.67   20 
32  33      Delhi  1.17  1.17  1.17  1.23   NA 
33  34     Lakshadweep  134.64  138.22  137.98  139.86  139.99 
34  35     Puducherry  111.69  112.84  113.53  116  112.89 
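The same construction can be tried without network access. This is a minimal sketch on a made-up payload that mirrors the data/fields layout; it also assumes the values arrive as strings (the "NA" entry for Delhi suggests so), in which case pd.to_numeric with errors="coerce" turns them into floats and NaN.

```python
import pandas as pd

# Hypothetical payload mirroring the {"data": ..., "fields": ...} layout.
payload = {
    "fields": [{"label": "S. No."}, {"label": "States/UTs"}, {"label": "2012-13"}],
    "data": [["1", "Andhra Pradesh", "222973.83"], ["33", "Delhi", "NA"]],
}

# Rows come from "data"; column names come from each field's "label".
frame = pd.DataFrame(payload["data"], columns=[f["label"] for f in payload["fields"]])

# Assumption: numeric columns are strings, so coerce them ("NA" becomes NaN).
frame["2012-13"] = pd.to_numeric(frame["2012-13"], errors="coerce")

print(frame.dtypes)
```

Whether the coercion step is needed depends on how the source encodes its numbers, so it is worth checking frame.dtypes first.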

For more complex JSON-to-DataFrame extraction, see also json_normalize.
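As a rough illustration of what json_normalize does, here is a sketch on invented records (the field names are hypothetical). It uses pd.json_normalize, the top-level name available in pandas 1.0 and later; nested dict keys are flattened into dotted column names.

```python
import pandas as pd

# Hypothetical records, each with a nested "owner" object.
records = [
    {"id": 1, "owner": {"login": "alice", "type": "User"}},
    {"id": 2, "owner": {"login": "bob", "type": "Organization"}},
]

# Nested keys become columns such as "owner.login" and "owner.type".
flat = pd.json_normalize(records)

print(flat)
```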

1

The following lists both the keys and the values, as key-value pairs:

import json
import pandas as pd
import requests

# Fetch the repository metadata and parse the response body into a dict
repo = json.loads(requests.get('https://api.github.com/repos/akkhil2012/MachineLearning').text)

# One row per top-level key, with the values in a single column
data = pd.DataFrame.from_dict(repo, orient='index')

print(data)
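To see the resulting shape without calling the API, here is a small sketch on a made-up dict (the keys are illustrative, not the real response). With orient='index' and scalar values, each key becomes a row label and the values land in a single column named 0.

```python
import pandas as pd

# Hypothetical flat dict standing in for a parsed JSON response.
repo = {"id": 1, "name": "MachineLearning", "fork": False}

# Keys become the index; values go into one column labeled 0.
table = pd.DataFrame.from_dict(repo, orient="index")

print(table)
```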
1

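For this case, we can build the DataFrame directly from the "data" key:

import pandas as pd
# `data` here is the already-parsed JSON dict (e.g. the result of json.loads)
df = pd.DataFrame(data["data"])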