2017-07-07 126 views
2

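Importing JSON into pandas

I have the following JSON that comes from an API (call it my_json). The array of entities is stored under a key called entities: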

{ 
    "action" : "get", 
    "application" : "4d97323f-ac0f-11e6-b1d4-0eec2415f3df", 
    "params" : { 
     "limit" : [ "2" ] 
    }, 
    "path" : "/businesses", 
    "entities" : [ 
     { 
      "uuid" : "508d56f1-636b-11e7-9928-122e0737977d", 
      "type" : "business", 
      "size" : 730 }, 
     { 
      "uuid" : "2f3bd4dc-636b-11e7-b937-0ad881f403bf", 
      "type" : "business", 
      "size" : 730 
     } ], 
    "timestamp" : 1499469891059, 
    "duration" : 244, 
    "count" : 2 
} 

I tried to load it into a DataFrame as follows:

import pandas as pd 

pd.read_json(my_json['entities'], orient='split') 

I get the following error:

ValueError: Invalid file path or buffer object type: <type 'list'> 

I also tried the records orientation, but it still does not work.

+0

Could you please add the contents of 'my_json' to your question? – Infinity

Answers

0

The way you use my_json['entities'] makes it look like my_json is a Python dict.

According to the pandas documentation, read_json accepts "a valid JSON string or file-like". You can convert a dict into a JSON string as follows:

import json 
json_str = json.dumps(my_json["entities"]) 
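
As a minimal sketch of that conversion, assuming my_json is the already-parsed API response (trimmed here to the fields that matter): the StringIO wrapper is my own addition, since recent pandas releases warn when read_json is handed a literal JSON string, and orient="records" is the documented name for a list of row objects.

```python
import json
from io import StringIO

import pandas as pd

# Assumption: my_json is the parsed API response (a dict), trimmed for brevity.
my_json = {
    "path": "/businesses",
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
}

# json.dumps turns the Python list back into JSON text.
json_str = json.dumps(my_json["entities"])

# Wrapping the text in StringIO hands read_json a file-like object.
df = pd.read_json(StringIO(json_str), orient="records")
print(df)
```

Each entity becomes one row and each key one column.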

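As you describe it, the data under the "entities" key does not fit the layout that orient="split" expects, which is an object with "index", "columns" and "data" keys. It looks like you will need to use orient="list" (this value is not among those listed in the read_json documentation, where "records" is the name for a list of row objects):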

import pandas as pd 

my_json = """{ 
    "entities": [ 
      { 
       "type": "business", 
       "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf", 
       "size": 918 
      }, 
      { 
       "type": "business", 
       "uuid": "054a7650-b36a-11e6-a734-122e0737977d", 
       "size": 984 
      } 
     ] 
}""" 

print(pd.read_json(my_json, orient='list'))

produces:

                                            entities
0  {u'type': u'business', u'uuid': u'199bca3e-baf...
1  {u'type': u'business', u'uuid': u'054a7650-b36...

Whereas passing only the list itself gives one column per key:

import pandas as pd 

my_json = """[ 
    { 
     "type": "business", 
     "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf", 
     "size": 918 
    }, 
    { 
     "type": "business", 
     "uuid": "054a7650-b36a-11e6-a734-122e0737977d", 
     "size": 984 
    } 
]""" 

print(pd.read_json(my_json, orient='list'))

produces:

   size      type                                  uuid
0   918  business  199bca3e-baf6-11e6-861b-0ad881f403bf
1   984  business  054a7650-b36a-11e6-a734-122e0737977d
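
The contrast between the two inputs can be checked side by side. This is only a sketch: the variable names wrapped and bare are mine, orient="records" stands in for the orient value used in the examples, and StringIO is there because recent pandas releases warn about literal JSON strings.

```python
from io import StringIO

import pandas as pd

# The same two entities, once wrapped in an object and once as a bare list.
wrapped = """{"entities": [
    {"type": "business", "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf", "size": 918},
    {"type": "business", "uuid": "054a7650-b36a-11e6-a734-122e0737977d", "size": 984}
]}"""
bare = """[
    {"type": "business", "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf", "size": 918},
    {"type": "business", "uuid": "054a7650-b36a-11e6-a734-122e0737977d", "size": 984}
]"""

wrapped_df = pd.read_json(StringIO(wrapped), orient="records")
bare_df = pd.read_json(StringIO(bare), orient="records")

# The wrapped input yields a single column of dicts; the bare list yields one column per key.
print(wrapped_df.shape, list(wrapped_df.columns))
print(bare_df.shape, sorted(bare_df.columns))
```

So it is the shape of the JSON text, not the orient argument, that decides whether the entity fields end up as separate columns.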
0

danielcorin pointed me in the right direction. I ended up having to do:

pd.read_json(json.dumps(b_j['entities']) , orient='list') 

The read_json method expects a string, so I dumped the entities collection and used that.
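
A small sketch of why the dump is needed, assuming b_j is the parsed response from the question (cut down to one entity here). The exact exception text depends on the pandas version, so the example only records that the list is rejected.

```python
import json
from io import StringIO

import pandas as pd

# Assumption: b_j stands in for the parsed API response from the question.
b_j = {
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730}
    ]
}

# Handing read_json the Python list itself is rejected.
try:
    pd.read_json(b_j["entities"])
    list_rejected = False
except (ValueError, TypeError) as exc:
    list_rejected = True
    print("rejected:", exc)

# Serializing the list first gives read_json the JSON text it expects.
df = pd.read_json(StringIO(json.dumps(b_j["entities"])), orient="records")
print(df)
```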

2

If my_json is a dictionary, as I suspect, then you can skip pd.read_json and just do

pd.DataFrame(my_json['entities']) 

   size      type                                  uuid
0   730  business  508d56f1-636b-11e7-9928-122e0737977d
1   730  business  2f3bd4dc-636b-11e7-b937-0ad881f403bf
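
A self-contained sketch of this approach, again assuming my_json is already a dict. The columns argument is an optional extra I added to show how to pick and order fields; column order in the plain call may differ between pandas versions.

```python
import pandas as pd

# Assumption: my_json is the parsed API response (a dict), trimmed for brevity.
my_json = {
    "count": 2,
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
}

# A list of dicts maps directly onto rows; no JSON round trip is needed.
df = pd.DataFrame(my_json["entities"])

# Optionally select and order the columns explicitly.
subset = pd.DataFrame(my_json["entities"], columns=["uuid", "size"])

print(df)
print(subset)
```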