2016-10-11 173 views

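Converting a nested array into a pandas DataFrame in Python

I am trying to convert several dictionaries contained in an array into a pandas DataFrame. The dictionaries are stored like this: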

[[{u'category': u'anti-social-behaviour',u'location': {u'latitude': u'52.309886', 
u'longitude': u'0.496902'},u'month': u'2015-01'},{u'category': u'anti-social-behaviour',u'location': {u'latitude': u'52.306209', 
u'longitude': u'0.490475'},u'month': u'2015-02'}]] 

I would like my data formatted as follows:

 Category  Latitude Longitude 
0 anti-social 524498.597 175181.644 
1 anti-social 524498.597 175181.644 
2 anti-social 524498.597 175181.644 
. ...   ... 
. ...   ... 
. ...   ... 

I tried to force the data into a DataFrame with the code below, but it does not produce the expected output.

for i in crimes: 
    for x in i: 
     print pd.DataFrame([x['category'], x['location']['latitude'], x['location']['longitude']]) 

I am quite new to Python, so any links or tips to help me build this DataFrame would be greatly appreciated!

Answer

You are on the right track, but you are creating a new DataFrame for every row and not supplying the correct columns. The following snippet should work:

import pandas as pd

crimes = [[{u'category': u'anti-social-behaviour',u'location': {u'latitude': u'52.309886', 
u'longitude': u'0.496902'},u'month': u'2015-01'},{u'category': u'anti-social-behaviour',u'location': {u'latitude': u'52.306209', 
u'longitude': u'0.490475'},u'month': u'2015-02'}]] 

# flatten the nested list into one [category, latitude, longitude] row per record
formatted_crimes = [[x['category'], x['location']['latitude'], x['location']['longitude']] for i in crimes for x in i] 

# now pass the formatted list to DataFrame and label the columns 
df = pd.DataFrame(formatted_crimes, columns=['Category', 'Latitude', 'Longitude']) 

The result is:

                Category   Latitude  Longitude
0  anti-social-behaviour  52.309886   0.496902
1  anti-social-behaviour  52.306209   0.490475
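
Note that the latitude and longitude values in the source data are strings, so the columns in the result above hold text rather than numbers. If numeric coordinates are wanted, one possible variation is sketched below. It is not part of the original answer: it assumes Python 3 and a pandas version that provides `pd.json_normalize` (1.0 or later), and the names `records` and `flat` are purely illustrative.

```python
from itertools import chain

import pandas as pd

crimes = [[{'category': 'anti-social-behaviour',
            'location': {'latitude': '52.309886', 'longitude': '0.496902'},
            'month': '2015-01'},
           {'category': 'anti-social-behaviour',
            'location': {'latitude': '52.306209', 'longitude': '0.490475'},
            'month': '2015-02'}]]

# Flatten the outer list so that each dictionary becomes one record.
# ("records" is an illustrative name, not from the original answer.)
records = list(chain.from_iterable(crimes))

# json_normalize expands the nested 'location' dict into dotted column
# names such as 'location.latitude'.
flat = pd.json_normalize(records)

# Keep only the columns of interest and rename them to match the answer.
df = flat[['category', 'location.latitude', 'location.longitude']].rename(
    columns={'category': 'Category',
             'location.latitude': 'Latitude',
             'location.longitude': 'Longitude'})

# The coordinates arrive as strings; convert them to floats.
df = df.astype({'Latitude': float, 'Longitude': float})
```

Whether this is preferable to the list comprehension is a matter of taste: the comprehension is shorter for three fields, while `json_normalize` may scale better when the dictionaries have many nested keys.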