2017-10-15 247 views

I want to parse a list that contains nested lists into a pandas DataFrame.

Here is a sample item from the list:

>>>result[1] 
{ 
    "account_currency": "BRL", 
    "account_id": "1600343406676896", 
    "account_name": "aaa", 
    "buying_type": "AUCTION", 
    "campaign_id": "aaa", 
    "campaign_name": "aaaL", 
    "canvas_avg_view_percent": "0", 
    "canvas_avg_view_time": "0", 
    "clicks": "1", 
    "cost_per_total_action": "8.15", 
    "cpm": "60.820896", 
    "cpp": "61.278195", 
    "date_start": "2017-10-08", 
    "date_stop": "2017-10-15", 
    "device_platform": "desktop", 
    "frequency": "1.007519", 
    "impression_device": "desktop", 
    "impressions": "134", 
    "inline_link_clicks": "1", 
    "inline_post_engagement": "1", 
    "objective": "CONVERSIONS", 
    "outbound_clicks": [ 
     { 
      "action_type": "outbound_click", 
      "value": "1" 
     } 
    ], 
    "platform_position": "feed", 
    "publisher_platform": "facebook", 
    "reach": "133", 
    "social_clicks": "1", 
    "social_impressions": "91", 
    "social_reach": "90", 
    "spend": "8.15", 
    "total_action_value": "0", 
    "total_actions": "1", 
    "total_unique_actions": "1", 
    "unique_actions": [ 
     { 
      "action_type": "landing_page_view", 
      "value": "1" 
     }, 
     { 
      "action_type": "link_click", 
      "value": "1" 
     }, 
     { 
      "action_type": "page_engagement", 
      "value": "1" 
     }, 
     { 
      "action_type": "post_engagement", 
      "value": "1" 
     } 
    ], 
    "unique_clicks": "1", 
    "unique_inline_link_clicks": "1", 
    "unique_outbound_clicks": [ 
     { 
      "action_type": "outbound_click", 
      "value": "1" 
     } 
    ], 
    "unique_social_clicks": "1" 
} 

When I convert it into a pandas DataFrame, I get:

>>>df = pd.DataFrame(result) 
>>>df 
.... 

unique_actions \ 
NaN 
[{u'value': u'1', u'action_type': u'landing_pa... 
NaN 
[{u'value': u'2', u'action_type': u'landing_pa... 
[{u'value': u'4', u'action_type': u'landing_pa... 
NaN 

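The unique_actions column and some of the other nested fields are not normalized.

How can I normalize them to the same granularity as the other columns?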

What do you mean by "normalize to the same granularity"? What exactly do you want your result to look like?

Your structure is actually a JSON document. – Parfait

@Parfait Got it. How can I expand it into separate, transposed columns?

Answers

1

You can use json_normalize, like this:

pd.io.json.json_normalize(df.unique_actions) 

I get this error: AttributeError: 'float' object has no attribute 'itervalues'
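
The error in the comment above is consistent with json_normalize reaching the NaN cells of unique_actions, which are floats rather than lists of dicts. A minimal sketch of one way around it follows, assuming a recent pandas (1.0 or later, where `pd.json_normalize` and `DataFrame.explode` are available). The small `df` here is a hypothetical stand-in for the asker's frame, not their real data.

```python
import pandas as pd

# Hypothetical sample shaped like the question's column:
# some cells hold a list of dicts, others are NaN.
df = pd.DataFrame({
    "campaign_id": ["a", "b", "c"],
    "unique_actions": [
        float("nan"),
        [{"action_type": "landing_page_view", "value": "1"},
         {"action_type": "link_click", "value": "1"}],
        [{"action_type": "landing_page_view", "value": "2"}],
    ],
})

# Keep only the rows whose cell really is a list (drops the NaN rows).
has_actions = df["unique_actions"].apply(lambda v: isinstance(v, list))
actions = df.loc[has_actions]

# One row per action dict, with the other columns repeated alongside it.
exploded = actions.explode("unique_actions").reset_index(drop=True)

# Turn each dict into columns and attach them to the repeated columns.
flat = pd.concat(
    [exploded.drop(columns="unique_actions"),
     pd.json_normalize(exploded["unique_actions"].tolist())],
    axis=1,
)
```

Rows without any unique_actions are dropped in this sketch; whether they should instead be kept as rows with empty action columns depends on what "same granularity" means for the analysis.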

1

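Consider json_normalize, passing the nested list as record_path and all the other fields as meta. However, because you have multiple nested lists, the JSON will be split across three DataFrames: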

from pandas.io.json import json_normalize 


merge_fields = ['account_currency', 'account_id', 'account_name', 'buying_type', 'campaign_id', 
       'campaign_name', 'canvas_avg_view_percent', 'canvas_avg_view_time', 'clicks', 
       'cost_per_total_action', 'cpm', 'cpp', 'date_start', 'date_stop', 'device_platform', 
       'frequency', 'impression_device', 'impressions', 'inline_link_clicks', 'inline_post_engagement', 
       'objective', 'platform_position', 'publisher_platform', 'reach', 'social_clicks', 'social_impressions', 
       'social_reach', 'spend', 'total_action_value', 'total_actions', 'total_unique_actions', 
       'unique_clicks', 'unique_inline_link_clicks', 'unique_social_clicks'] 


unique_actions_df = json_normalize(result[1], record_path='unique_actions', meta=merge_fields) 

outbound_clicks_df = json_normalize(result[1], record_path='outbound_clicks', meta=merge_fields) 

unique_outbound_clicks_df = json_normalize(result[1], record_path='unique_outbound_clicks', meta=merge_fields) 

I get TypeError: string indices must be integers

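Are you passing exactly what you posted, result[1], or some other item? If the structure is the same across the list items, you will need to iterate over 'result'. – Parfait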
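
Iterating over 'result', as suggested in the comment above, might look like the sketch below. It assumes every item of `result` is a plain dict and that some items lack the nested key, as the NaN cells in the question suggest. The `result` list, the shortened `meta` list, and the helper name `normalize_path` are hypothetical and only for illustration. `pd.json_normalize` requires pandas 1.0 or later; older versions expose it as `pandas.io.json.json_normalize`.

```python
import pandas as pd

# Hypothetical stand-in for the question's `result` list.
result = [
    {"account_id": "1", "clicks": "1"},  # no unique_actions key
    {"account_id": "2", "clicks": "3",
     "unique_actions": [
         {"action_type": "landing_page_view", "value": "2"},
         {"action_type": "link_click", "value": "3"},
     ]},
]

# Shortened meta list; the answer's full merge_fields would go here.
meta = ["account_id", "clicks"]


def normalize_path(records, record_path, meta):
    """Flatten one nested list field across all records that contain it."""
    usable = [rec for rec in records if isinstance(rec.get(record_path), list)]
    if not usable:
        return pd.DataFrame()
    return pd.json_normalize(usable, record_path=record_path, meta=meta)


unique_actions_df = normalize_path(result, "unique_actions", meta)
```

The same helper could be called once per nested field ('outbound_clicks', 'unique_outbound_clicks') to obtain the three DataFrames the answer describes. If the items are not plain dicts (for example strings or SDK objects), they would need converting first, which is one possible cause of the "string indices must be integers" error.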