2017-03-04 37 views

Create a new DataFrame, adding each key of the dicts in a column as a header

I have a DataFrame in which one particular column contains dictionaries.

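For every key found in any element of the dict-typed column, I would like to add a new column header. Each new cell should hold the value stored under that key, or None if that row's dictionary does not contain the key.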

Here is some data to test with and to illustrate what I mean:

Import the dependencies:

import pandas as pd 
import numpy as np 

Create a dictionary in which one entry is a list of inner dictionaries:

data = {'string_info': ['User1', 'User2', 'User3'],
        'dict_info': [{'elm1': 'attr5', 'elm2': 'attr9', 'elm3': 'attr33'},
                      {'elm5': 'attr31', 'elm7': 'attr13'},
                      {'elm5': 'attr28', 'elm1': 'attr23', 'elm2': 'attr33', 'elm6': 'attr33'}],
        'int_info': [4, 24, 31]}

Create the initial DataFrame to test with:

df = pd.DataFrame.from_dict(data) 
df 

A hand-written illustration of the output I want:

data2 = {'string_info': ['User1', 'User2', 'User3'],
         'elm1': ['attr5', None, 'attr23'],
         'elm2': ['attr9', None, 'attr33'],
         'elm3': ['attr33', None, None],
         'elm4': [None, None, None],  # no dict contains elm4, so every cell is None
         'elm5': [None, 'attr31', 'attr28'],
         'elm6': [None, None, 'attr33'],
         'elm7': [None, 'attr13', None],
         'int_info': [4, 24, 31]}

The desired output would be:

df2 = pd.DataFrame.from_dict(data2) 
df2 

Thanks!

Answers


You can use concat together with the DataFrame constructor to expand the dicts into columns:

print(pd.DataFrame(df.dict_info.values.tolist()))
     elm1    elm2    elm3    elm5    elm6    elm7
0   attr5   attr9  attr33     NaN     NaN     NaN
1     NaN     NaN     NaN  attr31     NaN  attr13
2  attr23  attr33     NaN  attr28  attr33     NaN

print(pd.concat([pd.DataFrame(df.dict_info.values.tolist()),
                 df[['int_info', 'string_info']]], axis=1))
     elm1    elm2    elm3    elm5    elm6    elm7  int_info string_info
0   attr5   attr9  attr33     NaN     NaN     NaN         4       User1
1     NaN     NaN     NaN  attr31     NaN  attr13        24       User2
2  attr23  attr33     NaN  attr28  attr33     NaN        31       User3
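The same idea can be sketched as a self-contained script that also puts string_info first and int_info last, as in the question's desired layout. Two details here are assumptions rather than part of the answer above: passing index=df.index so the expanded rows still line up if df does not have the default 0..n-1 index, and sorting the new columns by name, since recent pandas versions may keep the keys in first-seen order instead.

```python
import pandas as pd

data = {'string_info': ['User1', 'User2', 'User3'],
        'dict_info': [{'elm1': 'attr5', 'elm2': 'attr9', 'elm3': 'attr33'},
                      {'elm5': 'attr31', 'elm7': 'attr13'},
                      {'elm5': 'attr28', 'elm1': 'attr23', 'elm2': 'attr33', 'elm6': 'attr33'}],
        'int_info': [4, 24, 31]}
df = pd.DataFrame(data)

# Expand each dict into its own row of columns; reuse df's index so that
# concat aligns rows correctly (assumption: df may have a non-default index).
expanded = pd.DataFrame(df['dict_info'].tolist(), index=df.index)

# Sort the new columns by name so the order does not depend on the pandas version.
expanded = expanded.sort_index(axis=1)

# Place the expanded columns between the two original columns.
result = pd.concat([df[['string_info']], expanded, df[['int_info']]], axis=1)
print(result)
```

Keys missing from a row's dict show up as NaN in this sketch.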

If you need None instead of NaN, add replace:

print(pd.concat([pd.DataFrame(df.dict_info.values.tolist()).replace({np.nan: None}),
                 df[['int_info', 'string_info']]], axis=1))
     elm1    elm2    elm3    elm5    elm6    elm7  int_info string_info
0   attr5   attr9  attr33    None    None    None         4       User1
1    None    None    None  attr31    None  attr13        24       User2
2  attr23  attr33    None  attr28  attr33    None        31       User3
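The question's illustration also lists an elm4 column that appears in none of the dicts, and the constructor only creates columns for keys it actually sees. A minimal sketch of one way to get such a column, assuming the full set of wanted keys is known in advance (the all_keys list below is a hypothetical name, not something from the answer): look each key up with dict.get, which returns None when the key is missing.

```python
import pandas as pd

data = {'string_info': ['User1', 'User2', 'User3'],
        'dict_info': [{'elm1': 'attr5', 'elm2': 'attr9', 'elm3': 'attr33'},
                      {'elm5': 'attr31', 'elm7': 'attr13'},
                      {'elm5': 'attr28', 'elm1': 'attr23', 'elm2': 'attr33', 'elm6': 'attr33'}],
        'int_info': [4, 24, 31]}
df = pd.DataFrame(data)

# Hypothetical: every header we want, including elm4, which no dict contains.
all_keys = ['elm1', 'elm2', 'elm3', 'elm4', 'elm5', 'elm6', 'elm7']

# dict.get(k) yields None for absent keys, so every row gets every column.
rows = [{k: d.get(k) for k in all_keys} for d in df['dict_info']]
expanded = pd.DataFrame(rows, index=df.index, columns=all_keys)

result = pd.concat([df[['string_info']], expanded, df[['int_info']]], axis=1)
print(result)
```

Because the missing values are filled in before the DataFrame is built, this variant does not need a separate replace step for the columns that hold strings.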

Thank you very much, it works! I will definitely read up more on pd.concat, thanks! – EduGord
