2017-04-11 120 views
0

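Subset a tuple-keyed dictionary into named dictionaries based on a key

I have a dictionary (df_dict) keyed by tuples (key0, key1) that holds several DataFrames, each with the columns date and accountNum. I want to subset df_dict and generate the name of each resulting dictionary from key0.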

df_dict = {('100', '001'): date, accountNum, ('100', '002'): date, accountNum, 
      ('200', '001'): date, accountNum, ('200', '002'): date, accountNum} 

The DataFrames in df_dict look like this:

('100','001')-DataFrame ('100','002')-DataFrame ('200','001')-DataFrame 
date  accountNum date  accountNum date  accountNum 
2010-01-01  280  2010-02-01  150  2010-03-01  330 
2010-01-02  285  2010-02-02  155  2010-03-02  335 
2010-01-03  290  2010-02-03  160  2010-03-03  340 

('200','002')-DataFrame 
date  accountNum 
2010-04-01  510 
2010-04-02  515 
2010-04-03  520 

I expect a result like this:

df_dict_100 = {('100', '001'): date, accountNum, ('100','002'): date, accountNum} 
df_dict_200 = {('200', '001'): date, accountNum, ('200','002'): date, accountNum} 

and the DataFrames in each dictionary would look like this:

df_dict_100 
('100','001')-DataFrame ('100','002')-DataFrame 
date  accountNum date  accountNum 
2010-01-01  280  2010-02-01  150  
2010-01-02  285  2010-02-02  155  
2010-01-03  290  2010-02-03  160  

df_dict_200 
('200','001')-DataFrame ('200','002')-DataFrame 
date  accountNum date  accountNum 
2010-03-01  330  2010-04-01  510 
2010-03-02  335  2010-04-02  515 
2010-03-03  340  2010-04-03  520 

This is what I tried:

my_list = ['100','200'] 
subset_dict = {k: v for k, v in df_dict.items() if k[0] in my_list} 

But it seems I just get back exactly the same dictionary as df_dict.

+2

I assume `date` and `accountNum` are meant to be a tuple under each key? Your current syntax would raise an error. – Wright

+0

`df_dict = {('100', '001'): date, accountNum, ('100', '002'): date, accountNum,` gives `SyntaxError: invalid syntax` at `accountNum`. – martineau

Answers

0

You can convert the first form into the second by building a multi-level dictionary. So instead of df_dict_100, you would have new_dict['100'], like this:

import pprint 

date, accountNum = 'date', 'accountNum' 
df_dict = {('100', '001'): (date, accountNum), ('100', '002'): (date, accountNum), 
      ('200', '001'): (date, accountNum), ('200', '002'): (date, accountNum)} 

new_dict = dict() 
for key, value in df_dict.items(): 
    new_dict.setdefault(key[0], {})[key] = value 

pprint.pprint(new_dict) 

So the result is:

{'100': {('100', '001'): ('date', 'accountNum'), 
     ('100', '002'): ('date', 'accountNum')}, 
'200': {('200', '001'): ('date', 'accountNum'), 
     ('200', '002'): ('date', 'accountNum')}} 

To access an individual item, you could use syntax like this:

print(new_dict['100']['100', '001'][0]) 
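As a minimal sketch of that access pattern (the string tuples are placeholders standing in for the real DataFrames, which is an assumption carried over from the example above), the following shows that the two-part subscript is just a tuple key, and how to loop over one group:

```python
# Placeholder values stand in for the DataFrames.
df_dict = {
    ('100', '001'): ('date', 'accountNum'),
    ('100', '002'): ('date', 'accountNum'),
    ('200', '001'): ('date', 'accountNum'),
    ('200', '002'): ('date', 'accountNum'),
}

# Group the entries by the first element of each tuple key.
new_dict = {}
for key, value in df_dict.items():
    new_dict.setdefault(key[0], {})[key] = value

# The subscript ['100', '001'] is shorthand for [('100', '001')],
# so both lookups return the very same object.
same_entry = new_dict['100']['100', '001'] is new_dict['100'][('100', '001')]

# Walk every entry that belongs to the '200' group.
keys_in_200 = sorted(new_dict['200'])
for key, value in new_dict['200'].items():
    print(key, value)
```

With real DataFrames the lookups work the same way; only the values change.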

If you prefer a dictionary comprehension, try this:

subset_dict = { 
    matching_key : { 
     k: v for k, v in df_dict.items() if k[0] == matching_key } 
    for matching_key in set(k[0] for k in df_dict) 
} 
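Note that the comprehension rescans df_dict once for every distinct first key. If that matters, a single pass with `collections.defaultdict` is one alternative. This is only a sketch, again with placeholder tuples in place of the DataFrames; the name `grouped` is my own choice:

```python
from collections import defaultdict

# Placeholder values stand in for the DataFrames.
df_dict = {
    ('100', '001'): ('date', 'accountNum'),
    ('100', '002'): ('date', 'accountNum'),
    ('200', '001'): ('date', 'accountNum'),
    ('200', '002'): ('date', 'accountNum'),
}

# One pass over df_dict: each entry is filed under its first key element.
grouped = defaultdict(dict)
for key, value in df_dict.items():
    grouped[key[0]][key] = value

# Convert back to a plain dict so a missing group raises KeyError
# instead of silently creating an empty entry.
subset_dict = dict(grouped)
```

The result has the same shape as the comprehension version: one inner dictionary per first key.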

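In the comments, the OP asked, "May I know how to generate two dictionaries, rather than two dictionaries inside one dictionary?" Something like this should work: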

df_dict_100 = { k: v for k, v in df_dict.items() if k[0] == '100' } 
df_dict_200 = { k: v for k, v in df_dict.items() if k[0] == '200' } 

Putting this into a for loop, here is a complete program:

import pprint 

date, accountNum = 'date', 'accountNum' 
df_dict = {('100', '001'): (date, accountNum), ('100', '002'): (date, accountNum), 
      ('200', '001'): (date, accountNum), ('200', '002'): (date, accountNum)} 

my_list = ['100', '200'] 
for i in my_list: 
    new_df_dict = { k: v for k, v in df_dict.items() if k[0] == i } 
    pprint.pprint(new_df_dict) 
    print("----") 

Here is the output:

{('100', '001'): ('date', 'accountNum'), 
('100', '002'): ('date', 'accountNum')} 
---- 
{('200', '001'): ('date', 'accountNum'), 
('200', '002'): ('date', 'accountNum')} 
---- 
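
If you want both named dictionaries in one statement without repeating the comprehension, one possible approach is to build them in a list and unpack it. This is a sketch rather than the only way to do it; `split_by_first_key` is a hypothetical helper name I made up, and the tuples are still placeholders for the DataFrames:

```python
def split_by_first_key(d, first_keys):
    """Return one sub-dictionary per entry of first_keys, in the same order."""
    return [
        {k: v for k, v in d.items() if k[0] == first}
        for first in first_keys
    ]


# Placeholder values stand in for the DataFrames.
df_dict = {
    ('100', '001'): ('date', 'accountNum'),
    ('100', '002'): ('date', 'accountNum'),
    ('200', '001'): ('date', 'accountNum'),
    ('200', '002'): ('date', 'accountNum'),
}

# Tuple unpacking assigns both names at once.
df_dict_100, df_dict_200 = split_by_first_key(df_dict, ['100', '200'])
```

This only works when you know the first keys ahead of time. If they are not known in advance, the nested dictionary is usually easier to work with than generated variable names.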
+0

Thanks, Rob. This is great! But may I know how to generate two separate dictionaries, rather than two dictionaries inside one dictionary (subset_dict)? – Peggy

+0

Sure, just unroll the outer loop. See my latest edit. –

+0

Is it possible to generate both dictionaries at once, so that we don't need to write `df_dict_100 = {k: v for k, v in df_dict.items() if k[0] == '100'}` and `df_dict_200 = {k: v for k, v in df_dict.items() if k[0] == '200'}` separately? – Peggy