2014-05-11 141 views

Groupby and sum operations on a pandas DataFrame

I have a pandas DataFrame:

 Date  Type  Section  Status 
-------------------------------------------- 
0  1-Apr Type1  A   Present 
1  1-Apr Type2  A   Absent 
2  1-Apr Type2  A   Present 
3  1-Apr Type1  B   Absent 
4  2-Apr Type1  A   Present 
5  2-Apr Type2  C   Present 
6  2-Apr Type2  C   Present  

I want to group the DataFrame into a somewhat different format:

 Date  Type  A_Pre A_Abs B_Pre B_Abs C_Pre C_Abs 
------------------------------------------------------------------------------ 
0  1-Apr Type1  1 0  0  1  0  0 
1    Type2  1 1  0  0  0  0 
2  2-Apr Type1  1 0  0  0  0  0   
3    Type2  0 0  0  0  1  1   

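I want to aggregate the entries of the original table, grouped by Date and Type, and then split the counts out into one column per Section and Status. After two days of trying, I still don't know how to approach this.

Any help would be greatly appreciated.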

This is almost a duplicate of: http://stackoverflow.com/questions/23580009/data-processing-with-adding-columns-dynamically-in-python-pandas-dataframe – Jeff

Thanks for the link, I'll check it out. If I get my answer, I will update it here. Thanks a lot. –

Actually, your problem may be simpler. Try: df.groupby(..)['Status'].apply(pd.get_dummies) – Jeff
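For anyone following the get_dummies suggestion, here is a minimal sketch of one way it could be applied to the sample data from the question. It is a variation on the idea, not necessarily what the commenter had in mind: the combined `label` Series, the `wanted` column list, and the three-letter abbreviations built with `.str[:3]` are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["1-Apr", "1-Apr", "1-Apr", "1-Apr", "2-Apr", "2-Apr", "2-Apr"],
    "Type": ["Type1", "Type2", "Type2", "Type1", "Type1", "Type2", "Type2"],
    "Section": ["A", "A", "A", "B", "A", "C", "C"],
    "Status": ["Present", "Absent", "Present", "Absent",
               "Present", "Present", "Present"],
})

# Assumed naming scheme: "<Section>_<first three letters of Status>",
# e.g. "A_Pre" or "B_Abs".
label = df["Section"] + "_" + df["Status"].str[:3]

# One 0/1 indicator column per label that occurs in the data.
dummies = pd.get_dummies(label).astype(int)

# Sum the indicators within each (Date, Type) group.
counts = dummies.groupby([df["Date"], df["Type"]]).sum()

# Labels that never occur (here B_Pre and C_Abs) get an all-zero column.
wanted = ["A_Pre", "A_Abs", "B_Pre", "B_Abs", "C_Pre", "C_Abs"]
counts = counts.reindex(columns=wanted, fill_value=0)

print(counts)
```

On the sample data, the 2-Apr / Type2 row ends up with C_Pre = 2, since both of those rows are Present in section C.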

Answers

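First, I would create the columns you want to aggregate, filled with zeros and ones, and then use groupby and do a simple sum of the values...

I didn't get to try this out, but I think the following should work: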

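import pandas as pd

# DF is the DataFrame from the question (columns: Date, Type, Section, Status).
Present = ['A_Pre', 'B_Pre', 'C_Pre']
Absent = ['A_Abs', 'B_Abs', 'C_Abs']

# 1 where the row is Present in the section given by the column's first letter, else 0.
for string in Present:
    DF[string] = pd.Series([1 if stat == 'Present' and sect == string[0] else 0
                            for stat, sect in zip(DF['Status'], DF['Section'])],
                           index=DF.index)
# Same for the Absent columns.
for string in Absent:
    DF[string] = pd.Series([1 if stat == 'Absent' and sect == string[0] else 0
                            for stat, sect in zip(DF['Status'], DF['Section'])],
                           index=DF.index)

# Sum only the indicator columns within each (Date, Type) group.
# Note that the column is named 'Type', not 'type'.
DF.groupby(['Date', 'Type'])[Present + Absent].sum()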
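
As an alternative to building the indicator columns by hand, the same counts can probably be obtained with pd.crosstab. This is only a sketch under a few assumptions: the DataFrame is rebuilt from the question's sample table, and the `wanted` list and the flattened column names (Section plus the first three letters of Status) are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["1-Apr", "1-Apr", "1-Apr", "1-Apr", "2-Apr", "2-Apr", "2-Apr"],
    "Type": ["Type1", "Type2", "Type2", "Type1", "Type1", "Type2", "Type2"],
    "Section": ["A", "A", "A", "B", "A", "C", "C"],
    "Status": ["Present", "Absent", "Present", "Absent",
               "Present", "Present", "Present"],
})

# Count rows for every (Date, Type) x (Section, Status) combination.
counts = pd.crosstab(
    index=[df["Date"], df["Type"]],
    columns=[df["Section"], df["Status"]],
)

# Flatten the two-level column index into names such as "A_Pre" or "B_Abs".
counts.columns = [f"{sect}_{stat[:3]}" for sect, stat in counts.columns]

# Make sure all six columns exist, filling unseen combinations with 0.
wanted = ["A_Pre", "A_Abs", "B_Pre", "B_Abs", "C_Pre", "C_Abs"]
counts = counts.reindex(columns=wanted, fill_value=0)

print(counts)
```

Calling `counts.reset_index()` afterwards turns Date and Type back into ordinary columns, which is closer to the layout shown in the question.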