2016-08-24 100 views
4

Group by one column and sum the contents of another column with Python

I have a dataframe merged_df_energy:

+------------------------+------------------------+------------------------+--------------+
| ACT_TIME_AERATEUR_1_F1 | ACT_TIME_AERATEUR_1_F3 | ACT_TIME_AERATEUR_1_F5 | class_energy |
+------------------------+------------------------+------------------------+--------------+
| 63.333333              | 63.333333              | 63.333333              | low          |
| 0                      | 0                      | 0                      | high         |
| 45.67                  | 0                      | 55.94                  | high         |
| 0                      | 0                      | 23.99                  | low          |
| 0                      | 20                     | 23.99                  | medium       |
+------------------------+------------------------+------------------------+--------------+

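For each ACT_TIME_AERATEUR_1_Fx column (ACT_TIME_AERATEUR_1_F1, ACT_TIME_AERATEUR_1_F3, ACT_TIME_AERATEUR_1_F5), I want to create a dataframe that contains these columns: class_energy and sum_time.

For example, the dataframe corresponding to ACT_TIME_AERATEUR_1_F1 would be: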

+-----------------+-----------+
| class_energy    | sum_time  |
+-----------------+-----------+
| low             | 63.333333 |
| medium          | 0         |
| high            | 45.67     |
+-----------------+-----------+

I think I should use groupby like this:

data.groupby(by=['class_energy'])['sum_time'].sum() 

Any ideas? Could you help me, please?

Answer

6

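You can pass all the columns to aggregate as a list inside []. Note the double brackets: selecting several columns with a bare tuple of names is no longer accepted by recent pandas versions.

print(df.groupby(by=['class_energy'])[['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3', 'ACT_TIME_AERATEUR_1_F5']].sum())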
       ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \ 
class_energy             
high      45.670000    0.000000 
low      63.333333    63.333333 
medium      0.000000    20.000000 

       ACT_TIME_AERATEUR_1_F5 
class_energy       
high      55.940000 
low      87.323333 
medium      23.990000 

You can also use the parameter as_index=False, which keeps class_energy as a regular column instead of the index:

print(df.groupby(by=['class_energy'], as_index=False)[['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3', 'ACT_TIME_AERATEUR_1_F5']].sum())
    class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \ 
0   high    45.670000    0.000000 
1   low    63.333333    63.333333 
2  medium    0.000000    20.000000 

    ACT_TIME_AERATEUR_1_F5 
0    55.940000 
1    87.323333 
2    23.990000 

If you need to aggregate only the first 3 columns:

print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:3]].sum()) 
    class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \ 
0   high    45.670000    0.000000 
1   low    63.333333    63.333333 
2  medium    0.000000    20.000000 

    ACT_TIME_AERATEUR_1_F5 
0    55.940000 
1    87.323333 
2    23.990000 

...or all columns except the last one:

print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:-1]].sum()) 
    class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \ 
0   high    45.670000    0.000000 
1   low    63.333333    63.333333 
2  medium    0.000000    20.000000 

    ACT_TIME_AERATEUR_1_F5 
0    55.940000 
1    87.323333 
2    23.990000 
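
The question asks for one dataframe per ACT_TIME_AERATEUR_1_Fx column, each with the columns class_energy and sum_time. The snippets above return all sums in a single frame, so here is a rough sketch of one way to split them up. The sample data is rebuilt from the question, and `time_cols` and `frames` are names I made up for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ACT_TIME_AERATEUR_1_F1": [63.333333, 0.0, 45.67, 0.0, 0.0],
    "ACT_TIME_AERATEUR_1_F3": [63.333333, 0.0, 0.0, 0.0, 20.0],
    "ACT_TIME_AERATEUR_1_F5": [63.333333, 0.0, 55.94, 23.99, 23.99],
    "class_energy": ["low", "high", "high", "low", "medium"],
})

# All columns except the last one (class_energy), as in the answer.
time_cols = list(df.columns[:-1])

# One small frame per time column: sum within each class,
# name the result sum_time, and turn class_energy back into a column.
frames = {
    col: df.groupby("class_energy")[col].sum().rename("sum_time").reset_index()
    for col in time_cols
}

print(frames["ACT_TIME_AERATEUR_1_F1"])
```

Each value in `frames` has exactly the two requested columns. groupby sorts the group keys by default, so the rows come out as high, low, medium rather than in the order shown in the question.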
+0

Thank you very much for your help :) Kind regards – Poisson

+1

Thank you for upvoting. May I edit your question to make it more readable? – jezrael

+0

Of course :) – Poisson