2017-08-25 199 views

Operating on rows that share the same column value

I have a pandas DataFrame:

import pandas as pd 
import numpy as np 

df1 = pd.DataFrame({'Start': [0, 71, 0, 0, 12, 56],
                    'End': [70, 88, 10, 11, 55, 90],
                    'Value': [1, 0, 1, 1, 0, 1],
                    'Name': ['A', 'A', 'B', 'C', 'C', 'C']},
                   index=[0, 1, 2, 3, 4, 5])

For each name ('A', 'B', 'C') I want to compute some operations...

How can I do the following in a more Pythonic way?

df3 = pd.DataFrame(columns=['value1','value2','value3']) 


unique_names = list(df1['Name'].unique()) 

for name in unique_names:
    # Select the rows that belong to the current name
    df = df1.loc[df1['Name'] == name]

    value1 = operation1(df)
    value2 = operation2(df)
    value3 = operation3(df)

    # Build a one-row frame from the three results and append it to df3
    df_temp = pd.DataFrame(np.array([value1, value2, value3]).reshape(-1, 3),
                           columns=['value1', 'value2', 'value3'])
    df3 = pd.concat([df3, df_temp], ignore_index=True)[df3.columns.tolist()]

Answer


This is where df.groupby().apply() comes in.

Say, for example, that your three operations are as follows:

# Sum of all Start values
operation1 = lambda x: x.Start.sum()
# Sum of all End values
operation2 = lambda x: x.End.sum()
# Mean of Value
operation3 = lambda x: x.Value.mean()

You can then achieve exactly what you are after in a single operation:

df3 = df1.groupby('Name').apply(
    lambda x: pd.Series(
        [operation1(x), operation2(x), operation3(x)],
        index=['value1', 'value2', 'value3']
    ))

This returns:

      value1  value2    value3
Name
A       71.0   158.0  0.500000
B        0.0    10.0  1.000000
C       68.0   156.0  0.666667
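
If each operation is a simple reduction of a single column, as in the sum and mean example above, the same table can probably be built without apply() by using named aggregation. The following is only a sketch under two assumptions that are not in the original question: that pandas 0.25 or later is installed (named aggregation was added in that release), and that operation1 to operation3 really are the per-column sum/sum/mean used for illustration. Operations that need the whole sub-frame at once would still call for groupby().apply().

```python
import pandas as pd

df1 = pd.DataFrame({
    'Start': [0, 71, 0, 0, 12, 56],
    'End': [70, 88, 10, 11, 55, 90],
    'Value': [1, 0, 1, 1, 0, 1],
    'Name': ['A', 'A', 'B', 'C', 'C', 'C'],
})

# Named aggregation: output_column=(input_column, aggregation).
# Assumption: the three operations are the illustrative sum/sum/mean.
df3 = df1.groupby('Name').agg(
    value1=('Start', 'sum'),
    value2=('End', 'sum'),
    value3=('Value', 'mean'),
)

print(df3)
```

Unlike the apply() version, which puts all three results into one Series per group and so upcasts the sums to float, this keeps each output column's own dtype, so the two sums should stay integers.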