2013-08-27

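Pandas: replicating SAS's proc means by attribute, out=agg

I have a function foo that operates on a DataFrame; specifically, on two columns of the DataFrame. So, something like: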

def foo(group): 
    A = group['A'] 
    B = group['B'] 
    r1 = somethingfancy(A,B) #this is now a float 
    r2 = somethinggreat(A,B) #this is another float 
    return {'fancy':r1,'great':r2} 

I would like to use this function in the following context:

grouped = otherDF[['someAttribute','A','B']].groupby(['someAttribute']) 
agg = grouped.apply(foo) 

The problem is that agg is now a Series of dicts. I would like to convert it into a DataFrame that looks basically like this:

someAttribute, fancy, great 
...   , ... , ... 

Answer

Instead of returning a dict, return a Series:

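import numpy as np
from numpy.random import randn
from pandas import DataFrame, Series

def foo(group):
    A = group['A']
    B = group['B']
    r1 = randn()  # stands in for somethingfancy(A, B)
    r2 = randn()  # stands in for somethinggreat(A, B)
    # Returning a Series makes apply build one column per key
    return Series({'fancy': r1, 'great': r2})

df = DataFrame(randn(10, 1), columns=['a'])
df['B'] = np.random.choice(['hot', 'cold'], size=10)
df['A'] = np.random.choice(['sweet', 'sour'], size=10)
df['someAttribute'] = np.random.choice(['pretty', 'ugly'], size=10)
print(df.groupby('someAttribute').apply(foo))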

               fancy  great
someAttribute
pretty         -2.35   0.01
ugly            1.09  -1.09
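
Because the example above uses random numbers, its output changes on every run. Here is a minimal deterministic sketch of the same pattern. The two computations are hypothetical stand-ins for somethingfancy and somethinggreat (the question does not say what they do), and the data is made up for illustration. Selecting [['A', 'B']] before apply is an assumption added here so that foo only sees the two columns it needs.

```python
import pandas as pd

def foo(group):
    A = group['A']
    B = group['B']
    # Hypothetical stand-ins for somethingfancy / somethinggreat
    r1 = float((A * B).sum())
    r2 = float((A - B).mean())
    # One Series per group: its keys become the result's columns
    return pd.Series({'fancy': r1, 'great': r2})

# Made-up data for illustration only
df = pd.DataFrame({
    'someAttribute': ['pretty', 'pretty', 'ugly', 'ugly'],
    'A': [1.0, 2.0, 3.0, 4.0],
    'B': [0.5, 1.5, 1.0, 2.0],
})

# Restrict each group to the columns foo actually uses
agg = df.groupby('someAttribute')[['A', 'B']].apply(foo)
print(agg)
```

The result is a DataFrame indexed by someAttribute with one row per group and the columns fancy and great.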

If you want someAttribute to be a column in the result, call reset_index on the result:

df.groupby('someAttribute').apply(foo).reset_index() 

which gives:

  someAttribute  fancy  great
0        pretty   0.46  -1.08
1          ugly   0.76   0.29
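
If foo has to keep returning a dict, one possible alternative (not part of the answer above) is to expand the resulting Series of dicts afterwards. The sketch below builds such a Series by hand, with made-up values, to mimic the agg from the question. It then passes the list of dicts to the DataFrame constructor and reuses the original index.

```python
import pandas as pd

# Hand-built stand-in for the Series of dicts described in the question
agg = pd.Series(
    [{'fancy': 3.5, 'great': 0.5}, {'fancy': 11.0, 'great': 2.0}],
    index=pd.Index(['pretty', 'ugly'], name='someAttribute'),
)

# A list of dicts becomes one row per dict, one column per key
expanded = pd.DataFrame(agg.tolist(), index=agg.index)

# Move the group labels from the index into a regular column
out = expanded.reset_index()
print(out)
```

Returning a Series from foo, as shown earlier, avoids this extra step.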

Awesome. Thanks! – tipanverella