2017-04-21

Pandas: calculating the difference from a grouped mean

I have sensor data for multiple sensors, by month and year:

import pandas as pd 
df = pd.DataFrame([ 
['A', 'Jan', 2015, 13], 
['A', 'Feb', 2015, 10], 
['A', 'Jan', 2016, 12], 
['A', 'Feb', 2016, 11], 
['B', 'Jan', 2015, 7], 
['B', 'Feb', 2015, 8], 
['B', 'Jan', 2016, 4], 
['B', 'Feb', 2016, 9] 
], columns = ['sensor', 'month', 'year', 'value']) 

In [2]: df 
Out[2]: 
  sensor month  year  value
0      A   Jan  2015     13
1      A   Feb  2015     10
2      A   Jan  2016     12
3      A   Feb  2016     11
4      B   Jan  2015      7
5      B   Feb  2015      8
6      B   Jan  2016      4
7      B   Feb  2016      9

I calculate the average for each sensor and month with groupby:

month_avg = df.groupby(['sensor', 'month']).mean()['value'] 

In [3]: month_avg 
Out[3]: 
sensor  month
A       Feb      10.5
        Jan      12.5
B       Feb       8.5
        Jan       5.5

Now I want to add a column to df with each row's difference from the monthly average, like this:

  sensor month  year  value  diff_from_avg
0      A   Jan  2015     13            1.5
1      A   Feb  2015     10            2.5
2      A   Jan  2016     12            0.5
3      A   Feb  2016     11            0.5
4      B   Jan  2015      7            2.5
5      B   Feb  2015      8            0.5
6      B   Jan  2016      4           -1.5
7      B   Feb  2016      9           -0.5

I tried multi-indexing df to match the grouped averages and doing a simple subtraction, but no good:

df = df.set_index(['sensor', 'month']) 
df['diff_from_avg'] = month_avg - df.value 

Thanks for any advice.

Answers

4 votes

Use 'assign' plus 'transform':

diff_from_avg=df.value - df.groupby(['sensor', 'month']).value.transform('mean') 
df.assign(diff_from_avg=diff_from_avg) 

  sensor month  year  value  diff_from_avg
0      A   Jan  2015     13            0.5
1      A   Feb  2015     10           -0.5
2      A   Jan  2016     12           -0.5
3      A   Feb  2016     11            0.5
4      B   Jan  2015      7            1.5
5      B   Feb  2015      8           -0.5
6      B   Jan  2016      4           -1.5
7      B   Feb  2016      9            0.5
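The reason this works: 'transform' returns a Series with the same index and length as df (one group mean per row), unlike a plain grouped mean(), whose index is the group keys. A self-contained sketch of the approach on the question's data:

```python
import pandas as pd

# Rebuild the question's frame so the sketch is self-contained
df = pd.DataFrame([
    ['A', 'Jan', 2015, 13], ['A', 'Feb', 2015, 10],
    ['A', 'Jan', 2016, 12], ['A', 'Feb', 2016, 11],
    ['B', 'Jan', 2015, 7],  ['B', 'Feb', 2015, 8],
    ['B', 'Jan', 2016, 4],  ['B', 'Feb', 2016, 9],
], columns=['sensor', 'month', 'year', 'value'])

# transform('mean') broadcasts each group's mean back onto that group's
# rows, preserving df's index, so the subtraction is row-aligned
group_mean = df.groupby(['sensor', 'month'])['value'].transform('mean')
out = df.assign(diff_from_avg=df['value'] - group_mean)
print(out['diff_from_avg'].tolist())  # [0.5, -0.5, -0.5, 0.5, 1.5, -0.5, -1.5, 0.5]
```

Note that 'assign' returns a new frame and leaves df untouched, which is why it chains well.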
+1

Of course! Too quick for me! I should start using 'assign', if only to write answers faster! –

+0

This looks good, but I get an unhelpful error on the first line: 'AttributeError: 'NoneType' object has no attribute 'transform''. Any idea what this might mean? – robroc

+0

@juanpa.arrivillaga I use 'assign' because I don't like clobbering 'df' .. especially when I may well chain more operations. – piRSquared

0 votes

You need to set a new index on the dataframe that aligns with the grouped series; then you can subtract directly:

df.set_index(['sensor', 'month'], inplace=True) 
df['diff'] = df['value'] - month_avg 
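A runnable sketch of this index-alignment idea on the question's data. The sort_index call and the trailing reset_index are my additions: with duplicate (sensor, month) labels in the index, alignment against the unique-index month_avg can reorder rows (or fail to assign back) unless the index is sorted first.

```python
import pandas as pd

df = pd.DataFrame([
    ['A', 'Jan', 2015, 13], ['A', 'Feb', 2015, 10],
    ['A', 'Jan', 2016, 12], ['A', 'Feb', 2016, 11],
    ['B', 'Jan', 2015, 7],  ['B', 'Feb', 2015, 8],
    ['B', 'Jan', 2016, 4],  ['B', 'Feb', 2016, 9],
], columns=['sensor', 'month', 'year', 'value'])
month_avg = df.groupby(['sensor', 'month'])['value'].mean()

# With the grouping keys in the index, subtraction matches each row's
# duplicated (sensor, month) label against the single mean for that group.
# Sorting keeps row order stable through the alignment; reset_index
# restores the flat frame afterwards.
df = df.set_index(['sensor', 'month']).sort_index()
df['diff'] = df['value'] - month_avg
df = df.reset_index()
```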

2 votes

Try:

df['diff_from_avg']=df.groupby(['sensor','month'])['value'].apply(lambda x: x-x.mean()) 
Out[18]: 
  sensor month  year  value  diff_from_avg
0      A   Jan  2015     13            0.5
1      A   Feb  2015     10           -0.5
2      A   Jan  2016     12           -0.5
3      A   Feb  2016     11            0.5
4      B   Jan  2015      7            1.5
5      B   Feb  2015      8           -0.5
6      B   Jan  2016      4           -1.5
7      B   Feb  2016      9            0.5
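Here 'apply' receives each group's 'value' Series, demeans it, and pandas stitches the pieces back together on the original row labels, so the assignment lines up. One caveat worth hedging: on newer pandas (1.5 and later), transform-like apply results gain the group keys as extra index levels unless group_keys=False is passed to groupby; a sketch that accounts for that:

```python
import pandas as pd

df = pd.DataFrame([
    ['A', 'Jan', 2015, 13], ['A', 'Feb', 2015, 10],
    ['A', 'Jan', 2016, 12], ['A', 'Feb', 2016, 11],
    ['B', 'Jan', 2015, 7],  ['B', 'Feb', 2015, 8],
    ['B', 'Jan', 2016, 4],  ['B', 'Feb', 2016, 9],
], columns=['sensor', 'month', 'year', 'value'])

# group_keys=False keeps the returned pieces on the original row index
# instead of prepending the (sensor, month) group labels, so assignment
# back into df aligns row-by-row
demeaned = (df.groupby(['sensor', 'month'], group_keys=False)['value']
              .apply(lambda x: x - x.mean()))
df['diff_from_avg'] = demeaned
print(df['diff_from_avg'].tolist())  # [0.5, -0.5, -0.5, 0.5, 1.5, -0.5, -1.5, 0.5]
```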