2017-04-10 52 views

Counting null values in a pandas groupby

2
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan]})

Output:

 A  B  C 
0 foo one NaN 
1 bar one bla2 
2 foo two NaN 
3 bar three bla3 
4 foo two NaN 
5 bar two NaN 
6 foo one NaN 
7 foo three NaN 

I would like to use groupby to count the number of NaN values for the different combinations of foo.

Expected output (EDIT):

 A  B  C D 
0 foo one NaN 2 
1 bar one bla2 0 
2 foo two NaN 2 
3 bar three bla3 0 
4 foo two NaN 2 
5 bar two NaN 1 
6 foo one NaN 2 
7 foo three NaN 1 

Currently I am trying this:

df['count']=df.groupby(['A'])['B'].isnull().transform('sum') 

But this does not work...

Thanks

Shouldn't your output be one: 2, two: 2 and three: 1? – tagoma

Answers

3

I think you need groupby with sum of the NaN values:

df2 = df.C.isnull().groupby([df['A'],df['B']]).sum().astype(int).reset_index(name='count') 
print (df2) 
    A  B count 
0 bar one  0 
1 bar three  0 
2 bar two  1 
3 foo one  2 
4 foo three  1 
5 foo two  2 
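
As a minimal, self-contained sketch of why this works: `isnull()` turns column C into a boolean mask, and summing booleans per group counts the True entries, i.e. the NaN values. The helper name `count_nan_per_group` is hypothetical and is introduced here only for illustration; it is not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan]})


def count_nan_per_group(frame, keys, col):
    """Hypothetical helper: count NaN values of `col` per combination of `keys`."""
    # True where the value is missing, False otherwise
    mask = frame[col].isnull()
    # Summing a boolean mask within each group counts the True entries
    counts = mask.groupby([frame[k] for k in keys]).sum().astype(int)
    return counts.reset_index(name='count')


counts = count_nan_per_group(df, ['A', 'B'], 'C')
print(counts)
```

The result has one row per (A, B) combination, sorted by the group keys, which is the same shape as `df2` above.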

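If you need to filter first, add boolean indexing:

# keep the original df intact by assigning the filtered rows to a new name
df1 = df[df['A'] == 'foo'] 
df2 = df1.C.isnull().groupby([df1['A'],df1['B']]).sum().astype(int) 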
print (df2) 
A B  
foo one  2 
    three 1 
    two  2 

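Or simpler (this counts rows per value of B, which equals the NaN count here only because every foo row has NaN in C):

df1 = df[df['A'] == 'foo'] 
df2 = df1['B'].value_counts() 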
print (df2) 
one  2 
two  2 
three 1 
Name: B, dtype: int64 
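
For rows where C is not entirely NaN, the row count and the NaN count diverge. The sketch below uses the bar rows of the same data purely as an illustration; the variable names `bar`, `row_counts` and `nan_counts` are assumptions and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan]})

# Rows where A is 'bar': only one of the three has NaN in C
bar = df[df['A'] == 'bar']

# value_counts counts rows per value of B, regardless of what C contains
row_counts = bar['B'].value_counts().sort_index()

# Summing the isnull mask per value of B counts only the NaN entries of C
nan_counts = bar['C'].isnull().groupby(bar['B']).sum().astype(int)

print(row_counts)
print(nan_counts)
```

So `value_counts` is a shortcut only when the filtered rows are known to contain nothing but NaN in the column of interest.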

EDIT: The solution is very similar, just add transform:

df['D'] = df.C.isnull().groupby([df['A'],df['B']]).transform('sum').astype(int) 
print (df) 
    A  B  C D 
0 foo one NaN 2 
1 bar one bla2 0 
2 foo two NaN 2 
3 bar three bla3 0 
4 foo two NaN 2 
5 bar two NaN 1 
6 foo one NaN 2 
7 foo three NaN 1 

Similar solution:

df['D'] = df.C.isnull() 
df['D'] = df.groupby(['A','B'])['D'].transform('sum').astype(int) 
print (df) 
    A  B  C D 
0 foo one NaN 2 
1 bar one bla2 0 
2 foo two NaN 2 
3 bar three bla3 0 
4 foo two NaN 2 
5 bar two NaN 1 
6 foo one NaN 2 
7 foo three NaN 1 
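
Another way to obtain the same column, offered here as a sketch and not as part of the original answer: within each group, `size` counts all rows while `count` skips NaN, so their difference is the number of NaN values. The column name D follows the question; the variable name `grouped` is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': [np.nan, 'bla2', np.nan, 'bla3', np.nan, np.nan, np.nan, np.nan]})

grouped = df.groupby(['A', 'B'])['C']

# size: all rows in the group; count: only the non-NaN rows
# transform broadcasts each group's value back onto its original rows
df['D'] = grouped.transform('size') - grouped.transform('count')
print(df)
```

Because both transforms return a Series aligned with the original index, the result can be assigned directly as a new column without a merge.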

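This answers my initial question perfectly, but I just realized I had the 'expected answer' wrong. Sorry about that. I need the result added to the initial dataframe. – Stefan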

Sure, please check the edited answer. – jezrael

1
df[df.A == 'foo'].groupby('B').agg({'C': lambda x: x.isnull().sum()}) 

Returns:

=>  C 
B  
one 2 
three 1 
two 2