pandas: remove a whole group from the data when a value in the group meets a condition

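My data is split into groups, and each group has values. I want to check whether any value in a group is below 8. If that condition is met, the entire group should be removed from the dataset.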

Note that the values I am referring to are in a different column from the grouping column.

Example input:

Groups Count 
    1  7 
    1  11 
    1  9 
    2  12 
    2  15 
    2  21

Output:

Groups Count 
    2  12 
    2  15 
    2  21

Source

2016-01-09 Jean-Michel Laurence Nairac

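According to your problem statement, a group should be discarded as soon as at least one value in it is below 8. An equivalent statement is that a group should be discarded whenever its minimum value is below 8.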
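The equivalence can be checked directly on the example data. This is only a minimal sketch: the variable names `any_below` and `min_below` are illustrative and not from the original answer.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({"Groups": [1, 1, 1, 2, 2, 2],
                   "Count": [7, 11, 9, 12, 15, 21]})

# Per group: is at least one Count below 8?
any_below = (df["Count"] < 8).groupby(df["Groups"]).any()

# Per group: is the smallest Count below 8?
min_below = df.groupby("Groups")["Count"].min() < 8

# Both tests flag the same groups
print(any_below.equals(min_below))
```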

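Using the filter function, the actual code can be reduced to a single line (see "Filtration" in the pandas groupby documentation). You can use the following code: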

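import pandas as pd

df = pd.DataFrame({'Groups': [1, 1, 1, 2, 2, 2], 'Count': [7, 11, 9, 12, 15, 21]})
# keep a group only if its smallest Count is at least 8 (no value below 8)
dfnew = df.groupby('Groups').filter(lambda x: x['Count'].min() >= 8)
dfnew = dfnew.reset_index(drop=True)  # reset index
dfnew = dfnew[['Groups', 'Count']]  # rearrange the column sequence
print(dfnew)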

Output: 
   Groups  Count
0       2     12
1       2     15
2       2     21
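As a possible alternative to the lambda, the same selection (keep a group only when its minimum `Count` is at least 8, i.e. it has no value below 8) can be sketched with `transform`, which broadcasts each group's minimum back onto its rows. This variant is not part of the original answer, and the names `group_min` and `result` are illustrative.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({"Groups": [1, 1, 1, 2, 2, 2],
                   "Count": [7, 11, 9, 12, 15, 21]})

# One value per row: the minimum Count of the group that row belongs to
group_min = df.groupby("Groups")["Count"].transform("min")

# Keep only rows whose group has no value below 8
result = df[group_min >= 8].reset_index(drop=True)
print(result)
```

Because the mask is built from a built-in aggregation rather than a Python function called once per group, this form may be faster on data with many groups, though that depends on the data.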

Source

2016-01-11 06:09:41 2342G456DI8

This should be marked as the correct answer to the OP's question – Daniel

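Ah.. I messed up my comment. This should be marked as the correct answer to the OP's question, because it is the most elegant way, using pandas' built-in 'groupby' function. It is very efficient, readable, and a one-liner. +1 – Daniel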

You can use isin, loc and unique, and select the subset with an inverted mask. Finally you can reset_index:

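import pandas as pd

df = pd.DataFrame({'Groups': [1, 1, 1, 2, 2, 2], 'Count': [7, 11, 9, 12, 15, 21]})
print(df)

   Groups  Count
0       1      7
1       1     11
2       1      9
3       2     12
4       2     15
5       2     21

# groups that contain at least one Count below 8
print(df.loc[df['Count'] < 8, 'Groups'].unique())
[1]

# inverted mask: True for rows whose group is not in that list
print(~df['Groups'].isin(df.loc[df['Count'] < 8, 'Groups'].unique()))

0    False
1    False
2    False
3     True
4     True
5     True
Name: Groups, dtype: bool

df1 = df[~df['Groups'].isin(df.loc[df['Count'] < 8, 'Groups'].unique())]
print(df1.reset_index(drop=True))

   Groups  Count
0       2     12
1       2     15
2       2     21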
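The isin / loc / unique steps can be wrapped in a small function for reuse. The helper name `drop_groups_with_low_values` and its parameters are hypothetical, and the extra group 3 (with a value of exactly 8) is added here only to illustrate the boundary case; neither appears in the original answer.

```python
import pandas as pd


def drop_groups_with_low_values(df, group_col, value_col, threshold):
    """Hypothetical helper: drop every group that has a value below threshold."""
    # Labels of the groups that contain at least one too-small value
    bad_groups = df.loc[df[value_col] < threshold, group_col].unique()
    # Keep only rows whose group label is not among them
    return df[~df[group_col].isin(bad_groups)].reset_index(drop=True)


# Question data plus an illustrative group 3 whose only value is exactly 8
df = pd.DataFrame({"Groups": [1, 1, 1, 2, 2, 2, 3],
                   "Count": [7, 11, 9, 12, 15, 21, 8]})

cleaned = drop_groups_with_low_values(df, "Groups", "Count", 8)
print(cleaned)
```

Group 3 is kept because 8 is not below 8, which matches the condition stated in the question.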

Source

2016-01-09 07:22:35 jezrael
