Select rows from a DataFrame where any column is greater than 0.001

I usually write

df[ (df.Col1>0.0001) | (df.Col2>0.0001) | (df.Col3>0.0001) ].index

to get the index labels where the condition holds. Now suppose I have many columns, and I have a tuple

cols = ('Col1', 'Col2', 'Col3')

where cols is a subset of the columns of df.

Is there a more succinct way to write this?

2014-11-21 MMM

You can combine pandas.DataFrame.any with list-based column indexing to create a boolean mask for selecting rows. Note that cols must be a list, not a tuple.

import pandas as pd 
import numpy as np 

N = 10 
M = 0.8 

df = pd.DataFrame(data={'Col1':np.random.random(N), 'Col2':np.random.random(N), 
         'Col3':np.random.random(N), 'Col4':np.random.random(N)}) 

cols = ['Col1', 'Col2', 'Col3'] 

mask = (df[cols] > M).any(axis=1) 

print(df[mask].index)
# Example output (varies between runs because the data is random):
# Int64Index([0, 1, 4, 5, 6, 7], dtype='int64')
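Since the question starts from a tuple, here is a minimal sketch of the same mask approach with the tuple converted via list(cols). The frame contents and the name threshold are illustrative assumptions, chosen so the result is reproducible; only cols and the 0.0001 cutoff come from the question.

```python
import pandas as pd

# Small fixed frame (illustrative values, not from the original post).
df = pd.DataFrame({
    'Col1': [0.0, 0.5, 0.0, 0.0],
    'Col2': [0.0, 0.0, 0.0, 0.2],
    'Col3': [0.0, 0.0, 0.0, 0.0],
    'Col4': [0.9, 0.9, 0.9, 0.9],  # not in cols, so it does not affect the mask
})

cols = ('Col1', 'Col2', 'Col3')  # a tuple, as in the question
threshold = 0.0001               # hypothetical name for the cutoff

# list(cols) turns the tuple into a list of column labels, which
# df[...] treats as "select these columns" rather than as a single key.
mask = (df[list(cols)] > threshold).any(axis=1)

labels = df[mask].index
print(list(labels))  # [1, 3]
```

Only rows 1 and 3 have a value above the cutoff in one of the selected columns; Col4 is ignored because it is not in cols.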

2014-11-21 14:56:56 Ffisegydd

You can use any or all together with a list comprehension:

import pandas as pd
import numpy as np

In [148]: df = pd.DataFrame(np.random.randn(25).reshape(5,5), columns=list('abcde'))
In [149]: df
Out[149]:
          a         b         c         d         e
0 -1.484887  2.204350  0.498393  0.003432  0.792417
1 -0.595458  0.850336  0.286450  0.201722  1.699081
2 -0.437681 -0.907156  0.514573 -1.162837 -0.334180
3 -0.160818 -0.384901  0.076484  0.599763  1.923360
4  0.351161  0.519289  1.727934 -1.232707  0.007984

For example, when you want all columns in a given row to be greater than -1:

In [153]: df.loc[[row for row in df.index if all(df.loc[row] > -1)], :]
Out[153]:
          a         b         c         d         e
1 -0.595458  0.850336  0.286450  0.201722  1.699081
3 -0.160818 -0.384901  0.076484  0.599763  1.923360

Or, for example, when you want any column in a given row to be greater than -1:

In [154]: df.loc[[row for row in df.index if any(df.loc[row] > -1)], :]
Out[154]:
          a         b         c         d         e
0 -1.484887  2.204350  0.498393  0.003432  0.792417
1 -0.595458  0.850336  0.286450  0.201722  1.699081
2 -0.437681 -0.907156  0.514573 -1.162837 -0.334180
3 -0.160818 -0.384901  0.076484  0.599763  1.923360
4  0.351161  0.519289  1.727934 -1.232707  0.007984
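The row-by-row comprehension can also be written as a vectorized mask with DataFrame.all and DataFrame.any along axis=1. The sketch below is one possible equivalent, not the answerer's own code; the small frame is an illustrative assumption so that the two results differ visibly.

```python
import pandas as pd

# Illustrative data (not from the original post).
df = pd.DataFrame({
    'a': [-1.5, -0.6, -0.4, -2.0],
    'b': [ 2.2,  0.9, -0.9, -3.0],
    'c': [ 0.5,  0.3, -1.2, -1.5],
})

# Element-wise comparison yields a boolean frame; all/any reduce it per row.
all_rows = df[(df > -1).all(axis=1)]  # every column greater than -1
any_rows = df[(df > -1).any(axis=1)]  # at least one column greater than -1

print(list(all_rows.index))  # [1]
print(list(any_rows.index))  # [0, 1, 2]
```

This avoids a Python-level loop over the index, which tends to matter once the frame has many rows.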

2014-11-21 15:04:44 dmb
