Dynamic boolean mask for rows with equal values in certain columns in pandas

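I have a pandas DataFrame in which some rows have the same value across several columns. I want to create a boolean mask that is True for a row when all of those columns hold the same value in that row. I want to pass the list of columns to check dynamically. For example: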

A | B | C | Mask 
1 | 1 | 1 | True 
2 | 2 | 3 | False 
4 | 4 | 4 | True

The mask should be returned by my same_values function, which is passed the DataFrame and a list of columns. For example:

same_values(data, ['A', 'B', 'C'])

Without passing the columns dynamically, I can do it like this:

data[(data['A']==data['B'])&(data['A']==data['C'])]

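I could loop over all the columns dynamically and compare each one with the first column, but that seems inefficient. Does anyone have a better solution?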
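For reference, the loop-based approach described above might look like the following sketch. The helper name `same_values_loop` is hypothetical, and the sample frame simply mirrors the table in the question.

```python
from functools import reduce
import operator

import pandas as pd


def same_values_loop(data, columns):
    """Hypothetical helper: AND together `first == other` for every remaining column."""
    first = data[columns[0]]
    comparisons = [first == data[col] for col in columns[1:]]
    # Start from an all-True Series so a single-column list yields all True.
    all_true = pd.Series(True, index=data.index)
    return reduce(operator.and_, comparisons, all_true)


data = pd.DataFrame({'A': [1, 2, 4], 'B': [1, 2, 4], 'C': [1, 3, 4]})
mask = same_values_loop(data, ['A', 'B', 'C'])
print(mask.tolist())  # [True, False, True]
```

Each comparison here is already vectorized over rows, so the Python-level loop runs once per column, not once per row.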

Source

2016-06-21 Jan van der Vegt

Concatenate columns A to C and check whether the result % 111 == 0? ☺

You can compare the whole df with its first column using eq, then reduce each row with all:

print(df.eq(df.iloc[:,0], axis=0))
      A     B      C
0  True  True   True
1  True  True  False
2  True  True   True

print(df.eq(df.iloc[:,0], axis=0).all(axis=1))
0     True
1    False
2     True
dtype: bool

If you only need to compare a few columns, use a subset:

L = ['A','B','C']
# Compare against the first column of the subset, so the check stays inside L
print(df[L].eq(df[L[0]], axis=0).all(axis=1))
0     True
1    False
2     True
dtype: bool
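Wrapped into the `same_values` function the question asks for, the eq/all idea might look like this minimal sketch. The function body is an assumption built from the snippets above, not code from the original answer, and the sample frame mirrors the table in the question.

```python
import pandas as pd


def same_values(data, columns):
    """Return a boolean Series: True where every listed column holds the same value."""
    subset = data[columns]
    # Compare each selected column with the first selected column, row by row.
    return subset.eq(subset.iloc[:, 0], axis=0).all(axis=1)


df = pd.DataFrame({'A': [1, 2, 4], 'B': [1, 2, 4], 'C': [1, 3, 4]})
mask = same_values(df, ['A', 'B', 'C'])
print(mask.tolist())  # [True, False, True]
print(df[mask])       # keeps rows 0 and 2
```

Note that NaN never compares equal to NaN, so a row containing missing values in the checked columns comes out as False with this approach.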

Source

2016-06-21 09:26:10 jezrael

After discussing this with a colleague, he pointed me to this post:

Get rows that have the same value across its columns in pandas

I tried the approaches mentioned here and in the linked post; here are the timings:

%timeit test1 = test[test.apply(pd.Series.nunique, axis=1)==1] 
1.23 s per loop 

%timeit test2 = test[test.eq(test['A'], axis='index').all(1)] 
3.47 ms per loop 

%timeit test3 = test[test.apply(lambda x: x.duplicated(keep=False).all(), axis=1)] 
2.3 s per loop 

%timeit test4 = test[test.apply(lambda x: x == (x.iloc[0]).all(), axis=1)] 
4.5 s per loop
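Before comparing speed, it can help to confirm that the expressions select the same rows. The sketch below does that on a small hypothetical frame, since the original `test` frame is not shown. In the fourth expression as posted, `.all()` is applied to the scalar `x.iloc[0]` and not to the comparison; the sketch assumes `(x == x.iloc[0]).all()` was intended.

```python
import pandas as pd

# Hypothetical small frame; the original `test` frame is not shown in the post.
test = pd.DataFrame({'A': [1, 2, 4, 5], 'B': [1, 2, 4, 6], 'C': [1, 3, 4, 5]})

# 1. Count distinct values per row.
m1 = test.apply(pd.Series.nunique, axis=1) == 1
# 2. Vectorized comparison against column A.
m2 = test.eq(test['A'], axis='index').all(axis=1)
# 3. Every value in the row has a duplicate. With three columns this implies
#    all values are equal; with four or more it would not (e.g. 1, 1, 2, 2).
m3 = test.apply(lambda x: x.duplicated(keep=False).all(), axis=1)
# 4. Parentheses placed so that .all() reduces the comparison, not the scalar.
m4 = test.apply(lambda x: (x == x.iloc[0]).all(), axis=1)

for m in (m1, m2, m3, m4):
    print(m.tolist())  # [True, False, True, False] each time
```

Only the second expression avoids calling a Python function once per row, which is consistent with it being the fastest by a wide margin in the timings above.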

Source

2016-06-21 09:30:23

DSM's solution looks very elegant – MaxU

You can try this:

import pandas as pd

data = pd.DataFrame({'a': [1, 2, 4], 'b': [1, 2, 4], 'c': [1, 3, 4]})
# True for rows whose values collapse to a single distinct element
data.apply(lambda x: len(set(x)) == 1, axis=1)

Source

2016-06-21 09:31:35 Shravan

How bad is this:

list_a = [1, 2, 4] 
list_b = [1, 2, 4] 
list_c = [1, 3, 4] 

longshot = [True if not x % 111 else False for x in list(map(lambda x: int(str(x[0])+str(x[1])+str(x[2])), list(zip(list_a, list_b, list_c))))] 
print(longshot) # [True, False, True]
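As a caveat, the concatenation trick is only safe when every value is a single digit. The sketch below isolates the check in a hypothetical helper, `concat_check`, and shows a case where differing values still pass.

```python
def concat_check(a, b, c):
    """Hypothetical helper: concatenate three values and test divisibility by 111."""
    return int(str(a) + str(b) + str(c)) % 111 == 0


print(concat_check(1, 1, 1))   # True
print(concat_check(2, 2, 3))   # False
print(concat_check(1, 11, 0))  # True, although the values differ: 1110 == 111 * 10
```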

Source

2016-06-21 09:36:39
