Python pandas DataFrame: updating values efficiently

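I am working with a pandas DataFrame (N×N), and I want to go through every row and every element to check whether the element is greater than the mean of its row. If it is, I want to change the element's value to 1.

I compute the mean with: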

mean_value = df.ix[elementid].mean(axis=0)

But iterating over every element in a nested loop and checking whether it is >= mean_value is really slow.
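The question does not show its loop, so the following is only a guess at the kind of nested loop being described. The frame `df` and the name `slow` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the question does not include its DataFrame.
df = pd.DataFrame({"a": [0, 1], "b": [1, 0], "c": [2, 1]})

slow = df.copy()
for i in range(slow.shape[0]):
    # Take the row mean before any element of this row is changed.
    row_mean = slow.iloc[i].mean()
    for j in range(slow.shape[1]):
        # One Python-level comparison and assignment per cell.
        if slow.iat[i, j] >= row_mean:
            slow.iat[i, j] = 1

print(slow)
```

Each cell costs a separate Python-level lookup and assignment, which is the likely reason this pattern feels slow on a large frame.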


2016-04-07 J-H

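You are visiting every element; what makes you think you can do better than O(nm)? – Natecat

I was just hoping pandas had a function that works row by row and sets the value to 1 wherever an element is greater than the mean. –

Such a function does the same work you would do by hand. You are changing every element of the array, so you have to visit every element of the array. You can't do it any faster. – Natecat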

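You can first compute the mean of each row with mean, then compare with ge, and finally use mask to put 1 wherever the comparison is True: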

print(df)
   a  b  c
0  0  1  2
1  0  1  2
2  1  1  2
3  1  0  1
4  1  1  2
5  0  0  1

# mean of each row
mean_value = df.mean(axis=1)
print(mean_value)
0    1.000000
1    1.000000
2    1.333333
3    0.666667
4    1.333333
5    0.333333

# compare each row with its own mean (axis=0 aligns on the row index)
mask = df.ge(mean_value, axis=0)
print(mask)
       a      b     c
0  False   True  True
1  False   True  True
2  False  False  True
3   True  False  True
4  False  False  True
5  False  False  True

# replace with 1 where mask is True
print(df.mask(mask, 1))
   a  b  c
0  0  1  1
1  0  1  1
2  1  1  1
3  1  0  1
4  1  1  1
5  0  0  1
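For readers on Python 3, the same three steps can be collected into one self-contained sketch. The function name `flag_ge_row_mean` is hypothetical and does not come from the answer. The sample frame is rebuilt from the printed output above.

```python
import pandas as pd

def flag_ge_row_mean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with 1 wherever an element is >= its row mean."""
    row_means = df.mean(axis=1)             # one mean per row
    at_or_above = df.ge(row_means, axis=0)  # axis=0 matches means to rows by index
    return df.mask(at_or_above, 1)          # put 1 where the condition is True

# Sample data reconstructed from the answer's printed DataFrame.
df = pd.DataFrame({
    "a": [0, 0, 1, 1, 1, 0],
    "b": [1, 1, 1, 0, 1, 0],
    "c": [2, 2, 2, 1, 2, 1],
})

result = flag_ge_row_mean(df)
print(result)
```

`mask` returns a new DataFrame here, so `df` itself is left unchanged unless the result is assigned back to it.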


2016-04-07 18:03:49 jezrael

Now that's some 'mask' and 'ge'! – Zero

Very elegant solution, +1 – MaxU

Looks good, except for the final result. Don't you just want 'df.mask(df.gt(df.mean(axis=1)), 1)'? – Alexander
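The difference between `ge` and `gt` only shows up where an element equals its row mean, and on the answer's sample data both give the same result. The sketch below uses a small made-up frame (`df2`, not from the thread) to make the difference visible. It passes `axis=0` to `gt`, because the default for the comparison methods matches a Series against column labels rather than row labels.

```python
import pandas as pd

# Hypothetical data chosen so that the first row equals its own mean everywhere.
df2 = pd.DataFrame({"a": [2, 0], "b": [2, 1], "c": [2, 2]})
row_means = df2.mean(axis=1)  # 2.0 for the first row, 1.0 for the second

# ">=": elements equal to the row mean are also replaced by 1.
with_ge = df2.mask(df2.ge(row_means, axis=0), 1)

# ">": only elements strictly above the row mean are replaced by 1.
with_gt = df2.mask(df2.gt(row_means, axis=0), 1)

print(with_ge)
print(with_gt)
```

Which variant is right depends on whether the question's "greater than" or its ">=" describes the intended rule.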
