Replacing values in a column based on values in two other columns using pandas

I tried to find a solution to my problem but didn't turn up much. Please let me know if this has already been answered elsewhere.

I have a dataframe with 4 columns, like this:

'A'     'B'  'C'     'D'
cheese  5    grapes  7
grapes  7    cheese  8
steak   1    eggs    21
eggs    2    steak   1

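The entries in 'C' and 'D' must match the values in 'A' and 'B', though not row by row; for example, if 'cheese' has '5' in 'B', then 'cheese' cannot have '8' in 'D'. Where they do not match, the 'C' and 'D' values must be reset to defaults. Here, the 'cheese' entry should be corrected to C: default and D: 0. The same goes for eggs. Grapes and steak are fine, though.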

So the output should look like this:

'A'     'B'  'C'      'D'
cheese  5    grapes   7
grapes  7    default  0
steak   1    default  0
eggs    2    steak    1

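My idea was to convert 'A' and 'B' into lists of unique values and then replace 'C' and 'D' based on the values in those lists. I tried every conditional df.replace() trick I could find on Stack Overflow, but none of them worked.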

Thank you very much for any help you can provide.

Source

2017-05-13 crimins

Is it possible for column 'C' to have two rows with 'steak'? If so, what should the code do in that case? –

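@ViníciusAguiar: Yes, column 'C' can have multiple rows with any entry. Grapes, steak, eggs, etc. can all appear multiple times in 'C', possibly with several different corresponding 'D' values. The data is unpredictably dirty. The 'A'/'B' pairs are unique. The code should find every 'C'/'D' pair that does not match an 'A'/'B' pair and correct it to default/0. – crimins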
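The requirement stated in the comment above (reset every 'C'/'D' pair that is not one of the unique 'A'/'B' pairs) can be sketched with a set of tuples and a boolean mask. This is only an illustration under assumptions: the fifth row ('bread', plus a second 'steak' in 'C' with a wrong 'D') is hypothetical data added to show the duplicate case, and the names `valid_pairs` and `mismatch` are made up for this example. Note that under this strict reading a name in 'C' that never appears in 'A' is also reset.

```python
import pandas as pd

# Hypothetical sample: the question's data plus one extra row that repeats
# 'steak' in C with a D value that does not match steak's B value.
df = pd.DataFrame({
    'A': ['cheese', 'grapes', 'steak', 'eggs', 'bread'],
    'B': [5, 7, 1, 2, 3],
    'C': ['grapes', 'cheese', 'eggs', 'steak', 'steak'],
    'D': [7, 8, 21, 1, 9],
})

# Every valid (name, value) pair, taken from the unique A/B pairs.
valid_pairs = set(zip(df['A'], df['B']))

# True where the row's (C, D) pair is not one of the valid pairs.
mismatch = pd.Series(
    [pair not in valid_pairs for pair in zip(df['C'], df['D'])],
    index=df.index,
)

# Reset both columns on the mismatching rows.
df.loc[mismatch, 'C'] = 'default'
df.loc[mismatch, 'D'] = 0
print(df)
```

Because the check is done per row against the full set of pairs, repeated names in 'C' need no special handling: each occurrence is judged on its own 'D' value.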

Setup

import pandas as pd

# Input data from the question, before any correction.
df = pd.DataFrame({'A': {0: 'cheese', 1: 'grapes', 2: 'steak', 3: 'eggs'},
                   'B': {0: 5, 1: 7, 2: 1, 3: 2},
                   'C': {0: 'grapes', 1: 'cheese', 2: 'eggs', 3: 'steak'},
                   'D': {0: 7, 1: 8, 2: 21, 3: 1}})

df
Out[1262]:
        A  B       C   D
0  cheese  5  grapes   7
1  grapes  7  cheese   8
2   steak  1    eggs  21
3    eggs  2   steak   1

Solution

# Keep C when it is not a name listed in A, or when D equals the B value
# recorded for that name in A; otherwise set C to 'default'.
df['C'] = df.apply(
    lambda x: x['C']
    if (x['C'] not in df['A'].tolist())
    or (x['D'] == df.loc[df['A'] == x['C'], 'B'].iloc[0])
    else 'default',
    axis=1,
)
# Set D to 0 wherever C is 'default'.
df.loc[df['C'] == 'default', 'D'] = 0

df
Out[1259]:
        A  B        C  D
0  cheese  5   grapes  7
1  grapes  7  default  0
2   steak  1  default  0
3    eggs  2    steak  1
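For larger frames, the row-wise apply() above could probably be replaced by a vectorized lookup. The following is a minimal sketch, not the answerer's code: it assumes the names in 'A' are unique (as the asker states), and the names `expected` and `mismatch` are introduced here for illustration. It is meant to follow the same rule as the apply() version, including leaving 'C' untouched when its name never appears in 'A'.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['cheese', 'grapes', 'steak', 'eggs'],
    'B': [5, 7, 1, 2],
    'C': ['grapes', 'cheese', 'eggs', 'steak'],
    'D': [7, 8, 21, 1],
})

# For each name in C, look up the B value recorded for that name in A.
# Names in C that never appear in A map to NaN.
expected = df['C'].map(df.set_index('A')['B'])

# Reset only when C is a known name and its D value disagrees
# with the recorded B value.
mismatch = expected.notna() & (expected != df['D'])

df.loc[mismatch, 'C'] = 'default'
# As in the answer, D becomes 0 wherever C is 'default'.
df.loc[df['C'] == 'default', 'D'] = 0
print(df)
```

Series.map with a Series argument looks values up by index label, which is why 'A' is moved into the index first; this step relies on the 'A' values being unique.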

Source

2017-05-13 23:22:37 Allen
