Pandas: compute a logical OR based on multiple columns

This is the DataFrame I am working with:

{'Recipe@123': ['AAA', nan, nan],
'Recipe@234': [nan, 'BBB', nan],
'Recipe@456': [nan, nan, 'CCC'],
'[email protected]':[1,nan,nan], 
'[email protected]':[nan,2,nan], 
'[email protected]':[nan,nan, 3] 
}

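I want to create a Recipe column that holds, for each row, the recipe value that is not NaN. For example, the value for the first row would be AAA, for the second row BBB, and so on. The DataFrame has other columns as well, but the Recipe column should only take the 3 recipe columns above into account.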

2015-12-26 Felix

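So there is always exactly one non-NaN value in each row, and you want to use that value in the new column? – itzy

That is correct. Thank you – Felix

You could use 'df.max()' in this example, but you are probably looking for a more general solution. – itzy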

You can use apply with axis=1 to call the any method on each row, provided each row has only one valid value and all the others are NaN (using @Stefan's example):

In [197]: df
Out[197]:
  Recipe@123 Recipe@234 Recipe@456  other_col
0        AAA        NaN        NaN          1
1        NaN        BBB        NaN          2
2        NaN        NaN        CCC          3

In [199]: df['new'] = df[['Recipe@123', 'Recipe@234', 'Recipe@456']].apply(lambda x: x.any(), axis=1)

In [200]: df
Out[200]:
  Recipe@123 Recipe@234 Recipe@456  other_col  new
0        AAA        NaN        NaN          1  AAA
1        NaN        BBB        NaN          2  BBB
2        NaN        NaN        CCC          3  CCC
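
As a hedged aside, not part of the answer above: what any() returns for string (object) data may depend on the pandas version, so here is a minimal sketch of one alternative. It back-fills along each row and then takes the first column, which assumes at most one non-NaN value per row. The column names 'Recipe@123', 'Recipe@234', 'Recipe@456' are taken from the comments in this thread, and the result name 'Recipe' is the one the question asks for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Recipe@123': ['AAA', np.nan, np.nan],
    'Recipe@234': [np.nan, 'BBB', np.nan],
    'Recipe@456': [np.nan, np.nan, 'CCC'],
    'other_col': [1, 2, 3],
})

# Only these three columns should feed the new column.
cols = ['Recipe@123', 'Recipe@234', 'Recipe@456']

# bfill(axis=1) copies each row's valid value leftwards into the NaN cells,
# so after the fill the first column holds the row's non-NaN value.
df['Recipe'] = df[cols].bfill(axis=1).iloc[:, 0]

print(df)
```

Because the fill is done on a selection of the frame, the three source columns and other_col are left unchanged.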

Edit

This looks a bit like a hack, but I think it should work (calling min if the dtype is numeric, otherwise any):

df['new'] = df[['[email protected]', '[email protected]', '[email protected]']].apply(lambda x: x.min() if x.dtype.kind in 'biufc' else x.any(), axis=1) 

In [551]: df 
Out[551]: 
    [email protected] [email protected] [email protected] [email protected] [email protected] \ 
0    1   NaN   NaN  AAA  NaN 
1   NaN    2   NaN  NaN  BBB 
2   NaN   NaN    3  NaN  NaN 

    [email protected] new 
0  NaN 1 
1  NaN 2 
2  CCC 3

Note: dtype.kind
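
To make the 'biufc' check concrete, here is a small sketch (not part of the original answer) of what dtype.kind reports. The kind codes come from NumPy: b is boolean, i signed integer, u unsigned integer, f float and c complex, while strings stored as Python objects report O. The column names 'qty_a', 'qty_b' and 'qty_c' are hypothetical stand-ins for the numeric columns. With one valid number per row, a row-wise min, which skips NaN by default, picks out that number.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric columns: one valid number per row, NaN elsewhere.
nums = pd.DataFrame({
    'qty_a': [1, np.nan, np.nan],
    'qty_b': [np.nan, 2, np.nan],
    'qty_c': [np.nan, np.nan, 3],
})
# A string column stored with object dtype.
text = pd.Series(['AAA', np.nan], dtype=object)

print(nums['qty_a'].dtype.kind)  # 'f': NaN forces the column to float
print(text.dtype.kind)           # 'O': Python objects

# min(axis=1) ignores NaN by default, so each row yields its only number.
row_value = nums.min(axis=1)
print(row_value.tolist())
```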

2015-12-26 18:27:22

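Thank you, Anton. Works perfectly. – Felix

Anton, this solution works well for strings. It is not reliable for numeric values. What needs to change in your solution to extract the numeric values from the row? Thanks – Felix

Could you give a proper example? You could convert everything to strings with 'astype(str)', but that is not a good approach... –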

A simple solution would be:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Recipe@123': ['AAA', np.nan, np.nan],
                   'Recipe@234': [np.nan, 'BBB', np.nan],
                   'Recipe@456': [np.nan, np.nan, 'CCC'],
                   'other_col': [1, 2, 3]})

  Recipe@123 Recipe@234 Recipe@456  other_col
0        AAA        NaN        NaN          1
1        NaN        BBB        NaN          2
2        NaN        NaN        CCC          3

Just iterate over the rows and use .dropna to get rid of the missing values. You can then write a new DataFrame column like this:

for i, data in df.iterrows():
    # Keep only the recipe columns, drop the NaNs and take the value that is left.
    df.loc[i, 'Recipe'] = data[['Recipe@123', 'Recipe@234', 'Recipe@456']].dropna().values[0]

  Recipe@123 Recipe@234 Recipe@456  other_col Recipe
0        AAA        NaN        NaN          1    AAA
1        NaN        BBB        NaN          2    BBB
2        NaN        NaN        CCC          3    CCC
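
The same dropna idea can be written without an explicit loop. The following is a minimal sketch under stated assumptions, not the answer's own code: the helper name first_valid is hypothetical, and the column names are the ones mentioned in the comments of this thread. Unlike values[0], the helper returns NaN instead of raising an error when a row has no valid value at all.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Recipe@123': ['AAA', np.nan, np.nan],
    'Recipe@234': [np.nan, 'BBB', np.nan],
    'Recipe@456': [np.nan, np.nan, 'CCC'],
    'other_col': [1, 2, 3],
})
cols = ['Recipe@123', 'Recipe@234', 'Recipe@456']


def first_valid(row):
    """Return the first non-NaN entry of a row, or NaN if the row is empty."""
    valid = row.dropna()
    return valid.iloc[0] if len(valid) else np.nan


# apply(..., axis=1) calls first_valid once per row of the selected columns.
df['Recipe'] = df[cols].apply(first_valid, axis=1)

print(df)
```

The new column is added alongside the existing ones, so the frame still contains the three source columns as well as Recipe.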

2015-12-26 01:53:13 Stefan

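Thanks Stefan, but this does not seem to work. I want to get the Recipe column as a DF column – Felix

What error do you get? It works for me with pandas 0.17.1. – Stefan

The DF has no Recipe column. The DataFrame still needs to contain 'Recipe@123', 'Recipe@234' and 'Recipe@456' – Felix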
