2017-09-17

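Conditionally change the values of a DataFrame column based on the values of another column

I have the following pandas DataFrame d1: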

+----------+-------+---------+--------------+ 
| Item Num | Cost | Revenue | Rev/Cost | 
+----------+-------+---------+--------------+ 
|  1 | 45.76 | 345.67 | 7.5539772727 | 
|  2 | 55.78 | 456.92 | 8.1914664754 | 
|  3 | 34.68 |  0 |   0 | 
|  4 | 79.85 |  0 |   0 | 
+----------+-------+---------+--------------+ 

For rows where Rev/Cost equals 0, I want the Rev/Cost value to be set to that row's Cost multiplied by -1.

The desired output would therefore be:

+----------+-------+---------+--------------+ 
| Item Num | Cost | Revenue | Rev/Cost | 
+----------+-------+---------+--------------+ 
|  1 | 45.76 | 345.67 | 7.5539772727 | 
|  2 | 55.78 | 456.92 | 8.1914664754 | 
|  3 | 34.68 |  0 |  -34.68 | 
|  4 | 79.85 |  0 |  -79.85 | 
+----------+-------+---------+--------------+ 

What I have so far is:

d1['Rev/Cost'] = d1['Rev/Cost'].apply(lambda x: x if x > 0 else d1['Cost']) 

This simply overwrites the expected range with a single value and raises the following warning:

A value is trying to be set on a copy of a slice from a DataFrame. 
Try using .loc[row_indexer,col_indexer] = value instead 

Answers

Create a boolean mask, then use loc to assign to the selected sub-slice.

mask = df['Rev/Cost'] == 0 
df.loc[mask, 'Rev/Cost'] = df.loc[mask, 'Cost'].mul(-1) 
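
For reference, here is a minimal, self-contained sketch of the mask-and-loc approach run against the question's data. The DataFrame is rebuilt by hand from the table above, and computing Rev/Cost as Revenue divided by Cost is an assumption about how that column was produced.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Item Num": [1, 2, 3, 4],
    "Cost": [45.76, 55.78, 34.68, 79.85],
    "Revenue": [345.67, 456.92, 0.0, 0.0],
})
# Assumption: the ratio column is simply Revenue / Cost.
df["Rev/Cost"] = df["Revenue"] / df["Cost"]

# Boolean mask: True for the rows whose ratio is exactly 0.
mask = df["Rev/Cost"] == 0

# Assign through .loc on the frame itself, so the write goes to df
# rather than to a temporary copy of a slice.
df.loc[mask, "Rev/Cost"] = -df.loc[mask, "Cost"]

print(df)
```

Because the assignment selects rows and column in a single .loc call, it should not trigger the SettingWithCopyWarning quoted in the question, provided df is not itself a slice of another frame.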

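Because booleans evaluate to 0/1, you can simply multiply Cost by the condition and subtract the result from Rev/Cost. Rows where the condition is False subtract 0 and stay unchanged; rows where it is True have Rev/Cost equal to 0, so they end up at -Cost. This can give a nice performance boost.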

df['Rev/Cost'] -= df['Cost'] * (df['Rev/Cost'] == 0) 
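
To make the arithmetic visible, the sketch below splits the one-liner into its intermediate steps on two rows taken from the question. The names is_zero and penalty are illustrative only and do not appear in the original answer.

```python
import pandas as pd

# Two rows from the question: one with a nonzero ratio, one with a zero ratio.
df = pd.DataFrame({"Cost": [45.76, 34.68], "Rev/Cost": [7.5539772727, 0.0]})

# Condition as a boolean Series.
is_zero = df["Rev/Cost"] == 0

# Multiplying floats by booleans treats False as 0 and True as 1,
# so penalty is Cost where the ratio is 0 and 0.0 everywhere else.
penalty = df["Cost"] * is_zero

# Subtracting leaves nonzero ratios alone and turns 0 into -Cost.
df["Rev/Cost"] -= penalty

print(df)
```

Note that this only works because the rows being changed hold exactly 0; to replace some other sentinel value, the mask or where forms are the safer choice.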

You can also use np.where:

import numpy as np
df['Rev/Cost'] = np.where(df['Rev/Cost'] == 0, -df['Cost'], df['Rev/Cost'])

Or Series.where, which keeps the values where the condition is True and replaces the rest:

df['Rev/Cost'] = df['Rev/Cost'].where(lambda x: x != 0, -df['Cost'])
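
The two where forms are easy to mix up, so here is a small sketch comparing them on the question's values. The standalone Series names cost and ratio are illustrative. Two points to keep in mind: np.where replaces where the condition is True, while Series.where replaces where it is False, and the replacement has to be the negated Cost to match the desired output.

```python
import numpy as np
import pandas as pd

cost = pd.Series([45.76, 55.78, 34.68, 79.85], name="Cost")
ratio = pd.Series([7.5539772727, 8.1914664754, 0.0, 0.0], name="Rev/Cost")

# np.where(condition, value_if_true, value_if_false) returns a NumPy array,
# so it is wrapped back into a Series here for comparison.
via_numpy = pd.Series(np.where(ratio == 0, -cost, ratio), name="Rev/Cost")

# Series.where keeps values where the condition is True and substitutes
# the second argument elsewhere, hence the inverted condition (!= 0).
via_where = ratio.where(ratio != 0, -cost)

print(via_numpy.equals(via_where))
```

Both forms build a new column rather than assigning into a slice, so either result can be written back with a plain df['Rev/Cost'] = ... assignment.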