2017-10-11 177 views

I have a DataFrame and want to fill one of its columns based on conditions:

import pandas as pd

df = pd.DataFrame({"A": [0, 1, 0, 1],
                   "B": [-1, 0, 0, 0],
                   "C": [0, 0, 0, 0]},
                  index=[.1, .2, .3, .4])

My first attempt at working through the problem logically was:

for index, row in iterrows(): 
    if df['A'] == 1: 
     df['C'] == 1 
    elif df['B'] == -1 
     df['C'] == -1 
    else: 
     df['C'] == 0 

What I want is:

pd.DataFrame({"A":[0,1,0,1], 
      "B":[-1,0,0,0], 
      "C":[-1,1,0,1]}, 
     index = [.1,.2,.3, .4]) 

After the first approach failed, I tried various methods suggested in other questions, but none of them fit my problem.

Answers

2

You can use nested calls to np.where:

df.C = np.where(df.A == 1, 1, np.where(df.B == -1, -1, 0)) 
df 
     A  B  C
0.1  0 -1 -1
0.2  1  0  1
0.3  0  0  0
0.4  1  0  1
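
For comparison, here is a minimal sketch of how the loop from the question could be made to work. It is not part of the original answer. The loop in the question compares whole columns (`df['A'] == 1`) instead of the value in the current row, and uses `==` where an assignment is intended. The helper list `values` is a name introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 0, 1],
                   "B": [-1, 0, 0, 0],
                   "C": [0, 0, 0, 0]},
                  index=[.1, .2, .3, .4])

values = []  # hypothetical helper: one result per row, in row order
for index, row in df.iterrows():
    # Test the scalar in this row, not the whole column.
    if row["A"] == 1:
        values.append(1)
    elif row["B"] == -1:
        values.append(-1)
    else:
        values.append(0)

# Assign the collected results to the column in one step.
df["C"] = values
```

A row-by-row loop like this is typically much slower than the vectorized np.where call above, so it is mainly useful for understanding the logic.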

Performance

df = pd.concat([df] * 100000) 

%timeit np.select([df.A == 1, df.B == -1], [1, -1]) 
100 loops, best of 3: 5.25 ms per loop 

%timeit np.where(df.A == 1, 1, np.where(df.B == -1, -1, 0)) 
100 loops, best of 3: 2.86 ms per loop 
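
The timings above rely on IPython's `%timeit`. Outside IPython, a similar comparison could be sketched with the standard `timeit` module, as below. The names `big`, `with_select`, `with_where`, and the run count `n_runs` are assumptions made for this example, and the numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 0, 1],
                   "B": [-1, 0, 0, 0],
                   "C": [0, 0, 0, 0]},
                  index=[.1, .2, .3, .4])
big = pd.concat([df] * 100000)  # 400,000 rows, as in the answer


def with_select():
    return np.select([big.A == 1, big.B == -1], [1, -1])


def with_where():
    return np.where(big.A == 1, 1, np.where(big.B == -1, -1, 0))


n_runs = 20  # assumed run count; raise it for steadier numbers
t_select = timeit.timeit(with_select, number=n_runs) / n_runs
t_where = timeit.timeit(with_where, number=n_runs) / n_runs

print(f"np.select: {t_select * 1e3:.2f} ms per call")
print(f"np.where:  {t_where * 1e3:.2f} ms per call")
```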
2

Use numpy.select:

# pd.np has been removed from recent pandas versions; import numpy directly
import numpy as np
df['C'] = np.select([df.A == 1, df.B == -1], [1, -1])

df 
#     A  B  C
#0.1  0 -1 -1
#0.2  1  0  1
#0.3  0  0  0
#0.4  1  0  1
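
The call above depends on two behaviors of `np.select`: rows that match no condition receive the `default` value (0 unless specified), and when several conditions match, the first one in the list wins. The sketch below spells both out. The names `conditions`, `choices`, `both`, and `first_match` are introduced here for illustration and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 0, 1],
                   "B": [-1, 0, 0, 0],
                   "C": [0, 0, 0, 0]},
                  index=[.1, .2, .3, .4])

# Conditions are checked in order, mirroring if / elif in the question.
conditions = [df["A"] == 1, df["B"] == -1]
choices = [1, -1]

# default=0 plays the role of the final else branch.
df["C"] = np.select(conditions, choices, default=0)

# Hypothetical row where both conditions hold: the first condition wins.
both = pd.DataFrame({"A": [1], "B": [-1]})
first_match = np.select([both["A"] == 1, both["B"] == -1], [1, -1], default=0)
```

Writing the conditions and choices as separate lists can make it easier to add further branches later without deepening the nesting, which is the main readability trade-off against nested np.where.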