Pandas conditional statement problem

Hi, I am processing my data.

I want to extract data using conditional statements.

Here is my code.

# -*- coding: utf-8 -*- 
import pandas as pd 
import numpy as np 
import os 

join_file = r'D:\handling data\complete data\조인\after_join.csv' 
pwd = os.getcwd() 
os.chdir(os.path.dirname(join_file)) 
join_data = pd.read_csv(os.path.basename(join_file), sep=',', encoding='utf-8') 

print(join_data.head())

join_data['cluster_z'] = 4 # both declining
join_data['cluster_z'][((join_data['cluster_x'] == 3 | join_data['cluster_x'] == 2 | join_data['cluster_x'] == 4)
        & (join_data['cluster_y'] == 3 | join_data['cluster_y'] == 1))] = 1 # both rising

join_data['cluster_z'][((join_data['cluster_x'] == 1 | join_data['cluster_x'] == 5)
        & (join_data['cluster_y'] == 3 | join_data['cluster_y'] == 1))] = 2 # overall declining, per-store rising

join_data['cluster_z'][((join_data['cluster_x'] == 3 | join_data['cluster_x'] == 2 | join_data['cluster_x'] == 4)
        & (join_data['cluster_y'] == 2 | join_data['cluster_y'] == 4))] = 3 # overall rising, per-store declining

print(join_data.head())

After the second print(join_data.head()) is executed, I get an error like the one in the picture.

How can I fix this error? Thanks in advance.

Source

2017-02-21 김지영

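It looks like you left out a lot of parentheses between the conditions. In Python, | and & bind more tightly than ==, so each comparison needs its own parentheses. It is also better to use loc:

Original: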

join_data['cluster_z'][((join_data['cluster_x'] == 3 |
    join_data['cluster_x'] == 2 | 
    join_data['cluster_x'] == 4) & 
    (join_data['cluster_y'] == 3 | 
    join_data['cluster_y'] == 1))] = 1

Change to:

join_data.loc[ 
((join_data['cluster_x'] == 3) | 
(join_data['cluster_x'] == 2) | 
(join_data['cluster_x'] == 4)) & 
((join_data['cluster_y'] == 3) | 
(join_data['cluster_y'] == 1)), 'cluster_z'] = 1
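
A minimal sketch of why the parentheses matter, using a small made-up Series `s` rather than the asker's data. Without parentheses the expression is parsed as a chained comparison, which needs a single True/False value from a whole Series and fails. This is probably, though not certainly, the error shown in the missing picture.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
s = pd.Series([3, 2, 5, 1])

# Without parentheses this is parsed as: s == (3 | s) == 2,
# i.e. (s == (3 | s)) and ((3 | s) == 2). The `and` asks a whole
# Series for one truth value, so pandas raises ValueError.
try:
    s == 3 | s == 2
    raised = False
except ValueError:
    raised = True

# With parentheses each comparison is evaluated first, and the
# resulting boolean Series are combined element-wise.
mask = (s == 3) | (s == 2)

print(raised)
print(mask.tolist())
```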

Or better, use isin:

join_data.loc[ 
(join_data['cluster_x'].isin([3,2,4])) & 
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 1
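
As a quick check, the sketch below compares isin with the chain of parenthesized == comparisons on a made-up Series `s` (an assumption for illustration, not the asker's data). Both should give the same boolean mask.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
s = pd.Series([3, 2, 5, 1, 4])

# Three comparisons joined with |, each in its own parentheses.
chained = (s == 3) | (s == 2) | (s == 4)

# The same membership test written with isin.
membership = s.isin([3, 2, 4])

print(chained.equals(membership))
```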

All together:

join_data = pd.DataFrame({'cluster_x':[3,2,5,3], 
         'cluster_y':[3,0,1,2]}) 

print (join_data) 
    cluster_x cluster_y 
0   3   3 
1   2   0 
2   5   1 
3   3   2 

join_data['cluster_z'] = 4 

join_data.loc[ 
(join_data['cluster_x'].isin([3,2,4])) & 
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 1 

join_data.loc[ 
(join_data['cluster_x'].isin([1,5])) & 
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 2 

join_data.loc[ 
(join_data['cluster_x'].isin([3,2,4])) & 
(join_data['cluster_y'].isin([2,4])), 'cluster_z'] = 3 

print (join_data) 
    cluster_x cluster_y cluster_z 
0   3   3   1 
1   2   0   4 
2   5   1   2 
3   3   2   3

Or, more readable:

mask1 = join_data['cluster_x'].isin([3,2,4]) 
mask2 = join_data['cluster_y'].isin([3,1]) 
mask3 = join_data['cluster_x'].isin([1,5]) 
mask4 = join_data['cluster_y'].isin([2,4]) 

join_data['cluster_z'] = 4 
join_data.loc[mask1 & mask2 , 'cluster_z'] = 1 
join_data.loc[mask3 & mask2 , 'cluster_z'] = 2 
join_data.loc[mask1 & mask4 , 'cluster_z'] = 3 

print (join_data) 
    cluster_x cluster_y cluster_z 
0   3   3   1 
1   2   0   4 
2   5   1   2 
3   3   2   3

Solution with multiple numpy.where:

mask1 = join_data['cluster_x'].isin([3,2,4]) 
mask2 = join_data['cluster_y'].isin([3,1]) 
mask3 = join_data['cluster_x'].isin([1,5]) 
mask4 = join_data['cluster_y'].isin([2,4]) 

join_data['cluster_z'] = np.where(mask1 & mask2, 1, 
         np.where(mask3 & mask2, 2, 
         np.where(mask1 & mask4, 3, 4)))   

print (join_data) 
    cluster_x cluster_y cluster_z 
0   3   3   1 
1   2   0   4 
2   5   1   2 
3   3   2   3
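
The nested numpy.where calls above can also be written with numpy.select. This was not part of the original answer; it is an alternative sketch on the same sample frame and masks. numpy.select takes the first matching condition for each row and falls back to `default` when none match.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the answer above.
join_data = pd.DataFrame({'cluster_x': [3, 2, 5, 3],
                          'cluster_y': [3, 0, 1, 2]})

mask1 = join_data['cluster_x'].isin([3, 2, 4])
mask2 = join_data['cluster_y'].isin([3, 1])
mask3 = join_data['cluster_x'].isin([1, 5])
mask4 = join_data['cluster_y'].isin([2, 4])

# Conditions are checked in order; choices[i] is used where conditions[i] holds.
conditions = [mask1 & mask2, mask3 & mask2, mask1 & mask4]
choices = [1, 2, 3]

# Rows matching none of the conditions get the default value 4.
join_data['cluster_z'] = np.select(conditions, choices, default=4)

print(join_data)
```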

Source

2017-02-21 08:42:15 jezrael

Thank you~~ You are a great guy! There are so many ways to handle it, haha. How do you know so many ways? Thanks~~ Have a nice day~~
