2017-09-22 73 views
-1

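I want to convert my SQL code into a Python (pandas) filter function, but it is giving me a hard time. Any ideas on how to filter my data based on the SQL condition without looping over the records? Note that the case Desc = 'Bla1' differs from the others: it checks Value instead of Active. How can I filter a pandas DataFrame based on a SQL condition in Python?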

if joe_doe: keep the records with Hello = 1; else: keep the records with Hello = 0

SQL

Hello = 
     CASE 
      WHEN 
      (
       Desc = 'Bla1' 
       AND Value = 'True' 
      ) 
      OR 
      (
       Desc IN('Bla2', 'Bla3') 
       AND Active = 'True'     
      ) 
      AND Enabled = 'True' 
      THEN 1 
      ELSE 0 
     END 
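
A note on reading this condition: in SQL, AND binds more tightly than OR, so as written the Enabled = 'True' test attaches only to the second parenthesized group. The sketch below contrasts that literal reading with the reading where Enabled applies to both groups. It assumes the flags are stored as 1/0 integers (as in the sample data further down) rather than as 'True' strings; the four rows and the names `literal`, `grouped`, `Hello_literal` and `Hello_grouped` are made up for illustration.

```python
import pandas as pd

# Hypothetical rows, chosen to expose the precedence question.
df = pd.DataFrame({
    "Desc":    ["Bla1", "Bla1", "Bla2", "Bla3"],
    "Active":  [0, 0, 1, 1],
    "Enabled": [0, 1, 0, 1],
    "Value":   [1, 1, 1, 0],
})

# Literal reading: (A) OR ((B) AND Enabled), because AND is evaluated before OR.
literal = ((df["Desc"] == "Bla1") & (df["Value"] == 1)) | (
    df["Desc"].isin(["Bla2", "Bla3"]) & (df["Active"] == 1) & (df["Enabled"] == 1)
)

# Alternative reading: ((A) OR (B)) AND Enabled, which would need extra parentheses in SQL.
grouped = (
    ((df["Desc"] == "Bla1") & (df["Value"] == 1))
    | (df["Desc"].isin(["Bla2", "Bla3"]) & (df["Active"] == 1))
) & (df["Enabled"] == 1)

# CASE ... THEN 1 ELSE 0 becomes an integer cast of the boolean mask.
df["Hello_literal"] = literal.astype(int)
df["Hello_grouped"] = grouped.astype(int)
print(df)
```

The two readings differ only for rows like the first one (Desc = 'Bla1' with Value set but Enabled not set). None of the sample rows further down is of that kind, so the sample data alone cannot tell the two readings apart.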

Python (with pandas)

def get_it(john_doe, df): 

    sentences = { 
      'Bla1': 'Value', 
      'Bla2': 'Active', 
      'Bla3': 'Active' 
     } 

    if john_doe: 
     df = df[HOW TO KEEP ALL RECORDS THAT HAVE Hello = 1?] 
    else: 
     df = df[HOW TO KEEP ALL RECORDS THAT HAVE Hello = 0?] 
    return df 

Input DataFrame

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
1 | Bla2 | 1  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
2 | Bla3 | 1  | 1  | 1  | [A LOT OF OTHER COLUMNS] 
3 | Bla3 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
4 | Bla4 | 1  | 1  | 1  | [A LOT OF OTHER COLUMNS] 
5 | Bla6 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
6 | Bla7 | 0  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
7 | Bla1 | 0  | 1  | 1  | [A LOT OF OTHER COLUMNS] 
8 | Bla1 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 

Desired output DataFrame if joe_doe

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
2 | Bla3 | 1  | 1  | 1  | [A LOT OF OTHER COLUMNS] 
3 | Bla3 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
7 | Bla1 | 0  | 1  | 1  | [A LOT OF OTHER COLUMNS] 

Desired output DataFrame otherwise (else)

id | Desc | Active | Enabled | Value | [A LOT OF OTHER COLUMNS] 
1 | Bla2 | 1  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
4 | Bla4 | 1  | 1  | 1  | [A LOT OF OTHER COLUMNS] 
5 | Bla6 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
6 | Bla7 | 0  | 0  | 1  | [A LOT OF OTHER COLUMNS] 
8 | Bla1 | 1  | 1  | 0  | [A LOT OF OTHER COLUMNS] 
+0

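This question is confusing. Could you provide a sample of your df? Are you trying to mimic the CASE statement, or do you just need to know how to write an if statement? –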

+0

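I want to filter my 'df' in Python/pandas based on the SQL condition. In the 'if' branch, I want to keep all records that satisfy the SQL condition ('THEN 1'). In the 'else' branch, I want to keep all records that do not satisfy it ('ELSE 0'). – orangetacos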

+0

The 'sentences' dict contains all the SQL cases: 'Bla1' checks the 'Value' field, and the other two check the 'Active' field. – orangetacos
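
The mapping described in the comment above can be turned into a boolean mask without looping over records. This is a rough sketch only: it assumes 1/0 integer flags and that Enabled must be set in every case (one possible reading of the SQL). `build_mask` is a hypothetical helper name, and the four rows are a subset of the question's sample data.

```python
import pandas as pd

# Desc value -> column that must be set for that Desc (taken from the question).
sentences = {"Bla1": "Value", "Bla2": "Active", "Bla3": "Active"}

def build_mask(df, sentences):
    """Return a boolean Series that is True where Hello would be 1."""
    mask = pd.Series(False, index=df.index)
    # The loop runs once per dict entry, not once per record.
    for desc, column in sentences.items():
        mask = mask | ((df["Desc"] == desc) & (df[column] == 1))
    # Assumption: Enabled is required for every case.
    return mask & (df["Enabled"] == 1)

df = pd.DataFrame({
    "id":      [1, 2, 7, 8],
    "Desc":    ["Bla2", "Bla3", "Bla1", "Bla1"],
    "Active":  [1, 1, 0, 1],
    "Enabled": [0, 1, 1, 1],
    "Value":   [1, 1, 1, 0],
})

mask = build_mask(df, sentences)
print(df[mask])    # rows where Hello = 1
print(df[~mask])   # rows where Hello = 0
```

Keeping the dict makes sense mainly if the list of cases is long or changes often; for three fixed cases, writing the conditions out by hand is just as readable.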

Answers

1

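Something like this should work. pandas can take any number of boolean conditions to filter a DataFrame: & (and) and | (or) combine conditions, while ~ negates one. I don't see the need for the dict you built; I think it is unnecessary in this case.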

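def get_it(joe_doe, df):
    # One boolean mask per SQL case; each comparison needs its own parentheses.
    logic1 = (df.Desc == 'Bla1') & (df.Value == 1) & (df.Enabled == 1)
    logic2 = (df.Desc == 'Bla2') & (df.Active == 1) & (df.Enabled == 1)
    logic3 = (df.Desc == 'Bla3') & (df.Active == 1) & (df.Enabled == 1)

    if joe_doe:
        # Keep rows matching any case (Hello = 1).
        df = df[logic1 | logic2 | logic3]
    else:
        # Keep rows matching none of the cases (Hello = 0).
        df = df[~logic1 & ~logic2 & ~logic3]
    return df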
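
To check this approach against the sample data from the question, the following sketch rebuilds the eight rows (leaving out the other columns), collapses the three masks into one with `isin`, and runs both branches. It assumes 1/0 integer flags and the Desc values 'Bla1', 'Bla2' and 'Bla3' from the question; `filter_hello` is a hypothetical name, not something from the original post.

```python
import pandas as pd

# The eight sample rows from the question, without the extra columns.
df = pd.DataFrame({
    "id":      [1, 2, 3, 4, 5, 6, 7, 8],
    "Desc":    ["Bla2", "Bla3", "Bla3", "Bla4", "Bla6", "Bla7", "Bla1", "Bla1"],
    "Active":  [1, 1, 1, 1, 1, 0, 0, 1],
    "Enabled": [0, 1, 1, 1, 1, 0, 1, 1],
    "Value":   [1, 1, 0, 1, 0, 1, 1, 0],
})

def filter_hello(joe_doe, df):
    """Keep rows with Hello = 1 when joe_doe is truthy, else rows with Hello = 0."""
    hello = (
        ((df["Desc"] == "Bla1") & (df["Value"] == 1))
        | (df["Desc"].isin(["Bla2", "Bla3"]) & (df["Active"] == 1))
    ) & (df["Enabled"] == 1)
    return df[hello] if joe_doe else df[~hello]

print(filter_hello(True, df)["id"].tolist())   # [2, 3, 7]
print(filter_hello(False, df)["id"].tolist())  # [1, 4, 5, 6, 8]
```

Both id lists match the desired outputs in the question. Negating the single combined mask is equivalent to `~logic1 & ~logic2 & ~logic3` by De Morgan's law, so the else branch does not need to be spelled out separately.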