2017-05-30 47 views

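Row by row (pandas): if column A = 'something' and column B > 25, then column C = 'category'

I would like a script that goes through a file row by row using Python (pandas). If column A = 'something' and column B > 25, it should write column C = 'category'.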

I have a Month column and a Day column. So, for example:

When Month = August and Day >= 25, then Week = August 25

I tried a couple of things, and neither works...

First I tried:

import os    ### OS library is imported. 
import pandas as pd  ### Pandas library is imported as 'pd'. 

counter = 1    ### Counter starts at the first iteration. 

while os.path.exists("CSV-Iteration-"'{0}'"/".format(counter)):  ### Runs the loop until all iteration's folders have been processed. 

    a = pd.read_csv("output-"'{0}'".csv".format(counter))   ### Sets 'a' dataframe as holding data from a CSV file. 
    a['Week'] = "" 

    a[(a['Month'] is 'June') & (a['Day'] < 25)]['Week'] = 'June 18' 
    a[(a['Month'] is 'June') & (a['Day'] >= 25)]['Week'] = 'June 25' 
    a[(a['Month'] is 'July') & (a['Day'] < 2)]['Week'] = 'June 25' 
    a[(a['Month'] is 'July') & (a['Day'] >= 2) & (a['Day'] < 9)]['Week'] = 'July 2' 
    a[(a['Month'] is 'July') & (a['Day'] >= 9) & (a['Day'] < 16)]['Week'] = 'July 9' 
    a[(a['Month'] is 'July') & (a['Day'] >= 16) & (a['Day'] < 23)]['Week'] = 'July 16' 
    a[(a['Month'] is 'July') & (a['Day'] >= 23) & (a['Day'] < 30)]['Week'] = 'July 23' 
    a[(a['Month'] is 'July') & (a['Day'] >= 31) & (a['Day'] < 16)]['Week'] = 'July 30' 
    a[(a['Month'] is 'August') & (a['Day'] < 6)]['Week'] = 'July 30' 

    a[(a['Month'] is 'August') & (a['Day'] >= 6) & (a['Day'] < 13)]['Week'] = 'August 6' 
    a[(a['Month'] is 'August') & (a['Day'] >= 13) & (a['Day'] < 20)]['Week'] = 'August 13' 
    a[(a['Month'] is 'August') & (a['Day'] >= 20) & (a['Day'] < 27)]['Week'] = 'August 20' 
    a[(a['Month'] is 'August') & (a['Day'] >= 27)]['Week'] = 'August 27' 
    a[(a['Month'] is 'September') & (a['Day'] < 3)]['Week'] = 'August 27' 

    a[(a['Month'] is 'September') & (a['Day'] >= 3) & (a['Day'] < 10)]['Week'] = 'September 3' 
    a[(a['Month'] is 'September') & (a['Day'] >= 10) & (a['Day'] < 17)]['Week'] = 'September 10' 
    a[(a['Month'] is 'September') & (a['Day'] >= 17) & (a['Day'] < 24)]['Week'] = 'September 17' 
    a[(a['Month'] is 'September') & (a['Day'] >= 24)] = 'September 24' 

    a[(a['Month'] is 'October') & (a['Day'] >= 1) & (a['Day'] < 8)]['Week'] = 'October 1' 
    a[(a['Month'] is 'October') & (a['Day'] >= 8) & (a['Day'] < 15)]['Week'] = 'October 8' 
    a[(a['Month'] is 'October') & (a['Day'] >= 15) & (a['Day'] < 22)]['Week'] = 'October 15' 
    a[(a['Month'] is 'October') & (a['Day'] >= 22) & (a['Day'] < 29)]['Week'] = 'October 22' 
    a[(a['Month'] is 'October') & (a['Day'] >= 29)]['Week'] = 'October 29' 
    a[(a['Month'] is 'November') & (a['Day'] < 5)]['Week'] = 'October 29' 

    a[(a['Month'] is 'November') & (a['Day'] >= 5) & (a['Day'] < 12)]['Week'] = 'November 5' 
    a[(a['Month'] is 'November') & (a['Day'] >= 12) & (a['Day'] < 19)]['Week'] = 'November 12' 
    a[(a['Month'] is 'November') & (a['Day'] >= 19) & (a['Day'] < 26)]['Week'] = 'November 19' 
    a[(a['Month'] is 'November') & (a['Day'] >= 26)]['Week'] = 'November 26' 
    a[(a['Month'] is 'December') & (a['Day'] < 3)]['Week'] = 'November 26' 

    a[(a['Month'] is 'December') & (a['Day'] >= 3) & (a['Day'] < 10)]['Week'] = 'December 3' 
    a[(a['Month'] is 'December') & (a['Day'] >= 10) & (a['Day'] < 17)]['Week'] = 'December 10' 
    a[(a['Month'] is 'December') & (a['Day'] >= 17) & (a['Day'] < 24)]['Week'] = 'December 17' 
    a[(a['Month'] is 'December') & (a['Day'] >= 24) & (a['Day'] < 31)]['Week'] = 'December 24' 
    a[(a['Month'] is 'December') & (a['Day'] >= 31)]['Week'] = 'December 31' 
    a[(a['Month'] is 'January') & (a['Day'] < 7)]['Week'] = 'December 31' 

    a[(a['Month'] is 'January') & (a['Day'] >= 7) & (a['Day'] < 14)]['Week'] = 'January 7' 
    a[(a['Month'] is 'January') & (a['Day'] >= 14) & (a['Day'] < 21)]['Week'] = 'January 14' 
    a[(a['Month'] is 'January') & (a['Day'] >= 21) & (a['Day'] < 28)]['Week'] = 'January 21' 
    a[(a['Month'] is 'January') & (a['Day'] >= 28)]['Week'] = 'January 28' 

    a.to_csv("TESToutput-"'{0}'".csv".format(counter), index=False)   ### 'a' dataframe becomes 'TESToutput-#.csv' and does not print fields for indexing (index=False). 

    counter += 1  ### Adds 1 to the counter. 

print 'Date Corrections - All Done!' 

Then I tried:

import os    ### OS library is imported. 
import pandas as pd  ### Pandas library is imported as 'pd'. 

counter = 1    ### Counter starts at the first iteration. 

while os.path.exists("CSV-Iteration-"'{0}'"/".format(counter)):  ### Runs the loop until all iteration's folders have been processed. 

    a = pd.read_csv("output-"'{0}'".csv".format(counter))   ### Sets 'a' dataframe as holding data from a CSV file. 
    a['Week'] = "" 

    def this_week (row): 
     if row[(a['Month'] is 'June') + (a['Day'] < 25)]: 
      return 'June 18' 
     if row[(a['Month'] is 'June') + (a['Day'] >= 25)]: 
      return 'June 25' 
     if row[(a['Month'] is 'July') + (a['Day'] < 2)]: 
      return 'June 25' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 2) + (a['Day'] < 9)]: 
      return 'July 2' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 9) + (a['Day'] < 16)]: 
      return 'July 9' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 16) + (a['Day'] < 23)]: 
      return 'July 16' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 23) + (a['Day'] < 30)]: 
      return 'July 23' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 31) + (a['Day'] < 16)]: 
      return 'July 30' 
     if row[(a['Month'] is 'August') + (a['Day'] < 6)]: 
      return 'July 30' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 6) + (a['Day'] < 13)]: 
      return 'August 6' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 13) + (a['Day'] < 20)]: 
      return 'August 13' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 20) + (a['Day'] < 27)]: 
      return 'August 20' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 27)]: 
      return 'August 27' 
     if row[(a['Month'] is 'September') + (a['Day'] < 3)]: 
      return 'August 27' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 3) + (a['Day'] < 10)]: 
      return 'September 3' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 10) + (a['Day'] < 17)]: 
      return 'September 10' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 17) + (a['Day'] < 24)]: 
      return 'September 17' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 24)]: 
      return 'September 24' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 1) + (a['Day'] < 8)]: 
      return 'October 1' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 8) + (a['Day'] < 15)]: 
      return 'October 8' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 15) + (a['Day'] < 22)]: 
      return 'October 15' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 22) + (a['Day'] < 29)]: 
      return 'October 22' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 29)]: 
      return 'October 29' 
     if row[(a['Month'] is 'November') + (a['Day'] < 5)]: 
      return 'October 29' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 5) + (a['Day'] < 12)]: 
      return 'November 5' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 12) + (a['Day'] < 19)]: 
      return 'November 12' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 19) + (a['Day'] < 26)]: 
      return 'November 19' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 26)]: 
      return 'November 26' 
     if row[(a['Month'] is 'December') + (a['Day'] < 3)]: 
      return 'November 26' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 3) + (a['Day'] < 10)]: 
      return 'December 3' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 10) + (a['Day'] < 17)]: 
      return 'December 10' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 17) + (a['Day'] < 24)]: 
      return 'December 17' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 24) + (a['Day'] < 31)]: 
      return 'December 24' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 31)]: 
      return 'December 31' 
     if row[(a['Month'] is 'January') + (a['Day'] < 7)]: 
      return 'December 31' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 7) + (a['Day'] < 14)]: 
      return 'January 7' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 14) + (a['Day'] < 21)]: 
      return 'January 14' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 21) + (a['Day'] < 28)]: 
      return 'January 21' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 28)]: 
      return 'January 28' 

    a['Week'] = a.apply (lambda row: this_week (row), axis=1) 

    a.to_csv("TESToutput-"'{0}'".csv".format(counter), index=False)   ### 'a' dataframe becomes 'TESToutput-#.csv' and does not print fields for indexing (index=False). 

    counter += 1  ### Adds 1 to the counter. 

print 'Date Corrections - All Done!' 

The second one gave me this error: "IndexingError: ('Unalignable boolean Series key provided', u'occurred at index 0')"
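A likely reading of this error: `a['Month'] is 'June'` is an identity test that evaluates to a plain `False` rather than a per-row comparison, and `row[...]` is then indexed with a boolean Series built from the whole dataframe `a`, whose index does not line up with the row's column labels. Inside `apply(..., axis=1)` each `row` is a single record, so scalar comparisons with `==` and `and` are enough. The sketch below covers only the first few weeks, and the sample data is made up for illustration.

```python
import pandas as pd


def this_week(row):
    # `row` is one record (a Series keyed by column name), so compare
    # its scalar values with == and combine the tests with and/or.
    month, day = row["Month"], row["Day"]
    if month == "June" and day < 25:
        return "June 18"
    if (month == "June" and day >= 25) or (month == "July" and day < 2):
        return "June 25"
    if month == "July" and 2 <= day < 9:
        return "July 2"
    return ""  # the remaining weeks would follow the same pattern


# Hypothetical sample standing in for one of the output-#.csv files.
a = pd.DataFrame({"Month": ["June", "June", "July", "July"],
                  "Day": [18, 30, 1, 8]})
a["Week"] = a.apply(this_week, axis=1)
print(a)
```

Passing the function directly to `apply` also avoids the extra `lambda` wrapper.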

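I'm very new to Python, so I pieced these together based on what I have read on forums. Please let me know if there is a simpler way to do this, or if there are corrections or additions that would make either of these two scripts work.

Thanks!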

====================================================

BREAK - Updated information below

This is what a data file looks like (minus the extra columns).

Month      Day  C.Sym  F.Sym  D.Sym
September    3      1             1
September   27      1
October     14             1
October     15             1
October     17             1
October     21             1
October     29             1
November    30      1
December    16             1      1
December    17             1
December    27             1
January      6             1
January      8      1
January     20             1

I would like to add a "Week" column that checks the Month and Day columns to assign the week, i.e. as shown below:

Month      Day  C.Sym  F.Sym  D.Sym  Week
September    3      1             1  Sept 3
September   27      1                Sept 24
October     14             1         Oct 8
October     15             1         Oct 15
October     17             1         Oct 15
October     21             1         Oct 15
October     29             1         Oct 29
November    30      1                Oct 29
December    16             1      1  Dec 10
December    17             1         Dec 10
December    27             1         Dec 24
January      6             1         Dec 31
January      8      1                Jan 7
January     20             1         Jan 14
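Since every week label in the scripts above is a Sunday, one alternative to listing each range by hand is to build a real date and step back to the most recent Sunday. This is only a sketch under two assumptions that are not stated in the question: that June through December belong to 2017 and January to 2018 (the years in which those week starts fall on Sundays), and that the `Week` label is the full month name plus the day. The helper name `add_week_column` and its `first_year` parameter are made up for illustration.

```python
import calendar

import pandas as pd

# Map "January" -> 1, ..., "December" -> 12 (default English month names).
MONTH_NUMBER = {name: num for num, name in enumerate(calendar.month_name) if name}


def add_week_column(df, first_year=2017):
    """Return a copy of df with a 'Week' column holding the preceding Sunday."""
    month = df["Month"].map(MONTH_NUMBER)
    # Assumption: months June..December are in first_year, earlier months
    # (January here) are in the following year.
    year = month.map(lambda m: first_year if m >= 6 else first_year + 1)
    dates = pd.to_datetime(
        pd.DataFrame({"year": year, "month": month, "day": df["Day"]})
    )
    # weekday: Monday=0 ... Sunday=6, so this is the number of days since Sunday.
    days_since_sunday = (dates.dt.weekday + 1) % 7
    week_start = dates - pd.to_timedelta(days_since_sunday, unit="D")
    out = df.copy()
    out["Week"] = week_start.dt.month_name() + " " + week_start.dt.day.astype(str)
    return out


sample = pd.DataFrame({"Month": ["September", "October", "January"],
                       "Day": [3, 14, 6]})
result = add_week_column(sample)
print(result)
```

This follows the week boundaries used in the scripts, so for November 30 and December 17 it would give November 26 and December 17 rather than the values shown in the table above.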

An example of the elif I now want to incorporate:

elif a[(a['Month'] is 'January') & (a['Day'] >= 14) & (a['Day'] < 21)]: 
     ['Week'] = 'January 14' 

I hope this is more specific and helpful...

Answers

>>> import pandas as pd
>>> df = pd.DataFrame({'column_A': ['something', 'day', 'something'], 'column_B': [30, 40, 10]})
>>> df
    column_A  column_B
0  something        30
1        day        40
2  something        10

>>> # Boolean column: True where both conditions hold
>>> df = df.assign(column_C=((df.column_A == 'something') & (df.column_B > 25)))
>>> # Swap True for the label; rows that do not match stay False
>>> df['column_C'] = df['column_C'].replace(True, 'something')
>>> df
    column_A  column_B   column_C
0  something        30  something
1        day        40      False
2  something        10      False
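The same rule can also be written as a boolean mask passed to `.loc`, which writes the label directly instead of going through True/False first. This is a minimal sketch; the column names `A`, `B`, `C` and the sample values are placeholders taken from the wording of the question title.

```python
import pandas as pd

df = pd.DataFrame({"A": ["something", "other", "something"],
                   "B": [30, 40, 10]})

# Start with an empty label, then fill only the rows that satisfy both tests.
df["C"] = ""
mask = (df["A"] == "something") & (df["B"] > 25)
df.loc[mask, "C"] = "category"
print(df)
```

Using `==` with `&` builds an element-wise mask, and `df.loc[mask, "C"] = ...` assigns into the original dataframe, whereas the `a[mask]['Week'] = ...` form in the question assigns into a temporary copy.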

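The problem I'm running into is that my answer needs more than True/False. I need to be able to put 'June 18', 'June 25', 'July 2', etc. depending on the Column_Month and Column_Day values. Right now I think I need to run multiple if/else queries nested inside one another, starting with the earliest date and ending with the latest, to achieve my goal... or am I missing 'something' in this answer? –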
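One way to get several labels without nesting if/else blocks is `numpy.select`, which takes a list of conditions and a matching list of values and uses the first condition that holds for each row. The sketch below is only an illustration: it uses the `Column_Month` and `Column_Day` names from the comment, covers just the first few weeks, and the sample rows are invented.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Column_Month": ["June", "June", "July", "July"],
    "Column_Day": [20, 26, 1, 5],
})

month, day = df["Column_Month"], df["Column_Day"]

# Conditions are checked in order; each one pairs with the label at the
# same position in `labels`.
conditions = [
    (month == "June") & (day < 25),
    (month == "June") & (day >= 25),
    (month == "July") & (day < 2),
    (month == "July") & (day >= 2) & (day < 9),
]
labels = ["June 18", "June 25", "June 25", "July 2"]

# Rows that match none of the conditions get the default (empty string).
df["Week"] = np.select(conditions, labels, default="")
print(df)
```

Extending it to the other months means appending more condition and label pairs in the same order.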

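@DavidMills You have to be specific. If you post what your data currently looks like and what output you want, you are more likely to get an answer that solves your problem. – spies006

I added a note to the information above showing what the data looks like and what output I want. Thanks! –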