2017-05-29 71 views
1

我有熊貓沒有太多的經驗,另一列的電流值,和我有以下數據框:如何計算列依賴於一個以前的價值觀和

month  A    B 
2/28/2017 0.7377573034 0 
3/31/2017 0.7594787565 3.7973937824 
4/30/2017 0.7508308808 3.7541544041 
5/31/2017 0.7038814004 7.0388140044 
6/30/2017 0.6920212254 11.0723396061 
7/31/2017 0.6801610503 11.5627378556 
8/31/2017 0.6683008753 10.6928140044 
9/30/2017 0.7075915026 11.3214640415 
10/31/2017 0.6989436269 7.6883798964 
11/30/2017 0.6259514607 4.3816602247 
12/31/2017 0.6119757303 3.671854382 
1/31/2018 0.633   3.798 
2/28/2018 0.598   4.784 
3/31/2018 0.673   5.384 
4/30/2018 0.673   1.346 
5/31/2018 0.609   0 
6/30/2018 0.609   0 
7/31/2018 0.609   0 
8/31/2018 0.609   0 
9/30/2018 0.673   0 
10/31/2018 0.673   0 
11/30/2018 0.598   0 
12/31/2018 0.598   0 

我需要計算列C這基本上是列A次列B,但列B的值是相應月份的前一年的值。另外,對於前一年沒有相應月份的值,該值應該爲零。更具體地講,這是我所期望C是:

C 
0 # these values are zero because the corresponding month in the previous year is not in column A 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0    # 0.598 * 0 
2.5556460155552 # 0.673 * 3.7973937824 
2.5265459139593 # 0.673 * 3.7541544041 
4.2866377286796 # 0.609 * 7.0388140044 
6.7430548201149 # 0.609 * 11.0723396061 
7.0417073540604 # 0.609 * 11.5627378556 
6.5119237286796 # 0.609 * 10.6928140044 
7.6193452999295 # 0.673 * 11.3214640415 
5.1742796702772 # 0.673 * 7.6883798964 
2.6202328143706 # 0.598 * 4.3816602247 
2.195768920436 # 0.598 * 3.671854382 

我怎樣才能做到這一點?我相信可能有辦法做到這一點,而不是使用for循環。提前致謝。

回答

1
In [73]: (df.drop('B',1) 
    ...: .merge(df.drop('A',1) 
    ...:   .assign(month=df.month + pd.offsets.MonthEnd(12)), 
    ...:   on='month', how='left') 
    ...: .eval("C = A * B", inplace=False) 
    ...: .fillna(0) 
    ...:) 
    ...: 
Out[73]: 
     month   A   B   C 
0 2017-02-28 0.737757 0.000000 0.000000 
1 2017-03-31 0.759479 0.000000 0.000000 
2 2017-04-30 0.750831 0.000000 0.000000 
3 2017-05-31 0.703881 0.000000 0.000000 
4 2017-06-30 0.692021 0.000000 0.000000 
5 2017-07-31 0.680161 0.000000 0.000000 
6 2017-08-31 0.668301 0.000000 0.000000 
7 2017-09-30 0.707592 0.000000 0.000000 
8 2017-10-31 0.698944 0.000000 0.000000 
9 2017-11-30 0.625951 0.000000 0.000000 
10 2017-12-31 0.611976 0.000000 0.000000 
11 2018-01-31 0.633000 0.000000 0.000000 
12 2018-02-28 0.598000 0.000000 0.000000 
13 2018-03-31 0.673000 3.797394 2.555646 
14 2018-04-30 0.673000 3.754154 2.526546 
15 2018-05-31 0.609000 7.038814 4.286638 
16 2018-06-30 0.609000 11.072340 6.743055 
17 2018-07-31 0.609000 11.562738 7.041707 
18 2018-08-31 0.609000 10.692814 6.511924 
19 2018-09-30 0.673000 11.321464 7.619345 
20 2018-10-31 0.673000 7.688380 5.174280 
21 2018-11-30 0.598000 4.381660 2.620233 
22 2018-12-31 0.598000 3.671854 2.195769 

說明:

我們可以生成一個輔助DF這樣的(我們增加了12個月的month柱和下降A列):

In [77]: df.drop('A',1).assign(month=df.month + pd.offsets.MonthEnd(12)) 
Out[77]: 
     month   B 
0 2018-02-28 0.000000 
1 2018-03-31 3.797394 
2 2018-04-30 3.754154 
3 2018-05-31 7.038814 
4 2018-06-30 11.072340 
5 2018-07-31 11.562738 
6 2018-08-31 10.692814 
7 2018-09-30 11.321464 
8 2018-10-31 7.688380 
9 2018-11-30 4.381660 
10 2018-12-31 3.671854 
11 2019-01-31 3.798000 
12 2019-02-28 4.784000 
13 2019-03-31 5.384000 
14 2019-04-30 1.346000 
15 2019-05-31 0.000000 
16 2019-06-30 0.000000 
17 2019-07-31 0.000000 
18 2019-08-31 0.000000 
19 2019-09-30 0.000000 
20 2019-10-31 0.000000 
21 2019-11-30 0.000000 
22 2019-12-31 0.000000 

現在我們可以用它合併原DF(我們不需要原始DF中的B列):

In [79]: (df.drop('B',1) 
    ...: .merge(df.drop('A',1) 
    ...:    .assign(month=df.month + pd.offsets.MonthEnd(12)), 
    ...:   on='month', how='left')) 
Out[79]: 
     month   A   B 
0 2017-02-28 0.737757  NaN 
1 2017-03-31 0.759479  NaN 
2 2017-04-30 0.750831  NaN 
3 2017-05-31 0.703881  NaN 
4 2017-06-30 0.692021  NaN 
5 2017-07-31 0.680161  NaN 
6 2017-08-31 0.668301  NaN 
7 2017-09-30 0.707592  NaN 
8 2017-10-31 0.698944  NaN 
9 2017-11-30 0.625951  NaN 
10 2017-12-31 0.611976  NaN 
11 2018-01-31 0.633000  NaN 
12 2018-02-28 0.598000 0.000000 
13 2018-03-31 0.673000 3.797394 
14 2018-04-30 0.673000 3.754154 
15 2018-05-31 0.609000 7.038814 
16 2018-06-30 0.609000 11.072340 
17 2018-07-31 0.609000 11.562738 
18 2018-08-31 0.609000 10.692814 
19 2018-09-30 0.673000 11.321464 
20 2018-10-31 0.673000 7.688380 
21 2018-11-30 0.598000 4.381660 
22 2018-12-31 0.598000 3.671854 

然後使用.eval("C = A * B", inplace=False)我們不能生成一個新的列「飛」

+0

@lmiguelvargasf,我已經更新了我的答案 - 請檢查 – MaxU

+0

這是一個很好的答案。 – lmiguelvargasf

+0

@lmiguelvargasf,我已經添加了一個解釋。我希望現在更清楚...... – MaxU