2017-10-11

Consolidate and transform a pandas DataFrame based on a condition

I am looking to consolidate the DataFrame shown below.

 Expiry   F  K Type  sigma 
0 2017-10-27 125.109375 123.5 P 0.045410 
1 2017-10-27 125.109375 127.5 P 0.047965 
2 2017-10-27 125.109375 124.5 P 0.041822 
3 2017-10-27 125.109375 125.5 P 0.041526 
4 2017-10-27 125.109375 120.5 P 0.045410 
5 2017-10-27 125.109375 121.5 P 0.045410 
6 2017-10-27 125.109375 121.0 P 0.045410 
7 2017-10-27 125.109375 122.0 P 0.045410 
8 2017-10-27 125.109375 123.0 P 0.045410 
9 2017-10-27 125.109375 124.0 P 0.043341 
10 2017-10-27 125.109375 125.0 P 0.041143 
11 2017-10-27 125.109375 126.0 P 0.043123 
12 2017-10-27 125.109375 127.0 P 0.047965 
13 2017-10-27 125.109375 128.0 P 0.047965 
14 2017-10-27 125.109375 128.5 P 0.047965 
15 2017-10-27 125.109375 129.0 P 0.047965 
16 2017-10-27 125.109375 129.5 P 0.047965 
17 2017-10-27 125.109375 130.0 P 0.047965 
18 2017-10-27 125.109375 126.5 P 0.046020 
19 2017-10-27 125.109375 122.5 P 0.045410 
20 2017-10-27 125.109375 123.5 C 0.045410 
21 2017-10-27 125.109375 127.5 C 0.047965 
22 2017-10-27 125.109375 124.5 C 0.041822 
23 2017-10-27 125.125000 125.5 C 0.041629 
24 2017-10-27 125.125000 120.5 C 0.045487 
25 2017-10-27 125.125000 121.5 C 0.045487 
26 2017-10-27 125.125000 121.0 C 0.045487 
27 2017-10-27 125.125000 122.0 C 0.045487 
28 2017-10-27 125.125000 123.0 C 0.045487 
29 2017-10-27 125.125000 124.0 C 0.043292 
..  ...   ... ... ...  ... 
70 2017-11-03 125.109375 125.0 C 0.040830 
71 2017-11-03 125.109375 126.0 C 0.042517 
72 2017-11-03 125.109375 127.0 C 0.046631 
73 2017-11-03 125.109375 128.0 C 0.046631 
74 2017-11-03 125.109375 128.5 C 0.046631 
75 2017-11-03 125.109375 129.0 C 0.046631 
76 2017-11-03 125.109375 129.5 C 0.046631 
77 2017-11-03 125.109375 130.0 C 0.046631 
78 2017-11-03 125.109375 126.5 C 0.044948 
79 2017-11-03 125.109375 122.5 C 0.044366 
80 2017-10-20 125.109375 123.5 P 0.046512 
81 2017-10-20 125.109375 127.5 P 0.048400 
82 2017-10-20 125.109375 124.5 P 0.041512 
83 2017-10-20 125.109375 125.5 P 0.042744 
84 2017-10-20 125.109375 120.5 P 0.046512 
85 2017-10-20 125.109375 121.5 P 0.046512 
86 2017-10-20 125.109375 121.0 P 0.046512 
87 2017-10-20 125.109375 122.0 P 0.046512 
88 2017-10-20 125.109375 123.0 P 0.046512 
89 2017-10-20 125.109375 124.0 P 0.044166 
90 2017-10-20 125.109375 125.0 P 0.041220 
91 2017-10-20 125.109375 126.0 P 0.045406 
92 2017-10-20 125.109375 127.0 P 0.048400 
93 2017-10-20 125.109375 128.0 P 0.048400 
94 2017-10-20 125.109375 128.5 P 0.048400 
95 2017-10-20 125.109375 129.0 P 0.048400 
96 2017-10-20 125.109375 129.5 P 0.048400 
97 2017-10-20 125.109375 130.0 P 0.048400 
98 2017-10-20 125.109375 126.5 P 0.048400 
99 2017-10-20 125.109375 122.5 P 0.046512 

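The sigma should be chosen according to the following condition: when F > K, use the sigma of the row with Type = P; otherwise use the sigma of the row with Type = C. The result I am looking for should look like this example: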

   123 123.5 124 124.5 125 125.5 126 126.5 
Expiry                   
2017-10-20 0.051 0.047 0.043 0.04 0.040 0.039 0.041 0.043 
2017-10-27 0.045 0.041 0.041 0.04 0.039 0.039 0.040 0.042 
..... 

Your help would be much appreciated.

Answer

# Option 1: flag the usable rows with apply (readable, but slow on large frames)
df['to_use'] = df.apply(lambda x: 1 if (x['F'] > x['K'] and x['Type'] == 'P') or x['Type'] == 'C' else 0, axis=1)
# Option 2: the same flag, vectorized (fastest option); it overwrites the column from option 1
df['to_use'] = (((df['F'] > df['K']) & (df['Type'] == 'P')) | (df['Type'] == 'C')).astype(int)

# Sort so that, within each (Expiry, K) pair, flagged rows come first and P-type ranks above C-type
df.sort_values(by=['Expiry', 'K', 'to_use', 'Type'], ascending=[True, True, False, False], inplace=True)

# Keep one row per (Expiry, K): the P row when F > K, otherwise the C row
new_df = df.drop_duplicates(subset=['Expiry', 'K'], keep='first')

# Present the result as a pivot table (one row per Expiry, one column per K)
new_df.pivot_table(index='Expiry', columns='K', values='sigma').reset_index()
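
To check the steps above end to end, here is a minimal sketch run on a tiny made-up frame. The four rows and their sigma values are invented for illustration (they are not taken from the question's data), and the names `ordered`, `picked` and `result` are my own. It uses the vectorized flag only and avoids `inplace=True` so the input frame keeps its original row order.

```python
import pandas as pd

# Hypothetical sample: one expiry, two strikes, one P and one C row per strike.
# The sigma values are made up so that the chosen row is easy to recognise.
df = pd.DataFrame({
    'Expiry': ['2017-10-27'] * 4,
    'F': [125.0] * 4,
    'K': [124.0, 124.0, 126.0, 126.0],
    'Type': ['P', 'C', 'P', 'C'],
    'sigma': [0.10, 0.20, 0.30, 0.40],
})

# Flag rows that may be used: P rows only when F > K, C rows always.
df['to_use'] = (((df['F'] > df['K']) & (df['Type'] == 'P')) | (df['Type'] == 'C')).astype(int)

# Within each (Expiry, K), put flagged rows first and P ahead of C.
ordered = df.sort_values(by=['Expiry', 'K', 'to_use', 'Type'],
                         ascending=[True, True, False, False])

# The first row of each (Expiry, K) group is the one the rule selects.
picked = ordered.drop_duplicates(subset=['Expiry', 'K'], keep='first')

# One row per expiry, one column per strike.
result = picked.pivot_table(index='Expiry', columns='K', values='sigma')
print(result)
```

With this sample, strike 124.0 (F > K) takes the P sigma and strike 126.0 takes the C sigma. Note that if a strike has only a P row and F <= K, the sort still leaves that P row first, so it is kept as a fallback.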

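Thank you very much. Works nicely. That said, I would probably rewrite the first line with the boolean check so that it does not use apply, which in my experience is slow. – steff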


Could you accept my solution? =) Thanks! – paveltr


Also, I added the condition. – paveltr
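
Following up on the point about avoiding apply: one possible alternative, not the method posted in the answer, is to skip the helper column and the sort, and filter with a boolean mask before pivoting. This is only a sketch. The function name `pick_sigma` and the sample values are hypothetical, and it assumes `Type` contains only 'P' and 'C'.

```python
import pandas as pd

def pick_sigma(df):
    """Keep P rows where F > K and C rows where F <= K, then pivot sigma by strike."""
    use_put = df['F'] > df['K']
    mask = (use_put & (df['Type'] == 'P')) | (~use_put & (df['Type'] == 'C'))
    return df[mask].pivot_table(index='Expiry', columns='K', values='sigma')

# Hypothetical sample with made-up sigma values.
sample = pd.DataFrame({
    'Expiry': ['2017-10-27'] * 4,
    'F': [125.0] * 4,
    'K': [124.0, 124.0, 126.0, 126.0],
    'Type': ['P', 'C', 'P', 'C'],
    'sigma': [0.10, 0.20, 0.30, 0.40],
})

table = pick_sigma(sample)
print(table)
```

Unlike the sort-and-drop-duplicates approach, this version has no fallback: a strike whose required type is missing simply does not appear (or shows NaN for that expiry). Whether that is preferable depends on how complete the data is.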