2017-03-29 28 views
1
>>> print(df) 
     out_0   out_1    b  c  e  d 
0 out_GFD_994 out_GFD_Part2 NSE_FO_BHP_1703 -27022 361.45 -13200 
1 out_GFD_994 out_GFD_Part2 NSE_FO_BHP_1704 30132 399.98 13200 
2 out_GFD_994 out_GFD_Part2 NSE_FO_MQG_1703 -145173 1177.97 -11200 
3 out_GFD_994 out_GFD_Part2 NSE_FO_MQG_1704 158747 1180.55 12300 
4 out_GFD_994 out_GFD_Part2 NSE_FO_TLS_1703 23316 1247.50 3200 
5 out_GFD_994 out_GFD_Part2 NSE_FO_TLS_1704 -19333 1253.62 -3200 

需要在列中的每個密鑰文件映像B到值 如果在文件密鑰LONG含有作爲子字符串(例如NSE_FO_BHP_1703在文件匹配到NSE_FO_BHP_1703_ *),然後減去在列d值 如果文件中的密鑰包含SHORT作爲子串,則在列d中添加值Map來大熊貓子密鑰

使得NSE_FO_BHP_1703_MAXLONGPOS = 200000 changes to 213200更改NSE_FO_BHP_1703_MAXSHORTPOS to 186800。分別更改MAXLONGEXPOSURE and MAXSHORTEXPOSURE to 263200 and 236800。 這將在df的b列中的每個鍵完成。

NSE_FO_BHP_1703_MAXLONGPOS = 200000 
NSE_FO_BHP_1703_MAXSHORTPOS = 200000 
NSE_FO_BHP_1703_MAXLONGEXPOSURE = 250000 
NSE_FO_BHP_1703_MAXSHORTEXPOSURE = 250000 
NSE_FO_BHP_1704_MAXLONGPOS = 200000 
NSE_FO_BHP_1704_MAXSHORTPOS = 200000 
NSE_FO_BHP_1704_MAXLONGEXPOSURE = 250000 
NSE_FO_BHP_1704_MAXSHORTEXPOSURE = 250000 
NSE_FO_TLS_1703_MAXLONGPOS = 100000 
NSE_FO_TLS_1703_MAXSHORTPOS = 100000 
NSE_FO_TLS_1703_MAXLONGEXPOSURE = 200000 
NSE_FO_TLS_1703_MAXSHORTEXPOSURE = 200000 
NSE_FO_TLS_1704_MAXLONGPOS = 100000 
NSE_FO_TLS_1704_MAXSHORTPOS = 100000 
NSE_FO_TLS_1704_MAXLONGEXPOSURE = 200000 
NSE_FO_TLS_1704_MAXSHORTEXPOSURE = 200000 
NSE_FO_MQG_1703_MAXLONGPOS = 300000 
NSE_FO_MQG_1703_MAXSHORTPOS = 300000 
NSE_FO_MQG_1703_MAXLONGEXPOSURE = 400000 
NSE_FO_MQG_1703_MAXSHORTEXPOSURE = 400000 
NSE_FO_DEF_1704_MAXLONGPOS = 300000 
NSE_FO_MQG_1704_MAXSHORTPOS = 300000 
NSE_FO_MQG_1704_MAXLONGEXPOSURE = 400000 
NSE_FO_MQG_1704_MAXSHORTEXPOSURE = 400000 
+0

這裏有一個提示:'5 + -4 == 5-4' –

回答

1

我認爲,如果所有的數據都在對你可以使用 - *LONG**SHORT*

from pandas.compat import StringIO 

temp=u""" 
NSE_FO_BHP_1703_MAXLONGPOS = 200000 
NSE_FO_BHP_1703_MAXSHORTPOS = 200000 
NSE_FO_BHP_1703_MAXLONGEXPOSURE = 250000 
NSE_FO_BHP_1703_MAXSHORTEXPOSURE = 250000 
NSE_FO_BHP_1704_MAXLONGPOS = 200000 
NSE_FO_BHP_1704_MAXSHORTPOS = 200000 
NSE_FO_BHP_1704_MAXLONGEXPOSURE = 250000 
NSE_FO_BHP_1704_MAXSHORTEXPOSURE = 250000 
NSE_FO_TLS_1703_MAXLONGPOS = 100000 
NSE_FO_TLS_1703_MAXSHORTPOS = 100000 
NSE_FO_TLS_1703_MAXLONGEXPOSURE = 200000 
NSE_FO_TLS_1703_MAXSHORTEXPOSURE = 200000 
NSE_FO_TLS_1704_MAXLONGPOS = 100000 
NSE_FO_TLS_1704_MAXSHORTPOS = 100000 
NSE_FO_TLS_1704_MAXLONGEXPOSURE = 200000 
NSE_FO_TLS_1704_MAXSHORTEXPOSURE = 200000 
NSE_FO_MQG_1703_MAXLONGPOS = 300000 
NSE_FO_MQG_1703_MAXSHORTPOS = 300000 
NSE_FO_MQG_1703_MAXLONGEXPOSURE = 400000 
NSE_FO_MQG_1703_MAXSHORTEXPOSURE = 400000 
NSE_FO_DEF_1704_MAXLONGPOS = 300000 
NSE_FO_MQG_1704_MAXSHORTPOS = 300000 
NSE_FO_MQG_1704_MAXLONGEXPOSURE = 400000 
NSE_FO_MQG_1704_MAXSHORTEXPOSURE = 400000""" 
#after testing replace 'StringIO(temp)' to 'filename.csv' 
df2 = pd.read_csv(StringIO(temp), sep="\s+=\s+", engine='python', names=['A','B']) 
#create dict for map 
d = df1.set_index('b')['d'].to_dict() 
#split by last _ and map 
a = df2.A.str.rsplit('_', n=1).str[0].map(d) 
#if LONG substract, else add 
df2['C'] = np.where(df2.A.str.contains('LONG'), df2.B - a, df2.B + a) 
print (df2) 
            A  B   C 
0   NSE_FO_BHP_1703_MAXLONGPOS 200000 213200.0 
1  NSE_FO_BHP_1703_MAXSHORTPOS 200000 186800.0 
2 NSE_FO_BHP_1703_MAXLONGEXPOSURE 250000 263200.0 
3 NSE_FO_BHP_1703_MAXSHORTEXPOSURE 250000 236800.0 
4   NSE_FO_BHP_1704_MAXLONGPOS 200000 186800.0 
5  NSE_FO_BHP_1704_MAXSHORTPOS 200000 213200.0 
6 NSE_FO_BHP_1704_MAXLONGEXPOSURE 250000 236800.0 
7 NSE_FO_BHP_1704_MAXSHORTEXPOSURE 250000 263200.0 
8   NSE_FO_TLS_1703_MAXLONGPOS 100000 96800.0 
9  NSE_FO_TLS_1703_MAXSHORTPOS 100000 103200.0 
10 NSE_FO_TLS_1703_MAXLONGEXPOSURE 200000 196800.0 
11 NSE_FO_TLS_1703_MAXSHORTEXPOSURE 200000 203200.0 
12  NSE_FO_TLS_1704_MAXLONGPOS 100000 103200.0 
13  NSE_FO_TLS_1704_MAXSHORTPOS 100000 96800.0 
14 NSE_FO_TLS_1704_MAXLONGEXPOSURE 200000 203200.0 
15 NSE_FO_TLS_1704_MAXSHORTEXPOSURE 200000 196800.0 
16  NSE_FO_MQG_1703_MAXLONGPOS 300000 311200.0 
17  NSE_FO_MQG_1703_MAXSHORTPOS 300000 288800.0 
18 NSE_FO_MQG_1703_MAXLONGEXPOSURE 400000 411200.0 
19 NSE_FO_MQG_1703_MAXSHORTEXPOSURE 400000 388800.0 
20  NSE_FO_DEF_1704_MAXLONGPOS 300000  NaN <-no data in df1 - map return NaN 
21  NSE_FO_MQG_1704_MAXSHORTPOS 300000 312300.0 
22 NSE_FO_MQG_1704_MAXLONGEXPOSURE 400000 387700.0 
23 NSE_FO_MQG_1704_MAXSHORTEXPOSURE 400000 412300.0 
+0

'NaN <-no data in df1 - map return NaN'如果c col值變成NaN,我們可以替換b col值 – pythonRcpp

+0

是的,使用'fillna' - 'df2 ['C'] = np.where(df2.A.str .contains('LONG'),df2.B - a,df2.B + a).fillna(0)' – jezrael

+0

AttributeError:'numpy.ndarray'o bject沒有屬性'fillna' – pythonRcpp