2017-10-18 44 views

Suboptimal looping over a large dataset

I have a DataFrame with a few thousand rows of artificial FX trade data. The first ten rows look like this:

[Image: first ten rows of the DataFrame]

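I want to iterate over this set and, for each row, compute CommonCurrency, which in this case will be USD. So for each row I take the CurrencyPair, DeskRate and OrderQty columns and compute CommonCurrency: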

for i in range(len(order_data)):
    if order_data['CurrencyPair'][i] == 'GBP/USD':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] * order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'AUD/USD':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] * order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'EUR/USD':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] * order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'USD/CHF':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] / order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'EUR/GBP':
        pass  # different calculation

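This doesn't seem like the right way to do it, especially if there are many different currency pairs. Another problem I run into is when I reach EUR/GBP: at that point I also need the DeskRate for EUR/USD, and I can't see how to get it with this approach.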

Any tips?

Answer

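One interesting feature of pandas is the concept of indexing. There is a more Pythonic way to do this: with loc, you can assign a Series (column) of values to just the part of the DataFrame selected by a condition: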

# Write results to 'CommonCurrency'; assigning to 'CurrencyPair' would overwrite the pair names.
order_data.loc[order_data['CurrencyPair'].isin(('GBP/USD', 'AUD/USD', 'EUR/USD')), 'CommonCurrency'] = order_data['DeskRate'] * order_data['OrderQty']
order_data.loc[order_data['CurrencyPair'] == 'USD/CHF', 'CommonCurrency'] = order_data['DeskRate'] / order_data['OrderQty']
# some_func is a placeholder for whatever the EUR/GBP calculation turns out to be.
order_data.loc[order_data['CurrencyPair'] == 'EUR/GBP', 'CommonCurrency'] = some_func(order_data['DeskRate'], order_data['OrderQty'])

This avoids any for loop.
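For a self-contained illustration, here is a minimal sketch of the same loop-free idea on toy data. It uses `numpy.select` rather than repeated `loc` assignments, which may scale more comfortably when there are many pairs. The sample values, the `EUR_USD_RATE` constant and the EUR/GBP formula (`OrderQty * EUR_USD_RATE`) are assumptions made for illustration only; the question does not say what the "different calculation" should be, and in practice the reference rate would have to come from your own data. The other formulas are copied from the question as written.

```python
import numpy as np
import pandas as pd

# Toy data: column names follow the question, values are made up.
order_data = pd.DataFrame({
    "CurrencyPair": ["GBP/USD", "AUD/USD", "USD/CHF", "EUR/GBP", "EUR/JPY"],
    "DeskRate": [1.25, 0.75, 0.5, 0.8, 130.0],
    "OrderQty": [1000.0, 2000.0, 100.0, 400.0, 50.0],
})

# Hypothetical EUR/USD reference rate (an assumption, not from the question).
EUR_USD_RATE = 1.5

pair = order_data["CurrencyPair"]
rate = order_data["DeskRate"]
qty = order_data["OrderQty"]

# One boolean mask per kind of calculation, instead of one branch per pair.
usd_quoted = pair.str.endswith("/USD").to_numpy()   # GBP/USD, AUD/USD, EUR/USD, ...
usd_based = pair.str.startswith("USD/").to_numpy()  # USD/CHF, ...
eur_gbp = (pair == "EUR/GBP").to_numpy()

# np.select takes, for each row, the value from the first choice whose
# condition is True; rows that match no condition get the default (NaN).
order_data["CommonCurrency"] = np.select(
    [usd_quoted, usd_based, eur_gbp],
    [
        rate * qty,          # formula from the question for XXX/USD pairs
        rate / qty,          # formula from the question for USD/CHF
        qty * EUR_USD_RATE,  # assumed formula for the EUR/GBP cross pair
    ],
    default=np.nan,
)

print(order_data)
```

Pairs that match none of the masks (EUR/JPY here) are left as NaN, which makes unhandled cases easy to spot with `order_data["CommonCurrency"].isna()`.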