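Find all elements in a DataFrame column that start with a string

I am currently using the following code to roll up numbers. For each element in the DataFrame I set up some conditions to sum on, but it is the slowest part of creating the report. Is there a faster way to identify all the elements in a DataFrame that start with a particular string?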
for idx, eachRecord in attributionCalcDF.T.iteritems():
    if (attributionCalcDF['SEC_ID'].ix[idx] == 0):
        currentGroup = lambda x: str(x).startswith(attributionCalcDF['GROUP_LIST'].ix[idx])
        currentGroupArray = attributionCalcDF['GROUP_LIST'].map(currentGroup)
        attributionCalcDF['ROLLUP_DAILY_TIMING_IMPACT'].ix[idx] = (
            attributionCalcDF['DAILY_TIMING_IMPACT'][(attributionCalcDF['SEC_ID'] != 0) &
                                                     (currentGroupArray) &
                                                     (attributionCalcDF['START_DATE'] == attributionCalcDF['START_DATE'].ix[idx])].sum())
        attributionCalcDF['ROLLUP_DAILY_STOCK_TO_GROUP_IMPACT'].ix[idx] = (
            attributionCalcDF['DAILY_STOCK_TO_GROUP_IMPACT'][(attributionCalcDF['SEC_ID'] != 0) &
                                                             (currentGroupArray) &
                                                             (attributionCalcDF['START_DATE'] == attributionCalcDF['START_DATE'].ix[idx])].sum())
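
For comparison, here is a rough sketch of how the same rollup might be written with pandas' vectorized string methods. It is not taken from any answer in this thread. It assumes a pandas version that has `Series.str.startswith`, a unique index, and the column names used in the code above. `rollup_by_prefix` is a hypothetical helper name. It still loops over the group rows (`SEC_ID == 0`), but it filters the security rows once per `START_DATE` and replaces the per-row `map` of a Python lambda with a single `.str.startswith` call. It also avoids `.ix` and `iteritems`, which were removed in later pandas releases.

```python
import pandas as pd


def rollup_by_prefix(df, value_cols):
    """Sum value_cols over security rows into each group row.

    Hypothetical helper: a group row (SEC_ID == 0) collects every security
    row (SEC_ID != 0) with the same START_DATE whose GROUP_LIST starts with
    the group row's GROUP_LIST. Assumes df has a unique index.
    """
    out = df.copy()
    is_group = out["SEC_ID"] == 0
    securities = out[~is_group]

    # Rollup columns start as NaN; only group rows get filled in.
    for col in value_cols:
        out["ROLLUP_" + col] = float("nan")

    # Handle one START_DATE at a time so the date filter runs once per date
    # instead of once per group row.
    for date, groups in out[is_group].groupby("START_DATE"):
        secs = securities[securities["START_DATE"] == date]
        sec_groups = secs["GROUP_LIST"].astype(str)
        for idx, prefix in groups["GROUP_LIST"].astype(str).items():
            # Vectorized prefix test over all security rows for this date.
            mask = sec_groups.str.startswith(prefix)
            for col in value_cols:
                out.loc[idx, "ROLLUP_" + col] = secs.loc[mask, col].sum()
    return out
```

With the frame from the question, a call along the lines of `rollup_by_prefix(attributionCalcDF, ['DAILY_TIMING_IMPACT', 'DAILY_STOCK_TO_GROUP_IMPACT'])` would return a copy with the two `ROLLUP_` columns filled for the group rows. Whether this is actually faster on the real data would need to be timed.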
Thanks very much, Wes. That cut the time for this function by 40%. Vectorized string functions? ... Really looking forward to it!