2016-10-21 161 views

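Python - Calculating a weighted rolling standard deviation

I have a pandas DataFrame df of one-minute data. I want to apply weights to the returns and compute a weighted rolling standard deviation with window = 10. I can already compute the unweighted std, annualized: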

df_spy['10mVol'] = df_spy['Return'].rolling(center=False,window=10).std()*(1440*252)**(0.5)*100 

There is another question about a weighted std in NumPy (Weighted standard deviation in NumPy?), but I am curious about a rolling weighted stdev.

The formula used to compute the weighted standard deviation is: https://math.stackexchange.com/questions/320441/standard-deviation-of-the-weighted-mean
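For reference, a commonly used form of the weighted sample standard deviation is written out below. This assumes the linked formula is the usual one with N nonzero weights; it is the form that the code in the answer computes.

```latex
\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i},
\qquad
\sigma_w = \sqrt{\frac{\sum_{i=1}^{N} w_i \,(x_i - \bar{x}_w)^2}
                      {\frac{N-1}{N}\sum_{i=1}^{N} w_i}}
```

Here x_i are the values in the window, w_i > 0 their weights, and N the number of (nonzero) weights. With all w_i equal, this reduces to the ordinary sample standard deviation with N - 1 in the denominator.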

weighting  Midpoint  Return     10mVol  Weighted
0.2        215.6700  NaN        NaN     NaN
0.8        215.8400  -0.000788  NaN     -0.000630
0.8        216.0600  -0.001019  NaN     -0.000815

Thanks for your help.

Answer


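As I understand it, the function chained after the rolling method must take an array and return a single number. That function is evaluated once per window. So, if we have a function that computes a weighted std, we can wrap it in a lambda to get a rolling weighted std. Here is my attempt. (I hope I have not made a mistake in implementing the weighted-std formula you linked.)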

import pandas as pd 
import numpy as np 


def weighted_std(values, weights): 
    # For simplicity, assume len(values) == len(weights) 
    # assume all weights > 0 
    sum_of_weights = np.sum(weights) 
    weighted_average = np.sum(values * weights)/sum_of_weights 
    n = len(weights) 
    numerator = np.sum(n * weights * (values - weighted_average) ** 2.0) 
    denominator = (n - 1) * sum_of_weights 
    weighted_std = np.sqrt(numerator/denominator) 
    return weighted_std 


def rolling_std(s, weights): 
    window_size = len(weights) 
    return s.rolling(center=False, window=window_size).apply(lambda win: weighted_std(win, weights), raw=True)

s = pd.Series(np.random.random([10])) # generate random data 
w = np.array([1., 3., 5.]) # choose weights 
print(s.values) 
print(rolling_std(s, w).values) 

Sample output:

[ 0.08101966 0.57133241 0.29491028 0.25139964 0.26151065 0.45768199 
    0.94459935 0.21534497 0.35999294 0.60242746] 
[  nan   nan 0.19701963 0.11936639 0.01539041 0.12097725 
    0.33346742 0.40784167 0.25884732 0.17709334] 
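As a quick sanity check on the formula, the sketch below restates weighted_std from the code above (with input conversion added as an assumption) and compares it against two reference points: the ordinary sample standard deviation for equal weights, and the first non-NaN value in the sample output.

```python
import numpy as np


def weighted_std(values, weights):
    # Same formula as in the answer's code; assumes all weights > 0
    # and len(values) == len(weights).
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = len(weights)
    mean = np.sum(values * weights) / np.sum(weights)
    numerator = np.sum(n * weights * (values - mean) ** 2)
    denominator = (n - 1) * np.sum(weights)
    return np.sqrt(numerator / denominator)


# First three values of the sample data shown above.
x = np.array([0.08101966, 0.57133241, 0.29491028])

# Equal weights: should agree with the unweighted sample std (ddof=1).
print(np.isclose(weighted_std(x, np.ones(3)), np.std(x, ddof=1)))

# Weights from the answer: should be close to 0.19701963 from the sample output.
print(weighted_std(x, np.array([1., 3., 5.])))
```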

Here, lambda win: weighted_std(win, weights) is a function that takes an array as input and returns a single number.
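To illustrate the array-in, number-out contract on its own, here is a minimal sketch with a made-up series and a made-up reducer (max minus min). It assumes a pandas version that supports the raw argument of apply, which passes each window as a NumPy array.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 2.0, 8.0, 5.0])  # illustrative data

# Any function that maps one window (a 1-D array) to one number can be applied.
window_range = s.rolling(window=3).apply(lambda win: win.max() - win.min(), raw=True)

# The first window - 1 entries are NaN because the window is not yet full.
print(window_range.tolist())  # [nan, nan, 3.0, 6.0, 6.0]
```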


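Thanks for the quick feedback. When I change window_size to something shorter than len(weights), the function fails with "operands could not be broadcast together with shapes (5864,) (10,)". Otherwise, it outputs NaN for all values. – yusica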
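The broadcast error in this comment looks like what one would expect when a full weight column (length 5864) is multiplied against a 10-element window. Below is a minimal sketch of one way to handle per-row weights, assuming every row carries its own positive weight and that NumPy 1.20 or later is available for sliding_window_view. The function name rolling_weighted_std, the output column name, and the demo values past the third row are hypothetical.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_weighted_std(values, weights, window):
    """Rolling weighted std where each observation has its own weight.

    Uses the same formula as weighted_std in the answer; assumes weights > 0.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    out = np.full(x.shape, np.nan)
    if len(x) < window:
        return out
    xw = sliding_window_view(x, window)  # shape: (len(x) - window + 1, window)
    ww = sliding_window_view(w, window)  # weights aligned with each window
    w_sum = ww.sum(axis=1)
    mean = (xw * ww).sum(axis=1) / w_sum
    numerator = window * (ww * (xw - mean[:, None]) ** 2).sum(axis=1)
    denominator = (window - 1) * w_sum
    out[window - 1:] = np.sqrt(numerator / denominator)
    return out


# Illustrative frame; only the first three rows come from the question.
df = pd.DataFrame({
    "Return": [np.nan, -0.000788, -0.001019, 0.000512, -0.000231, 0.000904],
    "weighting": [0.2, 0.8, 0.8, 0.5, 0.3, 0.6],
})
df["3mWeightedVol"] = rolling_weighted_std(df["Return"], df["weighting"], window=3)
print(df)
```

Any window that contains a NaN return yields NaN, which matches how the unweighted rolling std behaves by default. The same annualization factor used in the question could be applied to the result afterwards.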