I would like to contribute to this thread an algorithm I have developed myself:
It is based on the principle of dispersion: if a new data point is a given x number of standard deviations away from some moving mean, the algorithm signals (this is also called the z-score). The algorithm is very robust because it constructs a separate moving mean and deviation, such that signals do not corrupt the threshold. Future signals are therefore identified with approximately the same accuracy, regardless of the number of previous signals. The algorithm takes 3 inputs:

lag = the lag of the moving window
threshold = the z-score at which the algorithm signals
influence = the influence (between 0 and 1) of new signals on the mean and standard deviation

For example, a lag of 5 will use the last 5 observations to smooth the data. A threshold of 3.5 will signal if a data point is 3.5 standard deviations away from the moving mean. An influence of 0.5 gives new signals half of the influence that normal data points have. Likewise, an influence of 0 ignores signals completely when recalculating the new threshold; an influence of 0 is therefore the most robust option.
It works as follows:
Pseudocode
# Let y be a vector of timeseries data of at least length lag+2
# Let mean() be a function that calculates the mean
# Let std() be a function that calculates the standard deviation
# Let absolute() be the absolute value function

# Settings (the ones below are examples: choose what is best for your data)
set lag to 5;          # lag 5 for the smoothing functions
set threshold to 3.5;  # 3.5 standard deviations for signal
set influence to 0.5;  # between 0 and 1, where 1 is normal influence, 0.5 is half

# Initialise variables
set signals to vector 0,...,0 of length of y;   # Initialise signal results
set filteredY to y(1,...,lag);                  # Initialise filtered series
set avgFilter to null;                          # Initialise average filter
set stdFilter to null;                          # Initialise std. filter
set avgFilter(lag) to mean(y(1,...,lag));       # Initialise first value
set stdFilter(lag) to std(y(1,...,lag));        # Initialise first value

for i=lag+1,...,t do
  if absolute(y(i) - avgFilter(i-1)) > threshold*stdFilter(i-1) then
    if y(i) > avgFilter(i-1) then
      set signals(i) to +1;                     # Positive signal
    else
      set signals(i) to -1;                     # Negative signal
    end
    # Adjust the filters: signals only enter with weight `influence`
    set filteredY(i) to influence*y(i) + (1-influence)*filteredY(i-1);
    set avgFilter(i) to mean(filteredY(i-lag+1,...,i));
    set stdFilter(i) to std(filteredY(i-lag+1,...,i));
  else
    set signals(i) to 0;                        # No signal
    # Adjust the filters with the raw data point
    set filteredY(i) to y(i);
    set avgFilter(i) to mean(filteredY(i-lag+1,...,i));
    set stdFilter(i) to std(filteredY(i-lag+1,...,i));
  end
end
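For reference, here is a minimal Python sketch of the pseudocode above (it assumes NumPy is available; the function name thresholding_algo and the return structure are only illustrative):

import numpy as np

def thresholding_algo(y, lag, threshold, influence):
    signals = np.zeros(len(y))
    filteredY = np.array(y, dtype=float)
    avgFilter = np.zeros(len(y))
    stdFilter = np.zeros(len(y))
    avgFilter[lag - 1] = np.mean(y[0:lag])   # Initialise first value
    stdFilter[lag - 1] = np.std(y[0:lag])    # Initialise first value
    for i in range(lag, len(y)):
        if abs(y[i] - avgFilter[i - 1]) > threshold * stdFilter[i - 1]:
            signals[i] = 1 if y[i] > avgFilter[i - 1] else -1   # Positive / negative signal
            # Signals only enter the filtered series with weight `influence`
            filteredY[i] = influence * y[i] + (1 - influence) * filteredY[i - 1]
        else:
            signals[i] = 0                                       # No signal
            filteredY[i] = y[i]
        # Recompute the moving mean and deviation over the last `lag` filtered points
        avgFilter[i] = np.mean(filteredY[i - lag + 1:i + 1])
        stdFilter[i] = np.std(filteredY[i - lag + 1:i + 1])
    return signals, avgFilter, stdFilter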
Demo
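As a rough usage illustration (the data below is synthetic and chosen only to contain one obvious peak; it is not the demo data from the original post):

import numpy as np

np.random.seed(0)
y = np.concatenate([
    np.random.normal(1.0, 0.1, 50),   # flat, noisy baseline
    [4.0, 4.5, 4.2],                  # an obvious peak
    np.random.normal(1.0, 0.1, 50),   # baseline again
])

signals, avgFilter, stdFilter = thresholding_algo(y, lag=5, threshold=3.5, influence=0.5)
print(np.nonzero(signals)[0])         # indices flagged as +1 or -1 signals

With these settings the jump into the peak should be flagged with +1 signals almost immediately, since the first peak samples lie far more than 3.5 moving standard deviations above the moving mean of the baseline.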
Wow, thanks for all of this info. I have been winging these strategies (badly) and had worked around my peak-detection problem, but I will take a closer look at point 1. Thanks for all the great material. John. – 2010-01-26 10:19:33