2016-12-15 139 views

Equivalent of R's auto.arima() in Python

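In R, the auto.arima function takes the time series values as input, computes the ARIMA order parameters (p, d, q), and fits the model, so the user does not need to supply p, d, and q.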

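I want to use an equivalent of auto.arima in Python (without calling R's auto.arima) to forecast future values of a time series. For the time series below, I want to run the auto.arima equivalent on 40 points and forecast the next 6 values, then move the window forward by 1 point and repeat the same process.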
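The windowing procedure described above (fit on 40 points, forecast 6, slide by 1) can be sketched independently of the model. This is only a minimal illustration: `rolling_forecasts` and `naive_forecaster` are hypothetical names, and the naive forecaster merely stands in for whatever ARIMA fitting routine is eventually used.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def rolling_forecasts(
    series: Sequence[float],
    forecaster: Callable[[Sequence[float], int], List[float]],
    window: int = 40,
    horizon: int = 6,
    step: int = 1,
) -> Iterator[Tuple[int, List[float]]]:
    """Yield (window_start, forecast) for every full training window."""
    start = 0
    while start + window <= len(series):
        train = series[start:start + window]
        yield start, forecaster(train, horizon)
        start += step


def naive_forecaster(train: Sequence[float], horizon: int) -> List[float]:
    # Hypothetical stand-in for a fitted model: repeat the last observed value.
    return [train[-1]] * horizon


# Synthetic 45-point series (not the data from the question).
demo_series = [float(i) for i in range(45)]
results = list(rolling_forecasts(demo_series, naive_forecaster))
for start, forecast in results:
    print(start, forecast)
```

Keeping the forecaster as a plain callable means the same loop works whether the model behind it is a naive baseline, a statsmodels fit, or a call into R.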

Here is the sample data:

value 
0 
2.584751 
2.884758 
2.646735 
2.882105 
3.267503 
3.94552 
4.70788 
5.384803 
54.77972 
62.87139 
78.68957 
112.7166 
155.0074 
170.8084 
196.1941 
237.4928 
254.9718 
175.0717 
217.3807 
244.7357 
274.4517 
304.6838 
373.3202 
345.6252 
461.2653 
443.5982 
472.3653 
469.3326 
506.8819 
532.1639 
542.2837 
514.9269 
528.0194 
540.539 
542.7031 
556.8262 
569.7132 
576.2339 
577.7212 
577.0873 
569.6199 
573.2445 
573.7825 
589.3506 

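I tried writing functions that determine the order of differencing using the AD Fuller test, and then pass the differenced time series (the original series, differenced until it becomes stationary according to the adfuller test result) to the ARMA order selection function to compute the p and q order values.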

I then pass these values to the ARIMA function in Statsmodels. But the function does not seem to work.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # needed for the plt.plot calls at the end
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import acf, pacf

def diff_terms(timeseries): 
    i=1 
    j=0 
    while i != 0: 
     dftest = adfuller(timeseries, autolag='AIC') 
     if dftest[0] <= dftest[4]["5%"]: 
      i = 0 
     else: 
      timeseries = np.diff(timeseries) 
      i = 1 
      j = j + 1 
    return j 

def p_q_values_estimator(timeseries): 
    p=0 
    q=0 
    lag_acf = acf(timeseries, nlags=20) 
    lag_pacf = pacf(timeseries, nlags=20, method='ols') 
    y=1.96/np.sqrt(len(timeseries)) 

    if lag_acf[0] < y: 
     for a in lag_acf: 
      if a < y: 
       q = q + 1 
       break 
    elif lag_acf[0] > y: 
     for c in lag_acf: 
      if c > y: 
       q = q + 1 
       break 

    if lag_pacf[0] < y: 
     for b in lag_pacf: 
      if b < y: 
       p = p + 1 
       break 
    elif lag_pacf[0] > y: 
     for d in lag_pacf: 
      if d > y: 
       p = p + 1 
       break 

    p_q=[p,q] 
    return(p_q) 

def p_q_values_estimator2(timeseries): 
    res = sm.tsa.arma_order_select_ic(timeseries, ic=['aic'], max_ar=5, max_ma=4,trend='nc') 
    return res.aic_min_order 

data1=[] 
data=pd.read_csv('ABC.csv') 
d_value=diff_terms(data.value) 
data1[:]=data[:] 
data = data[0:40] 

i=0 
while i < d_value: 
    data_diff = np.diff(data) 
    i = i+1 

p_q_values=p_q_values_estimator(data) 
p_value=p_q_values[0] 
q_value=p_q_values[1] 

p_q_values2=p_q_values_estimator2(data_diff) 
p_value2=p_q_values2[0] 
q_value2=p_q_values2[1] 


exogx = np.array(range(0,40)) 
fit2 = sm.tsa.ARIMA(np.array(data), (p_value, d_value, q_value), exog = exogx).fit() 
print(fit2.fittedvalues) 
pred2 = fit2.predict(start = 40, end = 45, exog = np.array(range(40,46))) 
print(pred2) 
plt.plot(fit2.fittedvalues) 
plt.plot(np.array(data)) 
plt.plot(range(40,45), np.array(pred2)) 
plt.show() 

Error - when using ARMA order selection

p_q_values2=p_q_values_estimator2(data_diff) 
line 56, in p_q_values_estimator2 
res = sm.tsa.arma_order_select_ic(timeseries, ic=['aic'], max_ar=5, max_ma=4,trend='nc') 
File "C:\Python27\lib\site-packages\statsmodels\tsa\stattools.py", line 1052, in arma_order_select_ic
min_res.update({i + '_min_order' : (mins[0][0], mins[1][0])})
IndexError: index 0 is out of bounds for axis 0 with size 0 
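One possible contributor to this error, assuming ABC.csv holds a single value column so that data[0:40] is a 40x1 table: np.diff differences along the last axis by default, so a 40x1 input yields an empty 40x0 result, which would leave the order selection with nothing to fit. The loop in the question also differences data on every pass rather than the previous data_diff, so d > 1 would never be applied. A small sketch of the axis behaviour, using a synthetic stand-in array:

```python
import numpy as np

# Stand-in for a 40-row, single-column table (synthetic values).
column = np.arange(40, dtype=float).reshape(40, 1)

across_columns = np.diff(column)         # default axis=-1: only one column, so nothing is left
down_rows = np.diff(column, axis=0)      # differences between consecutive rows
flat_twice = np.diff(column.ravel(), n=2)  # 1-D series differenced twice in one call

print(across_columns.shape)  # (40, 0)
print(down_rows.shape)       # (39, 1)
print(flat_twice.shape)      # (38,)
```

Flattening the column to a 1-D array before differencing (for example with data.value.values) avoids the empty result, and the n argument applies repeated differencing without a manual loop.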

Error - when computing the p, q orders with the ACF/PACF-based function

fit2 = sm.tsa.ARIMA(np.array(data), (p_value, d_value, q_value), exog = exogx).fit() 
File "C:\Python27\lib\site-packages\statsmodels\tsa\arima_model.py", line 1104, in fit 
callback, **kwargs) 
File "C:\Python27\lib\site-packages\statsmodels\tsa\arima_model.py", line 942, in fit 
armafit.mle_retvals = mlefit.mle_retvals 
AttributeError: 'LikelihoodModelResults' object has no attribute 'mle_retvals' 

Have you seen this: [auto.arima() equivalent for python](http://stackoverflow.com/questions/22770352/auto-arima-equivalent-for-python) –


Yes, but even that approach leads to the same error. AttributeError: 'LikelihoodModelResults' object has no attribute 'mle_retvals'. – user245204

Answers


vals is my own thing, but you can create your own index with pd.date_range.

rdata=ts(traindf.requests_per_active.values,frequency=12) 
#forecasts 
fit=forecast.auto_arima(rdata) 
forecast_output=forecast.forecast(fit,h=6,level=(95.0)) 
#convert forecasts to dataframe  
forecast_results=pd.Series(forecast_output[3], index=vals.index) 
lowerpi=pd.Series(forecast_output[4], index=vals.index) 
upperpi=pd.Series(forecast_output[5], index=vals.index) 
results = pd.DataFrame({'forecast' : forecast_results, 'lowerpi' : lowerpi, 'upperpi' : upperpi}) 
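Since vals is not defined in the answer, here is one way the forecast index might be built with pd.date_range, as the answer suggests. The last observed date, the monthly frequency (guessed from frequency=12), and the placeholder forecast values are all assumptions made for illustration.

```python
import pandas as pd

horizon = 6
last_observed = pd.Timestamp("2016-06-01")  # hypothetical last training month

# Six month-start timestamps following the last observation.
future_index = pd.date_range(
    start=last_observed + pd.DateOffset(months=1),
    periods=horizon,
    freq="MS",
)

point_forecasts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # placeholder values, not real output
forecast_results = pd.Series(point_forecasts, index=future_index)
print(forecast_results)
```

The same future_index can be reused for the lower and upper prediction-interval series so that all three line up in one DataFrame.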

Is the forecast module also your own module? Or did you download it from some repository? – user6608138


My bad - use rpy2 and importr to import the "forecast" package from Python – thedon


Thanks for the reply... but how can I use the auto_arima output with the stats package in Python? Have you observed any improvement in RMSE when using the same R APIs from Python? – user6608138