2014-12-04 49 views
1

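Error in R arima: too few non-missing observations

I am using arima() and auto.arima() in R to forecast sales. The data is weekly and covers three years.

My code is as follows: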

x<-c(1571,1501,895,1335,2306,930,2850,1380,975,1080,990,765,615,585,838,555,1449,615,705,465,165,630,330,825,555,720,615,360,765,1080,825,525,885,507,884,1230,342,615,1161, 1585,723,390,690,993,1025,1515,903,990,1510,1638,1461.67,1082,1075,2315,1014,2140,1572,794,1363,1184,1248,1344,1056,816,720,896,608,624,560,512,304,640,640,704,1072,768, 816,640,272,1168,736,1003,864,658.67,768,841,1727,944,848,432,704,850.67,1205,592,1104,976,629,814,1626,933.33,1100.33,1730,2742,1552,1038,826,1888,1440,1372,824,1824,1392,1424,768,464, 960,320,384,512,478,1488,384,338.67,176,624,464,528,592,288,544,418.67,336,752,400,1232,477.67,416,810.67,1256,1040,823,240,1422,704,718,1193,1541,1008,640,752, 1008,864,1507,4123,2176,899,1717,935)

length_data<-length(x)

length_train<-round(length_data*0.80)

forecast_period<-length_data-length_train

train_data<-x[1:length_train]

train_data<-ts(train_data,frequency=52,start=c(1,1))

validation_data<-x[(length_train+1):length_data]

validation_data<-ts(validation_data,frequency=52,start=c(ceiling((length_train)/52),((length_train)%%52+1)))

library(forecast) # provides auto.arima() and Arima()

arima_output<-auto.arima(train_data) # fit the ARIMA Model

arima_validate <- Arima(x=validation_data,model=arima_output)

Error:

Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, :

too few non-missing observations

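Am I doing something wrong? What does "too few non-missing observations" mean? I have searched for it but did not find a good explanation.

Thanks for any help!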

+0

Have you read the comments on your post on Cross Validated (http://stats.stackexchange.com/questions/126624/)? – 2014-12-04 16:00:31

+0

I have edited the question. I hope I have covered the required/relevant information. Let me know if I am missing something. – Arushi 2014-12-04 18:19:01

Answers

1

arima_output is a seasonal ARIMA model:

> arima_output 
Series: train_data 
ARIMA(1,0,1)(0,1,0)[52] 

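Arima() then tries to refit this particular model to validation_data. However, to fit a seasonal model to a time series you need more than one full seasonal period of observations (here, more than 52 weeks), because seasonal ARIMA relies on seasonal differencing.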

As an illustration, note that Arima() will happily and without error refit the model to a time series that is twice as long as validation_data:

validation_data <- x[(length_train+1):length_data] 
validation_data<-ts(rep(validation_data,2),frequency=52, 
    start=c(ceiling((length_train)/52),((length_train)%%52+1))) 
arima_validate <- Arima(x=validation_data,model=arima_output) 
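
The arithmetic behind both cases can be checked in base R. This is a minimal sketch: the vectors hold placeholder values rather than the sales data, and the variable names are made up for illustration. Only the lengths matter, because a lag-52 difference consumes the first 52 observations.

```r
# 32 hold-out points, as in the question (the values are placeholders)
validation_len <- 32
seasonal_lag <- 52

short <- seq_len(validation_len)  # same length as validation_data
doubled <- rep(short, 2)          # same length as the doubled series

# diff() returns an empty vector when the lag is not smaller than the length
n_short <- length(diff(short, lag = seasonal_lag))
n_doubled <- length(diff(doubled, lag = seasonal_lag))

cat("observations left after seasonal differencing:", n_short, "and", n_doubled, "\n")
```

With nothing left after differencing the 32-point series, there is no data for the ARMA part to be fitted to, which matches the "too few non-missing observations" error.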

One way to deal with this problem is to force auto.arima() to use a model without seasonal differencing, by specifying D=0:

validation_data <- x[(length_train+1):length_data] 
validation_data<-ts(validation_data,frequency=52, 
    start=c(ceiling((length_train)/52),((length_train)%%52+1))) 
arima_output<-auto.arima(train_data, D=0) # fit the ARIMA Model 
arima_validate <- Arima(x=validation_data,model=arima_output) 

So this is really more of a Cross Validated question...

+0

Thanks Stephan ... I will take care of this next time, and thank you for your response ... – Arushi 2014-12-05 06:36:30

1

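The model you selected is ARIMA(1,0,1)(0,1,0)[52]. That is, it has a seasonal difference at lag 52. Your validation data has 32 observations. Therefore, you cannot compute the seasonal difference on the validation data without knowing what the training data are.

One way to solve this is to fit the model to the full time series and then extract what you want (presumably the residuals from the validation portion).

You can also improve the readability of your code: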

x <- ts(x, frequency=52, start=c(1,1)) 
length_data <- length(x) 
length_train <- round(length_data*0.80) 
train_data <- ts(head(x, length_train), 
        frequency=frequency(x), start=start(x)) 
validation_data <- ts(tail(x, length_data-length_train), 
        frequency=frequency(x), end=end(x)) 

library(forecast) 
arima_train <- auto.arima(train_data) 
arima_full <- Arima(x, model=arima_train) 
res <- window(residuals(arima_full), start=start(validation_data)) 
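
The same idea can be sketched without the forecast package, using only stats::arima() and its fixed argument. This is an illustration under stated assumptions, not the code above: the series is synthetic, the model is a simpler AR(1) with one seasonal difference (chosen so the example fits quickly), and names such as holdout_res and rmse are made up here.

```r
# Synthetic weekly series (illustrative data, not the sales figures)
set.seed(1)
n <- 158
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = 100))
y <- ts(1000 + 200 * sin(2 * pi * seq_len(n) / 52) + noise, frequency = 52)

n_train <- round(n * 0.80)
train <- ts(y[1:n_train], frequency = 52, start = start(y))

# Estimate the coefficients on the training part only
seasonal_spec <- list(order = c(0, 1, 0), period = 52)
fit_train <- arima(train, order = c(1, 0, 0), seasonal = seasonal_spec,
                   method = "ML")

# Apply the same model to the full series with the coefficients held fixed,
# so nothing is re-estimated from the hold-out data
fit_full <- arima(y, order = c(1, 0, 0), seasonal = seasonal_spec,
                  fixed = coef(fit_train), transform.pars = FALSE,
                  method = "ML")

# One-step-ahead residuals that fall in the hold-out period
holdout_res <- tail(as.numeric(residuals(fit_full)), n - n_train)
rmse <- sqrt(mean(holdout_res^2))
cat("hold-out observations:", length(holdout_res), "\n")
```

Because the coefficients are fixed, the hold-out residuals behave like one-step-ahead forecast errors, while the seasonal difference can still reach back into the training data.
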
+0

Thank you very much! This is helpful... – Arushi 2014-12-05 06:35:38