2017-04-25 99 views

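I have a vector of many time series model names, shown below. Given the naming pattern of its elements, how can I extract the model information from the names?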

[1] "ARIMA(2,1,0) with drift" "ARIMA(2,0,0) with non-zero mean" "ARIMA(2,0,0) with non-zero mean" "ARIMA(2,0,0) with non-zero mean" "ARIMA(0,0,1) with non-zero mean" 

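Each name consists of five different parts:

1) Model name: there is always a model name before the parentheses. Here it is "ARIMA" (ARIMA, short for autoregressive integrated moving average, is a forecasting technique that projects the future values of a series based entirely on its own inertia).

2) Autoregressive part (the AR order, called "p"): the first number after the opening parenthesis, before the first comma. For the vector shown above, the AR values are 2, 2, 2, 2, 0.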

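3) Differencing part (called "d"): the second element inside the parentheses, after the first comma, is the order of differencing. In this example the values are 1, 0, 0, 0, 0.

4) Moving average part (the MA order, called "q"): the last element inside the parentheses. In this example the values are 0, 0, 0, 0, 1.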

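5) The remaining part after "with" says whether the model has a drift term or a non-zero mean.

The problem is that I need to extract these elements from the vector of model names.

By looking at the model names, I would like to write a program that extracts the following: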

1. Name of the model, e.g. ARIMA
2. Number of AR coefficients
3. Number of MA coefficients
4. Order of differencing
5. Whether the model has a drift or not
6. Whether it has a zero mean or not

My output should look like this:

  Model p d q outcome_with_drift outcome_with_non_zero_mean
1 ARIMA 2 1 0                  1                          0
2 ARIMA 2 0 0                  0                          1
3 ARIMA 2 0 0                  0                          1
4 ARIMA 2 0 0                  0                          1
5 ARIMA 0 0 1                  0                          1

Answers


You can use library(stringr) to split the vector. For example, suppose vect is a vector with the following input:

vect <- c("ARIMA(2,1,0) with drift", "ARIMA(2,0,0) with non-zero mean" ,"ARIMA(2,0,0) with non-zero mean" , 
      "ARIMA(2,0,0) with non-zero mean" ,"ARIMA(0,0,1) with non-zero mean") 

Then use str_split_fixed to split it into separate columns as follows:

library(stringr) 

df <- data.frame(str_split_fixed(vect,"\\s|\\(|\\)|,",n=5)) 
# The separators are whitespace (\\s), parentheses (\\( and \\)) and commas (,).
# n = 5 stops splitting after four separators, so the text after ")" stays together in the fifth column.

names(df) <- c("Model","p","d","q","outcome") 
# Rename the columns as requested in the question:
# p is the number of autoregressive terms (AR order)
# d is the number of nonseasonal differences needed for stationarity (order of differencing)
# q is the number of lagged forecast errors in the prediction equation (MA order)

df$outcome_ <- gsub("\\s|-","_",trimws(df$outcome)) 
#cleaning the outcome column by replacing spaces and dashes with underscores 
dummy_mat <- data.frame(model.matrix(~outcome_-1,data=df)) 
#using model.matrix to calculate the dummies for drift and non zero mean, for the value of 1 meaning True and 0 meaning False 
df_final <- data.frame(df[,1:4],dummy_mat) 

Result

#   Model p d q outcome_with_drift outcome_with_non_zero_mean
# 1 ARIMA 2 1 0                  1                          0
# 2 ARIMA 2 0 0                  0                          1
# 3 ARIMA 2 0 0                  0                          1
# 4 ARIMA 2 0 0                  0                          1
# 5 ARIMA 0 0 1                  0                          1
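
If you would rather not depend on stringr, the same extraction can be sketched in base R with regexec and regmatches. This is only an illustrative alternative, not the method used above: the helper name parse_arima_names is hypothetical, and it assumes every name follows the "NAME(p,d,q) ..." pattern from the question. It returns p, d and q as integers and builds the two indicator columns by searching the trailing text. Seasonal orders, if present, are not extracted.

```r
# Hypothetical helper (not part of the original answer).
# Assumes names look like "ARIMA(2,1,0) with drift".
parse_arima_names <- function(x) {
  # Groups: model name, p, d, q, and whatever follows the closing parenthesis
  pattern <- "^[[:space:]]*([A-Za-z]+)\\(([0-9]+),([0-9]+),([0-9]+)\\)(.*)$"
  parts <- regmatches(x, regexec(pattern, x))

  # Names that do not fit the pattern become a row of NA values
  parts <- lapply(parts, function(m) {
    if (length(m) == 6) m else rep(NA_character_, 6)
  })
  mat <- do.call(rbind, parts)  # column 1 is the full match
  suffix <- mat[, 6]

  data.frame(
    Model = mat[, 2],
    p = as.integer(mat[, 3]),
    d = as.integer(mat[, 4]),
    q = as.integer(mat[, 5]),
    # 1 if the trailing text mentions the term, 0 otherwise
    outcome_with_drift = as.integer(grepl("with drift", suffix, fixed = TRUE)),
    outcome_with_non_zero_mean =
      as.integer(grepl("with non-zero mean", suffix, fixed = TRUE)),
    stringsAsFactors = FALSE
  )
}

vect <- c("ARIMA(2,1,0) with drift", "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(2,0,0) with non-zero mean", "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(0,0,1) with non-zero mean")
res <- parse_arima_names(vect)
print(res)
```

Because the flags come from a text search rather than from model.matrix, both indicator columns are always present, even when the input happens to contain only one kind of suffix.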

Thanks, this is what I wanted! – user7892705
