How to extract model information from model names?


I have a vector of time series model names, shown below. Consider the pattern the names follow:

[1] "ARIMA(2,1,0) with drift" "ARIMA(2,0,0) with non-zero mean" "ARIMA(2,0,0) with non-zero mean" "ARIMA(2,0,0) with non-zero mean" "ARIMA(0,0,1) with non-zero mean"

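Each name in the vector is made up of five parts:

1) Model name: the text before the opening parenthesis is always the model name. Here it is "ARIMA" (short for autoregressive integrated moving average, a forecasting technique that projects the future values of a series based entirely on its own inertia).

2) Autoregressive part (the AR order, called "p"): the first number after the opening parenthesis, before the first comma. For the vector above, the values are 2, 2, 2, 2, 0.

3) Differencing part (called "d"): the second number in the parentheses, after the first comma. In this example the values are 1, 0, 0, 0, 0.

4) Moving average part (the MA order, called "q"): the last number in the parentheses. In this example the values are 0, 0, 0, 0, 1.

5) The part after "with" is either the drift part or the non-zero mean part.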

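The problem is that I need to extract these elements from the vector of model names.

Looking at the model names, I would like to write a program that extracts the following: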

1. Name of the model eg: ARIMA 
2. Number of AR coefficients 
3. Number of MA coefficients 
4. Order of differencing 
5. Whether the model has a drift or not 
6. Whether the model has a non-zero mean or not

My output should look like this:

Model p d q outcome_with_drift outcome_with_non_zero_mean 
1 ARIMA 2 1 0     1       0 
2 ARIMA 2 0 0     0       1 
3 ARIMA 2 0 0     0       1 
4 ARIMA 2 0 0     0       1 
5 ARIMA 0 0 1     0       1

Source

2017-04-25 user7892705

You can use library(stringr) to extract the parts. For example, if vect is a vector with the following input:

vect <- c("ARIMA(2,1,0) with drift", "ARIMA(2,0,0) with non-zero mean" ,"ARIMA(2,0,0) with non-zero mean" , 
      "ARIMA(2,0,0) with non-zero mean" ,"ARIMA(0,0,1) with non-zero mean")

Then use str_split_fixed to split it into separate columns as follows:

library(stringr)

# Split on whitespace (\\s), parentheses (\\( and \\)) and commas (,).
# n = 5 stops after four splits, so the fifth piece keeps the whole "with ..." suffix.
df <- data.frame(str_split_fixed(vect, "\\s|\\(|\\)|,", n = 5))

# Name the columns as in the question:
# p is the number of autoregressive (AR) terms
# d is the number of nonseasonal differences needed for stationarity (order of differencing)
# q is the number of lagged forecast errors in the prediction equation (MA terms)
names(df) <- c("Model", "p", "d", "q", "outcome")

# Clean the outcome column: trim it, then replace spaces and dashes with underscores
df$outcome_ <- gsub("\\s|-", "_", trimws(df$outcome))

# model.matrix builds the dummy columns for drift and non-zero mean (1 = TRUE, 0 = FALSE)
dummy_mat <- data.frame(model.matrix(~ outcome_ - 1, data = df))

df_final <- data.frame(df[, 1:4], dummy_mat)

Result:

# Model p d q outcome_with_drift outcome_with_non_zero_mean 
# 1 ARIMA 2 1 0     1       0 
# 2 ARIMA 2 0 0     0       1 
# 3 ARIMA 2 0 0     0       1 
# 4 ARIMA 2 0 0     0       1 
# 5 ARIMA 0 0 1     0       1
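
If you would rather avoid the extra package, the same extraction can be sketched in base R with a single regular expression. This is only a minimal sketch, assuming every name has the exact shape Name(p,d,q) followed by an optional "with ..." suffix; names in any other shape (for instance with a seasonal part) would not match and would need a different pattern. The names pattern, parts and parsed, and the shorter column names with_drift and with_non_zero_mean, are my own choices for illustration. Unlike the split-based version, p, d and q come out as integers here.

```r
# Same input vector as in the question
vect <- c("ARIMA(2,1,0) with drift",
          "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(2,0,0) with non-zero mean",
          "ARIMA(0,0,1) with non-zero mean")

# Capture groups: model name, p, d, q, and whatever follows the parentheses.
# Assumption: every element matches; a non-matching name would yield an
# empty match and break the rbind step below.
pattern <- "^([A-Za-z]+)\\(([0-9]+),([0-9]+),([0-9]+)\\) *(.*)$"
parts <- do.call(rbind, regmatches(vect, regexec(pattern, vect)))

# Column 1 of parts is the full match; columns 2 to 6 are the capture groups
parsed <- data.frame(
  Model = parts[, 2],
  p = as.integer(parts[, 3]),
  d = as.integer(parts[, 4]),
  q = as.integer(parts[, 5]),
  with_drift = as.integer(grepl("drift", parts[, 6], fixed = TRUE)),
  with_non_zero_mean = as.integer(grepl("non-zero mean", parts[, 6], fixed = TRUE)),
  stringsAsFactors = FALSE
)

print(parsed)
```

Because the flags are computed with grepl rather than model.matrix, the two indicator columns are always present, even when none of the names contains a given suffix.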

Source

2017-04-25 08:16:27 PKumar

Thanks, this is what I wanted! – user7892705
