R - predict command error "undefined columns selected"

I'm new to R and am having trouble with the predict command. I get this error:

Error in `[.data.frame`(newdata, , as.character(object$formula[[2]])) : 
    undefined columns selected

when I run this command:

model.predict <- predict.boosting(model,newdata=test)

Here is my model:

model <- boosting(Y~x1+x2+x3+x4+x5+x6+x7, data=train)

Here is the structure of my test data, from str(test):

'data.frame': 343 obs. of 7 variables: 
$ x1: Factor w/ 4 levels "Americas","Asia_Pac",..: 4 2 4 2 4 3 3 3 4 1 ... 
$ x2: Factor w/ 5 levels "Fifth","First",..: 3 3 2 2 4 2 4 4 1 1 ... 
$ x3: Factor w/ 3 levels "Best","Better",..: 2 3 1 1 3 2 2 1 3 3 ... 
$ x4: Factor w/ 2 levels "Female","Male": 1 1 2 1 1 2 1 2 2 2 ... 
$ x5: int 82 55 47 31 6 53 77 68 76 86 ... 
$ x6: num 22.8 14.6 25.5 38.3 7.9 32.8 4.6 34.2 36.7 21.7 ... 
$ x7: num 0.679 0.925 0.897 0.684 0.195 ...

And the structure of my training data:

$ RecordID: int 1 2 3 4 5 6 7 8 9 10 ... 
$ x1  : Factor w/ 4 levels "Americas","Asia_Pac",..: 1 2 2 3 1 1 1 2 2 4 ... 
$ x2  : Factor w/ 5 levels "Fifth","First",..: 5 5 3 2 5 5 5 4 3 2 ... 
$ x3  : Factor w/ 3 levels "Best","Better",..: 2 3 2 2 3 1 2 3 1 1 ... 
$ x4  : Factor w/ 2 levels "Female","Male": 1 2 2 2 1 1 2 2 1 1 ... 
$ x5  : int 1 67 75 51 84 33 21 80 48 5 ... 
$ x6  : num 21 13.8 30.3 11.9 1.7 13.2 33.9 17 3.4 19.5 ... 
$ x7  : num 0.35 0.85 0.73 0.39 0.47 0.13 0.2 0.12 0.64 0.11 ... 
$ Y  : Factor w/ 2 levels "Green","Yellow": 2 2 1 2 2 2 1 2 2 2 ..

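I think there is a problem with the structure of the test data, but I can't find it, or perhaps I have misunderstood how the predict command should be structured. Note that if I run the predict command on the training data, it works. Any suggestions on where to look?

Thanks!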

2012-12-16 user1907117

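The test data also needs the Y variable. – MattBagg

predict.boosting() expects the test data to include the actual labels so that it can evaluate how well it did (as in the confusion matrix shown below).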

library(adabag) 

data(iris) 

iris.adaboost <- boosting(Species~Sepal.Length+Sepal.Width+Petal.Length+ 
     Petal.Width, data=iris, boos=TRUE, mfinal=10) 

# make a 'test' dataframe without the classes, as in the question 
iris2 <- iris 
iris2$Species <- NULL 

# replicates the error 
irispred=predict.boosting(iris.adaboost, newdata=iris2) 
#Error in `[.data.frame`(newdata, , as.character(object$formula[[2]])) : 
# undefined columns selected
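
The expression quoted in the error message can be reproduced without adabag, which may make the cause easier to see. The sketch below uses only base R; the formula and the two-row `newdata` frame are hypothetical stand-ins, not the data from the question.

```r
# Toy stand-ins for the fitted formula and the unlabeled test data (hypothetical)
model_formula <- Y ~ x1 + x2
newdata <- data.frame(x1 = c(1.5, 2.5), x2 = c(10L, 20L))

# The response name is the second element of a two-sided formula
response_name <- as.character(model_formula[[2]])

# Same indexing pattern as in the error message; capture the error text
# instead of stopping the script
msg <- tryCatch(
  {
    newdata[, response_name]
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(response_name)
print(msg)
```

Because `newdata` has no column named after the response, the data frame lookup fails, which matches the situation in the question where `test` has x1 to x7 but no Y.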

Here is a working example, taken mainly from the help file, which also demonstrates the confusion matrix.

# first create subsets of iris data for training and testing 
sub <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25)) 
iris3 <- iris[sub,] 
iris4 <- iris[-sub,] 

iris.adaboost <- boosting(Species ~ ., data=iris3, mfinal=10) 

# works 
iris.predboosting<- predict.boosting(iris.adaboost, newdata=iris4) 

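iris.predboosting$confusion
# Note: iris4 holds 25 rows per class, so a run on this split gives column
# totals of 25; the counts below appear to come from predicting on all of iris.
#                Observed Class
# Predicted Class setosa versicolor virginica
#      setosa         50          0         0
#      versicolor      0         50         0
#      virginica       0          0        50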
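
If the true labels for the test set are not available, one possible workaround is to add a placeholder response column before predicting. This is an assumption on my part rather than something stated in this answer, and whether predict.boosting() accepts it may depend on the adabag version. The sketch below is base R only and uses hypothetical stand-in data; it shows just the column being added.

```r
# Hypothetical stand-ins for the question's train and test data
train <- data.frame(
  x5 = c(1L, 67L, 75L, 51L),
  Y  = factor(c("Yellow", "Yellow", "Green", "Yellow"))
)
test <- data.frame(x5 = c(82L, 55L, 47L))

response_name <- "Y"  # left-hand side of the model formula

# Add a placeholder response only if the column is missing.
# The values are arbitrary; only the column's presence and its levels matter.
if (!response_name %in% names(test)) {
  train_levels <- levels(train[[response_name]])
  test[[response_name]] <- factor(
    rep(train_levels[1], nrow(test)),
    levels = train_levels
  )
}

str(test)
```

With placeholder labels, any confusion matrix or error rate computed from them would be meaningless; only the predicted classes would be of interest.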

2012-12-16 01:08:00 MattBagg

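Thanks. When I add the Y variable to the test data, I get this error:

> test <- read.csv("test.csv", header = TRUE)
> predict.test <- predict.boosting(model, newdata = test)
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
    length of 'dimnames' [2] not equal to array extent

– user1907117

This seems to be a different error. Ideally you should dput enough of the train and test data that others can reproduce the same error, but we can start with dput(test[1:20,]) and dput(head(train[1:20,])). It would help if you ran those commands and edited your question to include their ugly output. – MattBagg

Or, if you agree this is a different error, ask a separate question. – MattBagg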
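
For readers unfamiliar with dput(), here is a small base R sketch of what it produces and why it helps. The two-row `train` frame is a hypothetical stand-in; the point is that the printed text can be pasted into a question and evaluated by someone else to rebuild the same data frame, factor levels included.

```r
# Hypothetical two-row stand-in for the training data
train <- data.frame(
  x5 = c(1L, 67L),
  Y  = factor(c("Yellow", "Green"))
)

# dput() prints R code that recreates the object
dput(head(train, 20))

# Round trip: capture that text and evaluate it to rebuild the data frame
dumped   <- capture.output(dput(head(train, 20)))
restored <- eval(parse(text = dumped))

print(all.equal(restored, head(train, 20)))
```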

This error appears when your y is a factor; try as.vector(y) ~ .

2013-06-19 06:16:28 funnng

The column names of the data you use for prediction should be exactly the same as the column names of the training data.
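
A quick way to check this is to compare the two sets of column names with setdiff(). The sketch below is base R only, and the small `train` and `test` frames are hypothetical stand-ins shaped like the data in the question.

```r
# Hypothetical stand-ins shaped like the question's data
train <- data.frame(
  x4 = factor(c("Female", "Male", "Male")),
  x5 = c(1L, 67L, 75L),
  Y  = factor(c("Yellow", "Yellow", "Green"))
)
test <- data.frame(
  x4 = factor(c("Female", "Female")),
  x5 = c(82L, 55L)
)

# Columns the training data has that the test data lacks
missing_in_test <- setdiff(names(train), names(test))

# Columns present only in the test data
extra_in_test <- setdiff(names(test), names(train))

print(missing_in_test)
print(extra_in_test)
```

Columns that exist only in the training data and are not used by the model (RecordID in the question, for instance) would also show up in `missing_in_test`, so the result needs to be read against the model formula.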

2018-01-18 09:27:33
