2017-05-16

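R returns factor(0) when predicting with an SVM model

My question is exactly the same as this thread. Since that one still does not seem to have a satisfactory answer, I think it is reasonable to ask again, this time with reproducible code.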

training <- read.csv("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-training.csv")[,-1] 
testing <- read.csv("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-testing.csv")[,-1] 
# Importing data 

library(e1071) 
# Load the required package for SVM 

svm_model <- svm(classe ~ pitch_arm + pitch_forearm + pitch_dumbbell + pitch_belt + 
    roll_arm + roll_forearm + roll_dumbbell + roll_belt + 
    yaw_arm + yaw_forearm + yaw_dumbbell + yaw_belt, 
    data = training, scale = FALSE, cross = 10) 
# Perform SVM analysis with default gamma and cost, and do 10-fold cross validation 

predict(svm_model, testing) 
# R returns factor(0) here 

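I have checked that the testing data frame has all the required columns, and that there are no NAs in those required columns. Please give me some ideas on how to proceed. Thanks!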

Yes, they are of the same type. Thanks for the reminder! :) – ytu

Answer

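This appears to be the result of a quirk in e1071's predict.svm function. Although your test data has no missing values in the variables used by the model, every row has a missing value in some other column. predict.svm seems to apply its default na.action (na.omit) to the whole of newdata rather than only to the model's variables, so every row is dropped before prediction and you get factor(0).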

complete.cases(testing) 
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
[14] FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
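
The effect can be reproduced in base R without the SVM at all. The sketch below uses a made-up three-row data frame (the column `unused_stat` is hypothetical, standing in for the all-NA columns in the test set) and assumes, as described above, that the rows are filtered with na.omit before the model's columns are selected.

```r
# Toy stand-in for the test set: the two "model" columns are complete,
# but an unused column (hypothetical name) is NA in every row.
toy <- data.frame(
  pitch_arm   = c(1.2, 3.4, 5.6),
  roll_arm    = c(0.1, 0.2, 0.3),
  unused_stat = NA_real_
)
model_cols <- c("pitch_arm", "roll_arm")

# Row-wise completeness over all columns vs. over the model columns only
complete.cases(toy)                # FALSE for every row
complete.cases(toy[, model_cols])  # TRUE for every row

# na.omit on the full data frame leaves nothing to predict on
nrow(na.omit(toy))                 # 0
nrow(na.omit(toy[, model_cols]))   # 3
```

With zero rows left, a predict call has nothing to return, which is consistent with the empty factor seen in the question.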

You can work around this by dropping the variables the model does not need.

ModelVars = which(names(training) %in% 
    c("pitch_arm", "pitch_forearm", "pitch_dumbbell", "pitch_belt", 
    "roll_arm", "roll_forearm", "roll_dumbbell", "roll_belt", 
    "yaw_arm", "yaw_forearm", "yaw_dumbbell", "yaw_belt")) 
test2 = testing[, ModelVars] 

predict(svm_model, test2) 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
A A B A A A D B A A A C A A A A A A A A 
Levels: A B C D E 
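
As a variation on the same idea, the predictor names can be read off the model formula and used to subset the test set by name, which avoids relying on training and testing having their columns in the same positions. This is only a sketch on toy data: the two-predictor formula and the `unused_stat` column are placeholders, not the real model or data.

```r
# Hypothetical two-predictor formula standing in for the full model formula
f <- classe ~ pitch_arm + roll_arm

# Predictor names taken from the right-hand side of the formula
model_vars <- all.vars(delete.response(terms(f)))

# Toy test set with an unused, all-NA column (hypothetical name)
toy_test <- data.frame(
  pitch_arm   = c(1.2, 3.4),
  roll_arm    = c(0.1, 0.2),
  unused_stat = NA_real_
)

# Fail early if a predictor is absent, then keep only the predictors
stopifnot(all(model_vars %in% names(toy_test)))
test2 <- toy_test[, model_vars, drop = FALSE]

# Confirm that no row would be dropped for missing values
stopifnot(all(complete.cases(test2)))
```

The resulting `test2` plays the same role as the one passed to predict above.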
This approach works for me, thanks! :) – ytu