2015-01-09 42 views

Error from SVM in R

I'm new to R and am trying to read data from a text file and then use it to train an SVM for classification. Here is the code:

train<-read.table("training.txt") 
train[which(train=="?",arr.ind=TRUE)]<-NA 
train=unique(train) 
y=train[,length(train)] 

classifier<-svm(y~.,data=train[,-length(train)],scale=F) 
classifier<-svm(x=train[,-length(train)],y=factor(y),scale=F) 

I tried two different ways of calling svm. The first one (svm(y~.,data=train[,-length(train)],scale=F)) seems to work, but the second one fails with:

Error in svm.default(x = train[, length(train)], y = factor(y), scale = F) : 
    NA/NaN/Inf in foreign function call (arg 1) 
In addition: Warning message: 
In svm.default(x = train[, length(train)], y = factor(y), scale = F) : 
    NAs introduced by coercion 

Here is a sample of training.txt; the last column is the target:

39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,0 
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,0 
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,0 
53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,0 
28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,0 
37,Private,284582,Masters,14,Married-civ-spouse,Exec-managerial,Wife,White,Female,0,0,40,United-States,0 
49,Private,160187,9th,5,Married-spouse-absent,Other-service,Not-in-family,Black,Female,0,0,16,Jamaica,0 
52,Self-emp-not-inc,209642,HS-grad,9,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,45,United-States,1 
31,Private,45781,Masters,14,Never-married,Prof-specialty,Not-in-family,White,Female,14084,0,50,United-States,1 
42,Private,159449,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,5178,0,40,United-States,1 
37,Private,280464,Some-college,10,Married-civ-spouse,Exec-managerial,Husband,Black,Male,0,0,80,United-States,1 
30,State-gov,141297,Bachelors,13,Married-civ-spouse,Prof-specialty,Husband,Asian-Pac-Islander,Male,0,0,40,India,1 
23,Private,122272,Bachelors,13,Never-married,Adm-clerical,Own-child,White,Female,0,0,30,United-States,0 
32,Private,205019,Assoc-acdm,12,Never-married,Sales,Not-in-family,Black,Male,0,0,50,United-States,0 
40,Private,121772,Assoc-voc,11,Married-civ-spouse,Craft-repair,Husband,Asian-Pac-Islander,Male,0,0,40,NA,1 

Any ideas? Thanks in advance!

Answer


From the documentation:

For the x argument:

a data matrix, a vector, or a sparse matrix (object of class Matrix 
provided by the Matrix package, or of class matrix.csr provided by the 
SparseM package, or of class simple_triplet_matrix provided by the slam package). 

For the y argument:

a response vector with one label for each row/component of x. Can be 
either a factor (for classification tasks) or a numeric vector (for regression). 

If you pass x=train[,-length(train)] in the second call, you are actually passing a data.frame, which is not one of the supported types, so the call fails.
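The warning "NAs introduced by coercion" is consistent with the text columns being forced to numbers somewhere before the underlying routine is called. The base-R sketch below shows that coercion in isolation; the data frame `df` is a made-up two-row stand-in, not the asker's file, and nothing here calls e1071.

```r
# Hypothetical stand-in for the question's data: one numeric column, one text column.
df <- data.frame(age = c(39, 50),
                 workclass = c("State-gov", "Private"),
                 stringsAsFactors = FALSE)

m <- as.matrix(df)   # a data frame with any text column becomes a character matrix
typeof(m)            # "character"

# Forcing the character matrix to numeric triggers the same kind of warning
# as in the question (suppressed here), and the text cells turn into NA.
v <- suppressWarnings(as.numeric(m))
anyNA(v)             # TRUE
```

This is also why wrapping the data frame in as.matrix() alone does not help: the matrix has to be numeric, not just a matrix.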

The svm function works with numeric matrices only:

library(e1071) 
train[which(train=="?",arr.ind=TRUE)]<-NA  # treat "?" entries as missing values
train=unique(train)  # drop duplicate rows
y=factor(train[,length(train)])  # the last column is the class label
train <- data.frame(lapply(train,as.numeric)) # convert to numeric; factors are stored as integer codes behind the scenes

train <- as.matrix(train[-length(train)])  # drop the target column, then convert to a matrix

classifier<-svm(x= train ,y=y,scale=F) 
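The lapply(train, as.numeric) step above relies on the text columns being factors. Since R 4.0.0, read.table returns character columns by default, and as.numeric on those yields NA. A minimal sketch of a conversion that works either way follows; `to_numeric_matrix` is a hypothetical helper (not part of e1071) and the three data rows are invented for illustration.

```r
# Hypothetical helper: numeric columns are kept unchanged, every other column
# is replaced by the integer codes of its factor levels.
to_numeric_matrix <- function(df) {
  cols <- lapply(df, function(col) {
    if (is.numeric(col)) col else as.numeric(factor(col))
  })
  as.matrix(data.frame(cols))
}

# Made-up rows in the shape of training.txt (last column is the target).
train <- data.frame(age = c(39, 50, 38),
                    workclass = c("State-gov", "Self-emp-not-inc", "Private"),
                    target = c(0, 0, 1),
                    stringsAsFactors = FALSE)

y <- factor(train[, length(train)])
x <- to_numeric_matrix(train[, -length(train)])

is.numeric(x)  # TRUE, so x is in a form that svm(x = x, y = y, ...) can accept
```

Integer codes put an arbitrary order on the categories, so this is only a sketch of the type conversion and not a recommendation for how to encode categorical features.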

Output:

> summary(classifier) 

Call: 
svm.default(x = train, y = y, scale = F) 


Parameters: 
   SVM-Type: C-classification 
 SVM-Kernel: radial 
       cost: 1 
      gamma: 0.07142857 

Number of Support Vectors: 14 

(9 5) 


Number of Classes: 2 

Levels: 
0 1 

Hi, thanks for your reply. I tried that, but I still get the same error: classifier<-svm(x=as.matrix(train[,-length(train)]),y=factor(y),scale=F) Error in svm.default(x = as.matrix(train[, -length(train)]), y = factor(y), : NA/NaN/Inf in foreign function call (arg 1) In addition: Warning message: In svm.default(x = as.matrix(train[, -length(train)]), y = factor(y), : NAs introduced by coercion – Lei


You also need to convert the fields to numeric. – LyzandeR