2017-10-10 105 views

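Logistic regression predict in R: number of rows error

I want to use the predict function to get predictions from a logistic regression, but the result has the wrong number of rows. This has been asked before in "R Warning: 'newdata' had 15 rows but variables found have 22 rows".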

I tried the approach from that question, but I still get the error. Here is the code:

# Split as training and test sets 
train_idx <- trainTestSplit(adult,trainPercent=75,seed=1111) 
train <- adult[train_idx, ] 
test <- adult[-train_idx, ] 


xtrain <- train[,1:7] 
ytrain <- train[,8] 
xtrain1 <- dummy.data.frame(xtrain, sep = ".") 
xtrain2 <- as.matrix(xtrain1) 

xtest <- test[,1:7] 
ytest <- test[,8] 
xtest1 <- dummy.data.frame(xtest, sep = ".") 
xtest2 <- as.matrix(xtest1) 

fit=glm(ytrain~xtrain2,family=binomial) 
a=predict(fit,newdata=xtrain1,type="response") 
b=ifelse(a>0.5,1,0) 
confusionMatrix(b,ytrain) 
Confusion Matrix and Statistics 

      Reference 
Prediction  0  1 
     0 16065 3157 
     1 968 2430 

       Accuracy : 0.8176   
       95% CI : (0.8125, 0.8227) 
# Predict with test dataframe 
a=predict(fit,xtest1,type="response") 

: 'newdata' had 7541 rows but variables found have 22620 rows 
2: In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type == : 
    prediction from a rank-deficient fit may be misleading 
> 

I also tried

names(xtest1)=names(xtrain1) and 
    a=predict(fit,xtest1,type="response") 

The column names are then the same, but I still get the same error. This seems like a very basic problem. Please help...


See Ben Bolker's first comment: https://stackoverflow.com/questions/9028662/predict-maybe-im-not-understanding-it – Alex


Thanks @Alex. Made the change! –

Answer


I changed the fit to use the `data` argument instead of a separate matrix and y column, and now it works:

adult1 <- dummy.data.frame(adult, sep = ".") 

train_idx <- trainTestSplit(adult1,trainPercent=75,seed=1111) 
train <- adult1[train_idx, ] 
test <- adult1[-train_idx, ] 

fit=glm(salary~.,family=binomial,data=train) 
a=predict(fit,newdata=train,type="response") 
b=ifelse(a>0.5,1,0) 
confusionMatrix(b,train$salary) 


m=predict(fit,newdata=test,type="response")
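
For anyone hitting the same warning, here is a minimal sketch of why the original call returned the wrong number of rows. It uses made-up toy data (the columns `x1`, `x2`, `y` and the 150/50 split are assumptions for illustration, not the `adult` data set). When the formula is `ytrain ~ xtrain2`, the model only knows a variable called `xtrain2`. The test data frame has no column with that name, so `predict` finds the training matrix in the workspace and predicts on it again.

```r
# Toy data, invented for illustration only
set.seed(1)
n  <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- rbinom(n, 1, plogis(0.5 * df$x1 - df$x2))

train <- df[1:150, ]
test  <- df[151:200, ]

# Pattern from the question: the formula refers to workspace objects
xtrain2 <- as.matrix(train[, c("x1", "x2")])
ytrain  <- train$y
fit_bad <- glm(ytrain ~ xtrain2, family = binomial)

# 'test' has no column named "xtrain2", so predict() picks up the
# training matrix from the workspace and warns about the row mismatch.
p_bad <- suppressWarnings(predict(fit_bad, newdata = test, type = "response"))
length(p_bad)  # number of training rows, not test rows

# Pattern from the answer: the formula refers to columns of 'data',
# so predict() can look up the same column names in 'newdata'.
fit_ok <- glm(y ~ ., family = binomial, data = train)
p_ok   <- predict(fit_ok, newdata = test, type = "response")
length(p_ok)   # number of test rows
```

Renaming the columns of `xtest1` does not help in the original code, because the name that has to match is `xtrain2` itself, not the columns inside the matrix. Encoding the dummy variables before splitting, as in the answer, also gives the training and test sets the same set of columns.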