2016-11-26 50 views
1

R Boruta package: (list) object cannot be coerced to type 'double'

I want to run Boruta feature selection on my dataset.

The code is as follows:

df<-read.csv('F:/DataAnalyticsClub/DACaseComp/DatasetDist/Datasets/BestFile.csv',stringsAsFactors=FALSE) 
install.packages("Boruta") 
library(Boruta) 
df[is.na(df)] <- 0 
df[df == ""] <- 0 
X<-df[ , -which(names(df) %in% c("PREVSALEDATE","PREVSALEDATE2","ClassLabel", "PARID", "PROPERTYUNIT", "PriceDiff1", "PriceDiff2", "DateDiff1", "DateDiff2", "SALEDATE"))] 
Y<-df['ClassLabel'] 



factorCols <- c("SCHOOLDESC","MUNIDESC","SALEDESC","INSTRTYPDESC","NEIGHDESC","TAXDESC","TAXSUBCODE_DESC","OWNERDESC","USEDESC","LOTAREA","CLEANGREEN","FARMSTEADFLAG","ABATEMENTFLAG","COUNTYEXEMPTBLDG","STYLEDESC","EXTFINISH_DESC","ROOFDESC","BASEMENTDESC","GRADEDESC","CONDITIONDESC","CDUDESC","HEATINGCOOLINGDESC","BSMTGARAGE") 
nonFactorCols<-c("PRICE","COUNTYTOTAL","LOCALTOTAL","FAIRMARKETTOTAL","STORIES","YEARBLT","TOTALROOMS","BEDROOMS","FULLBATHS","HALFBATHS","FIREPLACES","FINISHEDLIVINGAREA","PREVSALEPRICE","PREVSALEPRICE2") 

X[factorCols] <- lapply(X[factorCols], factor) 

set.seed(123) 
boruta.train<-Boruta(X,Y) 

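As you can see, my dataset has features of different types. Some of them are string features, so I converted them to factors; the rest are numeric. I tested my assumption about the column types, but once I run Boruta I get: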

Error in data.matrix(data.selected) : 
    (list) object cannot be coerced to type 'double' 

I don't know why. All of my columns are either factors or various numeric types. What could be wrong?

After some googling, I found that some people suggest converting with as.matrix(), but in that case:

> boruta.train<-Boruta(as.matrix(X),as.matrix(Y)) 
Error: Variable none not found. Ranger will EXIT now. 
Error in ranger::ranger(data = x, dependent.variable.name = "shadow.Boruta.decision", : 
    User interrupt or internal error. 
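
For context, as.matrix() on a data frame that mixes numeric and factor columns returns a character matrix, which may be part of why this route fails. The sketch below only illustrates that base R behaviour; the toy data frame X_toy and its values are hypothetical stand-ins, not the real dataset, and it does not reproduce the ranger error itself.

```r
# Hypothetical toy frame: one numeric column and one factor column
X_toy <- data.frame(
  PRICE = c(100, 200, 300),
  ROOFDESC = factor(c("slate", "shingle", "slate"))
)

# A matrix can hold only one type, so the mixed frame collapses to character
m <- as.matrix(X_toy)

typeof(m)      # "character"
is.numeric(m)  # FALSE: the numeric column is now stored as strings
```

So passing as.matrix(X) hands Boruta a matrix of strings rather than the factor and numeric columns prepared earlier.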

Answer

0

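OK, after playing around with it, I managed to figure out the problem. Boruta requires Y (the target) to be a plain vector, not a data frame or anything else. df['ClassLabel'] returns a one-column data frame, which is a list internally, hence the "(list) object cannot be coerced" error.

So creating Y like this: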

Y<-df[,'ClassLabel'] 

solved the problem.
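
The difference between the two ways of indexing can be checked in base R without Boruta. This is a minimal sketch on a hypothetical toy data frame (the values are made up; only the column names echo the question), assuming df is an ordinary data.frame as returned by read.csv.

```r
# Hypothetical toy data standing in for the real dataset
df <- data.frame(
  PRICE = c(100, 200, 300),
  ClassLabel = c(0, 1, 0)
)

y_df  <- df["ClassLabel"]    # single index: still a one-column data frame
y_vec <- df[, "ClassLabel"]  # row/column index: dropped to a plain vector
y_alt <- df[["ClassLabel"]]  # double brackets: also a plain vector

class(y_df)              # "data.frame"
is.list(y_df)            # TRUE, a data frame is a list of columns
class(y_vec)             # "numeric"
is.list(y_vec)           # FALSE
identical(y_vec, y_alt)  # TRUE
```

Either df[, 'ClassLabel'] or df[['ClassLabel']] should therefore give Boruta a target in the shape it expects.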
