R Boruta package: (list) object cannot be coerced to type 'double'
I want to run Boruta feature selection on my dataset. My code is as follows:
df<-read.csv('F:/DataAnalyticsClub/DACaseComp/DatasetDist/Datasets/BestFile.csv',stringsAsFactors=FALSE)
install.packages("Boruta")
library(Boruta)
df[is.na(df)] <- 0
df[df == ""] <- 0
X<-df[ , -which(names(df) %in% c("PREVSALEDATE","PREVSALEDATE2","ClassLabel", "PARID", "PROPERTYUNIT", "PriceDiff1", "PriceDiff2", "DateDiff1", "DateDiff2", "SALEDATE"))]
Y<-df['ClassLabel']
factorCols <- c("SCHOOLDESC","MUNIDESC","SALEDESC","INSTRTYPDESC","NEIGHDESC","TAXDESC","TAXSUBCODE_DESC","OWNERDESC","USEDESC","LOTAREA","CLEANGREEN","FARMSTEADFLAG","ABATEMENTFLAG","COUNTYEXEMPTBLDG","STYLEDESC","EXTFINISH_DESC","ROOFDESC","BASEMENTDESC","GRADEDESC","CONDITIONDESC","CDUDESC","HEATINGCOOLINGDESC","BSMTGARAGE")
nonFactorCols<-c("PRICE","COUNTYTOTAL","LOCALTOTAL","FAIRMARKETTOTAL","STORIES","YEARBLT","TOTALROOMS","BEDROOMS","FULLBATHS","HALFBATHS","FIREPLACES","FINISHEDLIVINGAREA","PREVSALEPRICE","PREVSALEPRICE2")
X[factorCols] <- lapply(X[factorCols], factor)
set.seed(123)
boruta.train<-Boruta(X,Y)
As you can see, my dataset has a mix of features. Some of them are string features, so I convert them to factors; the rest are numeric. As soon as I run Boruta, I get:
Error in data.matrix(data.selected) :
(list) object cannot be coerced to type 'double'
I don't know why. All of my columns are either factors or various numeric types. What could be wrong?
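One way to test the assumption that every column is a factor or numeric is to inspect the classes directly. The sketch below uses a small made-up data frame (`toy`, with hypothetical values) rather than BestFile.csv. It also shows something that may be relevant here: single-bracket indexing such as `df['ClassLabel']` returns a one-column data frame, which is a list internally, whereas `df[['ClassLabel']]` returns a plain vector. Whether that is what triggers the error above is an assumption, not something confirmed in this post.

```r
# Toy data standing in for the real dataset (hypothetical values).
toy <- data.frame(
  PRICE      = c(100000, 150000, 120000, 90000),
  USEDESC    = c("SINGLE FAMILY", "CONDO", "SINGLE FAMILY", "CONDO"),
  ClassLabel = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)
toy$USEDESC <- factor(toy$USEDESC)

# Class of every predictor column; anything other than factor/numeric/integer
# (for example "character") would contradict the assumption.
col_classes <- vapply(
  toy[c("PRICE", "USEDESC")],
  function(col) class(col)[1],
  character(1)
)
print(col_classes)

# Single brackets keep the data.frame wrapper; double brackets extract the column.
y_df  <- toy["ClassLabel"]
y_vec <- toy[["ClassLabel"]]

print(class(y_df))    # "data.frame"
print(is.list(y_df))  # TRUE
print(class(y_vec))   # "numeric"
```

If `Y` turns out to be a data frame, passing `df[['ClassLabel']]` (wrapped in `factor()` for a classification target) as the second argument of `Boruta(X, Y)` would be the first thing to try.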
After some googling, I found that some people suggest converting with as.matrix(), but in that case I get:
> boruta.train<-Boruta(as.matrix(X),as.matrix(Y))
Error: Variable none not found. Ranger will EXIT now.
Error in ranger::ranger(data = x, dependent.variable.name = "shadow.Boruta.decision", :
User interrupt or internal error.
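A possible reason the `as.matrix()` route fails in a different way: when a data frame contains any non-numeric column (such as a factor), `as.matrix()` returns a character matrix, so the numeric columns become strings as well. The sketch below shows this on a made-up two-column data frame (`toy` is hypothetical). That this is what upsets ranger here is an assumption.

```r
# Toy data frame mixing a numeric column and a factor column (hypothetical values).
toy <- data.frame(
  PRICE   = c(100000, 150000, 120000),
  USEDESC = factor(c("SINGLE FAMILY", "CONDO", "SINGLE FAMILY"))
)

# as.matrix() needs a single storage type, so it falls back to character.
m <- as.matrix(toy)

print(typeof(m))                  # "character"
print(is.numeric(m[, "PRICE"]))   # FALSE
```

So after `as.matrix(X)`, the factor/numeric distinction set up earlier is gone, which is why keeping `X` as a data frame seems preferable.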