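R: randomForest classification using the bigmemory library
Has anyone been able to set up classification (not regression) using the randomForest and bigmemory libraries? I know the 'formula approach' cannot be used, so we have to fall back on the 'x = predictors, y = response' approach. It appears that the bigmemory library cannot handle a response vector with categorical values (it is a matrix, after all). In my case I have two levels, both represented as characters.
According to the bigmemory documentation... "A data frame will have character vectors converted to factors, and then all factors converted to numeric factor levels."
Any suggested workaround to get randomForest classification working with bigmemory?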
# Example of the problem
library(randomForest)
library(bigmemory)
# Removing any extra objects from my workspace (just in case)
rm(list=ls())
#first small matrix
small.mat <- matrix(sample(0:1,5000,replace = TRUE),1000,5)
colnames(small.mat) <- paste("V",1:5,sep = "")
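# Note: assigning a factor into a numeric matrix stores only its integer codes,
# so V5 stays numeric and the fit below is a regression, not a classification
small.mat[,5] <- as.factor(small.mat[,5])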
small.rf <- randomForest(V5 ~ .,data = small.mat, mtry=2, do.trace=100)
print(small.rf)
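# predict() expects 'newdata'; an unrecognized 'data' argument is silently ignored
# and the out-of-bag predictions are returned instead
small.result <- predict(small.rf, newdata = small.mat[,-5])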
#now small dataframe Works!
small.mat <- matrix(sample(0:1,5000,replace = TRUE),1000,5)
colnames(small.mat) <- paste("V",1:5,sep = "")
small.data <- as.data.frame(small.mat)
small.data[,5] <- as.factor(small.data[,5])
small.rf <- randomForest(V5 ~ .,data = small.data, mtry=2, do.trace=100)
print(small.rf)
# Pass the predictors as 'newdata' (not 'data') so they are actually used
small.result <- predict(small.rf, newdata = small.data[,-5])
#then big matrix Classification Does NOT Work :-(
#----------------****************************----
big.mat <- as.big.matrix(small.mat, type = "integer")
#Line below throws error, "cannot coerce class 'structure("big.matrix", package = "bigmemory")' into a data.frame"
big.rf <- randomForest(V5~.,data = big.mat, do.trace=10)
#Runs without error but only regression
big.rf <- randomForest(x = big.mat[,-5], y = big.mat[,5], mtry=2, do.trace=100)
print(big.rf)
# Pass the predictors as 'newdata' (not 'data') so they are actually used
big.result <- predict(big.rf, newdata = big.mat[,-5])
Coerce it to a factor via 'y = as.factor(big.mat[,5])'? – joran 2012-04-29 05:58:30
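The suggestion above could be sketched roughly as follows. This is only a minimal illustration, not a tested recipe for large data: the names toy.mat, toy.big, class.labels and the "no"/"yes" labels are made up for the example. It assumes the two character levels are stored in the big.matrix as integer codes, and that the extracted predictor block fits in RAM.

```r
library(randomForest)
library(bigmemory)

set.seed(42)  # only so the sketch is reproducible

# Hypothetical two-level character response, as described in the question
class.labels <- c("no", "yes")
response <- sample(class.labels, 1000, replace = TRUE)

# Four binary predictors plus the response stored as integer codes
# (1 = "no", 2 = "yes"), since a big.matrix holds only numeric values
toy.mat <- cbind(matrix(sample(0:1, 4000, replace = TRUE), 1000, 4),
                 match(response, class.labels))
colnames(toy.mat) <- paste("V", 1:5, sep = "")
toy.big <- as.big.matrix(toy.mat, type = "integer")

# Extract ordinary R objects, then rebuild the factor so that
# randomForest performs classification rather than regression
x <- toy.big[, 1:4]
y <- factor(toy.big[, 5], levels = seq_along(class.labels), labels = class.labels)

toy.rf <- randomForest(x = x, y = y, mtry = 2)
print(toy.rf$type)  # "classification"
toy.pred <- predict(toy.rf, newdata = x)
```

Keeping the label vector alongside the big.matrix is what allows the original character levels to be restored; the big.matrix itself only ever sees the codes.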
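I should add that I don't know whether 'randomForest' actually supports big.matrix input when the object is genuinely too large for memory. – joran 2012-04-29 06:02:08
As far as I know, 'randomForest' loads all of the 'bigmemory' data into RAM when the model-building call is made. – DrDom 2012-04-29 06:25:49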
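One way to see why the data ends up in RAM is that subsetting a big.matrix with '[' returns an ordinary R matrix, and that ordinary matrix is what gets passed to randomForest in the 'x =' call from the question. A small check, using the illustrative names demo.big and extracted (not from the original post):

```r
library(bigmemory)

demo.big <- as.big.matrix(matrix(sample(0:1, 5000, replace = TRUE), 1000, 5),
                          type = "integer")

is.big.matrix(demo.big)        # TRUE: still a big.matrix object
extracted <- demo.big[, 1:4]   # '[' copies the selected block into a regular matrix
is.matrix(extracted)           # TRUE
is.big.matrix(extracted)       # FALSE
print(object.size(extracted))  # memory used by the extracted in-RAM copy
```

So with this calling pattern the predictors have to fit in memory at model-building time, whatever the size of the backing big.matrix.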