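How do I format a Python script that uses rpy2 to build a model with R caret functions?
I have the following R script, which is designed to build a model from a data frame using the caret package: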
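library(caret)
library(broom)
# Load the data from CSV
data <- read.csv("mydata.csv")
splitprob <- 0.8
traintestindex <- createDataPartition(data$fluorescence, p = splitprob, list = FALSE)
testset <- data[-traintestindex, ]
trainingset <- data[traintestindex, ]
# Cross-validation settings (assumed to match the Python script below)
cvCtrl <- trainControl(method = "repeatedcv", number = 20, repeats = 100)
model <- train(fluorescence ~ ., trainingset, method = "glmStepAIC", preProc = c("center", "scale"), trControl = cvCtrl)
# Tidy the coefficients of the final model and write them to disk
final_model <- tidy(model$finalModel)
write.csv(final_model, "model_glm.csv")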
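I would like to express the same functionality in a Python script. After a pandas data frame is generated, it should be converted into an R data frame, and caret's train function should then be run with the same parameters as in the R script above.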
import pandas as pd
from rpy2.robjects import r
import sys
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects import r, pandas2ri
pandas2ri.activate()
caret = rpackages.importr('caret')
broom= rpackages.importr('broom')
my_data= pd.read_csv("my_data.csv")
r_dataframe= pandas2ri.py2ri(my_data)
preprocessing= ["center", "scale"]
center_scale= StrVector(preprocessing)
cvCtrl = caret.trainControl(method = "repeatedcv", number= 20, repeats = 100)
model_R= caret.train("fluorescence~.", data= r_dataframe, method = "glmStepAIC", preProc = center_scale, trControl = cvCtrl)
print(model_R.finalModel)
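However, this script is clearly not set up correctly, because when I run the Python script with rpy2, the line
model_R= caret.train("fluorescence~., r_dataframe, method = "glmStepAIC", preProc = center_scale, trControl = cvCtrl")
produces SyntaxError: invalid syntax.
I tried to follow the syntax given in the documentation (source: https://rpy2.github.io/doc/latest/html/introduction.html?highlight=linear%20model), but guidance on how to set up this kind of code is sparse.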
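The SyntaxError itself can be reproduced without R or rpy2, because it comes from the quoting of the line rather than from caret. The following is a minimal sketch that only asks Python's parser whether each version of the line is valid; the helper name parses is hypothetical, and the "fixed" string shows just one plausible way to close the quotes.

```python
import ast

# The line as quoted in the error report: the formula string is never closed
# after "fluorescence~." and a stray quote sits before the final parenthesis.
broken = ('model_R= caret.train("fluorescence~., r_dataframe, '
          'method = "glmStepAIC", preProc = center_scale, trControl = cvCtrl")')

# The same call with the formula string closed right after the formula.
fixed = ('model_R = caret.train("fluorescence~.", data=r_dataframe, '
         'method="glmStepAIC", preProc=center_scale, trControl=cvCtrl)')


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python (names are not resolved)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print("broken line parses:", parses(broken))
print("fixed line parses:", parses(fixed))
```

Parsing successfully only means the quoting is repaired; whether caret accepts the arguments is a separate question.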
What do I have to fix in my Python code so that it runs and builds the model from my data frame?
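One possible arrangement is sketched below. It has not been verified against the data in question and rests on several assumptions: an rpy2 3.x installation (where the pandas conversion function is py2rpy rather than the older py2ri), the R packages caret and broom being installed, and the hypothetical helper names formula_string and fit_glm_step_aic. It passes the formula as an R Formula object rather than a plain string, and reads finalModel with rx2() because R's $ accessor does not map to Python attribute access. The train/test split from the R script is left out, as it is in the Python script above.

```python
def formula_string(response):
    """Build the 'response ~ .' formula text used for the caret call."""
    return f"{response} ~ ."


def fit_glm_step_aic(df, response="fluorescence", out_csv="model_glm.csv"):
    """Hypothetical helper: fit caret's glmStepAIC model on a pandas data frame."""
    # rpy2 is imported lazily so this module can be loaded without R installed.
    import rpy2.robjects as ro
    from rpy2.robjects import Formula, StrVector, pandas2ri
    from rpy2.robjects.conversion import localconverter
    from rpy2.robjects.packages import importr

    caret = importr("caret")
    broom = importr("broom")
    utils = importr("utils")

    # Assumption: rpy2 3.x conversion API; the converter is only active
    # inside this block, so the R results below are not converted back.
    with localconverter(ro.default_converter + pandas2ri.converter):
        r_df = ro.conversion.py2rpy(df)

    cv_ctrl = caret.trainControl(method="repeatedcv", number=20, repeats=100)
    model = caret.train(
        Formula(formula_string(response)),  # an R formula object, not a str
        data=r_df,
        method="glmStepAIC",
        preProc=StrVector(["center", "scale"]),
        trControl=cv_ctrl,
    )

    # Equivalent of model$finalModel in R.
    final_model = model.rx2("finalModel")
    # importr exposes R's write.csv as write_csv.
    utils.write_csv(broom.tidy(final_model), out_csv)
    return model
```

Under these assumptions it would be called as fit_glm_step_aic(pd.read_csv("my_data.csv")) after importing pandas as pd. Newer rpy2 releases document a slightly different spelling for the conversion step, so that part may need adjusting to the installed version.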