2016-03-02

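Correct specification of arguments in do.call

I am struggling a little with the do.call function. I am running several different models on a dataset.

For each model, I want to pass the model function a formula that specifies the model, the function's arguments, and the dataset.

I hope it is clear what I am trying to do. If not, please leave a comment and I will try to clarify my question.

My current code is a bit long, so here is a pseudo toy example: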

methods <- c('lm','glm',...) 
arguments <- list(list('Arguments lm '), list('Arguments glm '),...) 
models <- list(y ~ x1 + x2, y ~ x1 + x3)

for(i in 1:N) { 

current.model <- do.call(methods[i], ???) 

} 

Answer

Here is an example:

library(data.table) 
set.seed(123) 
dat <- data.table(x1=runif(10),x2=runif(10),x3=runif(10)) 
dat[,y:=x1+2*x2+3*x3+runif(10)] 

> dat
           x1         x2        x3        y
 1: 0.2875775 0.95683335 0.8895393 5.832886
 2: 0.7883051 0.45333416 0.6928034 4.675683
 3: 0.4089769 0.67757064 0.6405068 4.376344
 4: 0.8830174 0.57263340 0.9942698 5.806561
 5: 0.9404673 0.10292468 0.6557058 3.138048
 6: 0.0455565 0.89982497 0.7085305 4.448594
 7: 0.5281055 0.24608773 0.5440660 3.410939
 8: 0.8924190 0.04205953 0.5941420 2.975372
 9: 0.5514350 0.32792072 0.2891597 2.392937
10: 0.4566147 0.95450365 0.1471136 3.038589

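I am going to modify your structure slightly so that arguments becomes an explicit list of lists, and name every element of the inner lists to remove ambiguity.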

methods <- c('lm','glm') 
arguments <- list(list(data=dat), list(data=dat,family="gaussian")) 
models <- list(y ~ x1 + x2, y ~ x1 + x3)
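Naming the inner elements matters because the first formal arguments of lm and glm are not in the same order, so a purely positional argument list that works for one would be misread by the other. The following is only a small sketch to illustrate that point; the data frame `toy` and the variable names `lm_formals` and `glm_formals` are assumptions made for the illustration, not part of the original code.

```r
# The leading formal arguments differ between the two fitting functions.
lm_formals  <- names(formals(lm))[1:2]   # "formula" "data"
glm_formals <- names(formals(glm))[1:3]  # "formula" "family" "data"
print(lm_formals)
print(glm_formals)

# Toy data, assumed purely for illustration.
set.seed(1)
toy <- data.frame(x1 = runif(10), x3 = runif(10))
toy$y <- toy$x1 + 3 * toy$x3 + runif(10)

# Named elements are matched by name, so their order in the list is free;
# the single unnamed element (the formula) fills the first formal argument.
fit <- do.call("glm", list(y ~ x1 + x3, data = toy, family = "gaussian"))
print(class(fit))
```

With an unnamed list such as list(y ~ x1 + x3, toy, "gaussian"), glm would receive the data frame in the family position, which is the ambiguity the names avoid.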

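do.call takes a function and a list of arguments, so I can write something of the form do.call([an element of methods], [a list of arguments]). Since the model formula is itself an argument, I need to add it to the "additional" arguments you supply in arguments. That gives an object such as c(list(models[[1]]), arguments[[1]]). The list() around the first piece wraps the element models[[1]] in a list; because arguments[[1]] is already a list, c() can then concatenate two like objects into a single argument list.

Finally, I could call do.call on these inside your for loop, but R style prefers the *apply functions. Here I use seq_along, which simply gives me 1:length(methods), and apply an anonymous function that does the work of the for loop's body. In short, this is essentially a compact for loop that returns a list of results: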

lapply(seq_along(methods), function(n) 
    do.call(methods[n],c(list(models[[n]]),arguments[[n]]))) 

[[1]] 

Call: 
lm(formula = y ~ x1 + x2, data = structure(list(x1 = c(0.563014860032126, 
0.211994701065123, 0.174694777932018, 0.135693877004087, 0.460017150267959, 
0.736233349423856, 0.63450039527379, 0.652027820236981, 0.467176814330742, 
0.148995384806767), x2 = c(0.0307870297692716, 0.601646583992988, 
0.812958373920992, 0.698285705409944, 0.907962741097435, 0.75469194049947, 
0.0430496339686215, 0.0829190369695425, 0.109014765359461, 0.33699565846473 
), x3 = c(0.412113963160664, 0.432729347608984, 0.0741072639357299, 
0.382540747756138, 0.0340626831166446, 0.624421828892082, 0.179525560466573, 
0.884322474710643, 0.548561444506049, 0.0785303884185851), y = c(2.22677680454217, 
3.11262330505997, 2.2728122510016, 3.08046812936664, 3.15304983314127, 
4.43124474911019, 1.85952415782958, 4.28254768694751, 2.72436442878097, 
1.10656954837032)), .Names = c("x1", "x2", "x3", "y"), row.names = c(NA, 
-10L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x7fbe1a050b78>)) 

Coefficients:
(Intercept)           x1           x2  
     0.8162       3.1206       1.6057  


[[2]] 

Call: glm(formula = y ~ x1 + x3, family = "gaussian", data = structure(list(
    x1 = c(0.563014860032126, 0.211994701065123, 0.174694777932018, 
    0.135693877004087, 0.460017150267959, 0.736233349423856, 
    0.63450039527379, 0.652027820236981, 0.467176814330742, 0.148995384806767 
    ), x2 = c(0.0307870297692716, 0.601646583992988, 0.812958373920992, 
    0.698285705409944, 0.907962741097435, 0.75469194049947, 0.0430496339686215, 
    0.0829190369695425, 0.109014765359461, 0.33699565846473), 
    x3 = c(0.412113963160664, 0.432729347608984, 0.0741072639357299, 
    0.382540747756138, 0.0340626831166446, 0.624421828892082, 
    0.179525560466573, 0.884322474710643, 0.548561444506049, 
    0.0785303884185851), y = c(2.22677680454217, 3.11262330505997, 
    2.2728122510016, 3.08046812936664, 3.15304983314127, 4.43124474911019, 
    1.85952415782958, 4.28254768694751, 2.72436442878097, 1.10656954837032 
    )), .Names = c("x1", "x2", "x3", "y"), row.names = c(NA, 
-10L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x7fbe1a050b78>)) 

Coefficients:
(Intercept)           x1           x3  
     1.6630       0.6408       2.4483  

Degrees of Freedom: 9 Total (i.e. Null); 7 Residual 
Null Deviance:  9.518 
Residual Deviance: 4.269 AIC: 27.87 
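The Call: entries above contain the whole dataset because do.call evaluates each argument before it builds the call, so the fitted object records the data itself rather than its name. One common way to keep the recorded call short is to pass the data as an unevaluated symbol with quote(). The sketch below is a variation on the answer's code, not the answer's own method, and it assumes a base data.frame in place of the data.table so that it runs without extra packages.

```r
# Assumption: a base data.frame stands in for the data.table used above.
set.seed(123)
dat <- data.frame(x1 = runif(10), x2 = runif(10), x3 = runif(10))
dat$y <- dat$x1 + 2 * dat$x2 + 3 * dat$x3 + runif(10)

methods <- c("lm", "glm")
models  <- list(y ~ x1 + x2, y ~ x1 + x3)

# quote(dat) stores the symbol `dat` rather than the evaluated table,
# so the call recorded in each fit reads `data = dat`.
arguments <- list(list(data = quote(dat)),
                  list(data = quote(dat), family = "gaussian"))

x <- lapply(seq_along(methods), function(n)
  do.call(methods[n], c(list(models[[n]]), arguments[[n]])))

print(x[[1]]$call)
print(x[[2]]$call)
```

The symbol is only looked up when the model function runs, so `dat` has to be visible from the place where do.call is called.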

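It is straightforward to drill into your results. If I call this object x, then x[[1]], for example, is the first fitted model, and I can interact with it using the standard suite of functions: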

> coefficients(x[[1]])
(Intercept)          x1          x2 
  0.8162068   3.1205587   1.6057346 
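The same idea extends to all fitted models at once. As a sketch, the for loop from the question can fill a list that is then named after the methods, which allows the results to be addressed by name as well as by position. The names `coefs` and `aics` are hypothetical, and a base data.frame is again assumed in place of the data.table.

```r
# Assumption: a base data.frame stands in for the data.table used above.
set.seed(123)
dat <- data.frame(x1 = runif(10), x2 = runif(10), x3 = runif(10))
dat$y <- dat$x1 + 2 * dat$x2 + 3 * dat$x3 + runif(10)

methods   <- c("lm", "glm")
arguments <- list(list(data = dat), list(data = dat, family = "gaussian"))
models    <- list(y ~ x1 + x2, y ~ x1 + x3)

# For-loop form of the lapply call: preallocate, then fill one fit per method.
x <- vector("list", length(methods))
for (i in seq_along(methods)) {
  x[[i]] <- do.call(methods[i], c(list(models[[i]]), arguments[[i]]))
}
names(x) <- methods

# Apply standard extractor functions to every fitted model.
coefs <- lapply(x, coefficients)  # list of named coefficient vectors
aics  <- sapply(x, AIC)           # one AIC value per model
print(coefs$lm)
print(aics)
```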