2014-09-01 48 views

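Regression with subsets and baseline specification + varying variables

I want to run a regression on a baseline specification and then add seven additional variables, one at a time (i.e. 8 regressions). I want to do this for two subsets of the data frame and for two subsets of the additional variables.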

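I then want to save the output of these 8x2x2 = 32 regressions with stargazer, one file per subset combination (so 4 files). As you can imagine, typing all of this out by hand is a lot of work. Some answers on SO are related (e.g. using ddply), but I struggle with the combinations, and especially with the fact that the baseline variables stay the same in every regression.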

Here is my data, reduced to the baseline variables (the controls) and two additional variables:

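# All columns need the same number of rows (5 here)
Two.Year <- 1:5
Length <- 4:8
NumAck <- 8:12
degree_max <- 15:19
degree_median <- 16:20
katz_max <- 19:23
katz_median <- 23:27
Year <- rep(c("early", "late"), times = c(2, 3))

# data.frame() keeps numeric columns numeric; cbind() would coerce everything to character
Master <- data.frame(
    Two.Year, Length, NumAck, degree_max, degree_median, katz_max, katz_median, Year
)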

The two data subsets are defined by the levels of the variable Year.

The baseline regression is

lm(Two.Year ~ Length + NumAck,
    data = subset(Master, Year == "early")
)

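The second and third regressions each add one of the variables ending in _max, i.e. lm(Two.Year ~ Length + NumAck + degree_max, data=Master) and lm(Two.Year ~ Length + NumAck + katz_max, data=Master). This is the second kind of subset: all variables ending in _max on the one hand, and all variables ending in _median on the other. So far I pick these out with grepl("_median", names(Master)) and grepl("_max", names(Master)).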
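As a minimal sketch of how those patterns can be turned into the two groups of variable names: the data frame below is only a stand-in with the same column names as in the question, and var_groups is a name introduced here for illustration.

```r
# Stand-in for Master with the column names used in the question
Master <- data.frame(
  Two.Year = 1:5, Length = 4:8, NumAck = 8:12,
  degree_max = 15:19, degree_median = 16:20,
  katz_max = 19:23, katz_median = 23:27
)

# grep(..., value = TRUE) returns the matching names directly;
# the "$" anchors the pattern at the end of the column name
max_vars    <- grep("_max$",    names(Master), value = TRUE)
median_vars <- grep("_median$", names(Master), value = TRUE)

# Illustrative container: one character vector per variable subset
var_groups <- list(max = max_vars, median = median_vars)
print(var_groups)
```

Working with names rather than logical vectors makes it easier to paste each variable into a formula later on.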

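As mentioned, I want to save the output grouped by subset combination. That is, all regressions for (I) early and max, (II) early and median, (III) late and max, and (IV) late and median.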
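The four groups can be enumerated programmatically instead of by hand. The following is a small sketch; the column names period and type and the label format are assumptions made for illustration.

```r
# One row per combination of data subset and variable subset
combos <- expand.grid(
  period = c("early", "late"),
  type   = c("max", "median"),
  stringsAsFactors = FALSE
)

# Illustrative label per group, e.g. for naming output files
combos$label <- paste(combos$period, combos$type, sep = "_")
print(combos)
```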

So far I have tried

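library(plyr)

Master.subset <- split(Master, Master$Year)
# Drop the grouping column inside each piece; it is constant within a group
ols <- ddply(Master[Master$Year %in% c("early", "late"), ], "Year",
    function(d) coefficients(lm(Two.Year ~ ., data = d[names(d) != "Year"])))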

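and then saving ols with stargazer(). From here I do not know how to select variable subsets other than by building actual subsets of the data frame, nor how to keep the baseline variables in every model.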

How can I do this? Any hints are much appreciated!

Answer


This is what I ended up doing.

First I define the variables used by the three loops:

# Baseline model as a string, so that further terms can be pasted on
baseline <- "Two.Year ~ Length + NumAck"
# Model-specific variables (prefixes of the additional variables)
measures <- c("degree", "katz")
# Levels of the variable that determines the data subset
timepoints <- c("early", "late")
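To illustrate how the baseline string combines with the measures, here is a short sketch that builds every extended formula up front. The types vector is an assumption based on the suffixes in the question, and extra and formulas are names introduced here.

```r
baseline <- "Two.Year ~ Length + NumAck"
measures <- c("degree", "katz")
types    <- c("max", "median")   # assumed suffixes, taken from the question

# All measure/type pairs as a plain character vector, e.g. "degree_max"
extra <- as.vector(outer(measures, types, paste, sep = "_"))

# Append each extra variable to the baseline and convert to a formula
formulas <- lapply(extra, function(v) as.formula(paste(baseline, v, sep = " + ")))
names(formulas) <- extra

print(formulas)
```

Keeping the baseline as a string means the control variables are written once and reused in every model.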

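Then I constructed three nested loops, so that the variable subset, the data frame subset and the formula are each defined inside the appropriate loop. At the end of the second loop I write the output to a file with stargazer.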

library(stargazer)

# First loop: the data subset
for (tmpnt in timepoints) {
    d <- subset(Master, Year == tmpnt)
    # Baseline model, which is constant within tmpnt
    base_fit <- lm(as.formula(baseline), data = d)
    # Second loop: the variable subset (add "mean" if Master has *_mean columns)
    for (type in c("median", "max")) {
        # Start a fresh list of models for every output file
        ols <- list(baseline = base_fit)
        # Third loop: baseline plus one additional variable per model
        for (msr in measures) {
            ols[[msr]] <- lm(as.formula(paste(baseline, paste(msr, type, sep = "_"), sep = " + ")),
                             data = d)
        }
        # Write output to file; the file name pattern here is an assumption
        stargazer(ols,
                  title = paste("Regression output for", tmpnt, "subsample using", type),
                  out = paste0("ols_", tmpnt, "_", type, ".tex"))
    }
}
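For comparison, here is a sketch of the same idea using split() and a small helper function, in base R only. The data are simulated stand-ins with the column names from the question, and fit_group and results are hypothetical names, not part of the approach above.

```r
set.seed(1)
n <- 40
# Simulated stand-in data; the real Master is assumed to have these columns
Master <- data.frame(
  Two.Year = rnorm(n), Length = rnorm(n), NumAck = rnorm(n),
  degree_max = rnorm(n), degree_median = rnorm(n),
  katz_max = rnorm(n), katz_median = rnorm(n),
  Year = rep(c("early", "late"), each = n / 2)
)

baseline <- "Two.Year ~ Length + NumAck"
measures <- c("degree", "katz")
types    <- c("max", "median")

# Fit the baseline plus one model per measure for a given subset and type
fit_group <- function(d, type) {
  rhs <- c(baseline = "", setNames(paste0(" + ", measures, "_", type), measures))
  lapply(rhs, function(extra) lm(as.formula(paste0(baseline, extra)), data = d))
}

# One list of models per period/type combination (4 groups in total)
by_period <- split(Master, Master$Year)
results <- list()
for (period in names(by_period)) {
  for (type in types) {
    results[[paste(period, type, sep = "_")]] <- fit_group(by_period[[period]], type)
  }
}

# Each group holds three models: baseline, degree, katz
str(lapply(results, names))
```

Each element of results is a plain list of lm objects, so it should be usable in the same way as ols in the loop above, for example by passing it to stargazer() together with a title and an output file.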