2017-05-11

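R plm lag: what is the equivalent of L1.x in Stata?

Using the plm package in R to fit a fixed-effects model, what is the correct syntax for adding a lagged variable to the model? I am looking for something similar to the 'L1.variable' command in Stata.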

Here is my attempt at adding a lagged variable (this is a test model, so it may not make much sense):

library(plm) 
library(foreign) 
nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta") 
pnlswork <- plm.data(nlswork, c('idcode', 'year')) 
ffe <- plm(ln_wage ~ ttl_exp+lag(wks_work,1) 
      , model = 'within' 
      , data = nlswork) 
summary(ffe) 

This outputs:

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + lag(wks_work), data = nlswork, 
    model = "within") 

Unbalanced Panel: n=3911, T=1-14, N=19619 

Residuals : 
    Min. 1st Qu. Median 3rd Qu.  Max. 
-1.77000 -0.10100 0.00293 0.11000 2.90000 

Coefficients : 
       Estimate Std. Error t-value Pr(>|t|)  
ttl_exp  0.02341057 0.00073832 31.7078 < 2.2e-16 *** 
lag(wks_work) 0.00081576 0.00010628 7.6755 1.744e-14 *** 
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 1296.9 
Residual Sum of Squares: 1126.9 
R-Squared:  0.13105 
Adj. R-Squared: -0.085379 
F-statistic: 1184.39 on 2 and 15706 DF, p-value: < 2.22e-16 

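However, I get different results from what Stata produces.

In my actual model, I would like to instrument an endogenous variable with its lagged value.

Thanks!

For reference, here is the Stata code: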

webuse nlswork.dta 
xtset idcode year 
xtreg ln_wage ttl_exp L1.wks_work, fe 

Stata output:

Fixed-effects (within) regression    Number of obs  =  10,680 
Group variable: idcode       Number of groups =  3,671 

R-sq:           Obs per group: 
    within = 0.1492           min =   1 
    between = 0.2063           avg =  2.9 
    overall = 0.1483           max =   8 

               F(2,7007)   =  614.60 
corr(u_i, Xb) = 0.1329       Prob > F   =  0.0000 

------------------------------------------------------------------------------ 
    ln_wage |  Coef. Std. Err.  t P>|t|  [95% Conf. Interval] 
-------------+---------------------------------------------------------------- 
    ttl_exp | .0192578 .0012233 15.74 0.000  .0168597 .0216558 
      | 
    wks_work | 
     L1. | .0015891 .0001957  8.12 0.000  .0012054 .0019728 
      | 
     _cons | 1.502879 .0075431 199.24 0.000  1.488092 1.517666 
-------------+---------------------------------------------------------------- 
    sigma_u | .40678942 
    sigma_e | .28124886 
     rho | .67658275 (fraction of variance due to u_i) 
------------------------------------------------------------------------------ 
F test that all u_i=0: F(3670, 7007) = 4.71     Prob > F = 0.0000 

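Answer

lag() as implemented in plm lags the observations row by row without "looking" at the time variable, i.e. it simply shifts the variable within each individual. If there are gaps in the time dimension, you probably want to take the value of the time variable into account. There is the (as of now) unexported function plm:::lagt.pseries, which takes the time variable into account and therefore handles gaps in the data the way you would expect.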

library(plm) 
library(foreign) 
nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta") 
pnlswork <- pdata.frame(nlswork, c('idcode', 'year')) 
ffe <- plm(ln_wage ~ ttl_exp + plm:::lagt.pseries(wks_work,1) 
      , model = 'within' 
      , data = pnlswork) 
summary(ffe) 

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + plm:::lagt.pseries(wks_work, 
    1), data = nlswork, model = "within") 

Unbalanced Panel: n=3671, T=1-8, N=10680 

Residuals : 
    Min. 1st Qu. Median 3rd Qu. Max. 
-1.5900 -0.0859 0.0000 0.0957 2.5600 

Coefficients : 
            Estimate Std. Error t-value Pr(>|t|)  
ttl_exp       0.01925775 0.00122330 15.7425 < 2.2e-16 *** 
plm:::lagt.pseries(wks_work, 1) 0.00158907 0.00019573 8.1186 5.525e-16 *** 
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 651.49 
Residual Sum of Squares: 554.26 
R-Squared:  0.14924 
Adj. R-Squared: -0.29659 
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16 
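
To make the difference between a row-wise lag and a time-aware lag concrete, here is a minimal base-R sketch on a made-up panel. The data frame `toy` and the helpers `lag_by_row()` and `lag_by_time()` are hypothetical names used only for illustration; they are not part of plm and only mimic the two behaviours described above.

```r
# Toy panel (made up): individual 1 is observed in 2001, 2002 and 2004,
# so 2003 is a gap; individual 2 has no gaps.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  year = c(2001, 2002, 2004, 2001, 2002, 2003),
  x    = c(10, 20, 40, 1, 2, 3)
)

# Row-wise lag: shift x down by one row within each individual,
# ignoring what the year column says.
lag_by_row <- function(x, id) {
  ave(x, id, FUN = function(v) c(NA, head(v, -1)))
}

# Time-aware lag: for each row, look up the same individual's value
# at year - 1; the result is NA when that year was not observed.
lag_by_time <- function(x, id, time) {
  key      <- paste(id, time)
  prev_key <- paste(id, time - 1)
  x[match(prev_key, key)]
}

toy$x_lag_row  <- lag_by_row(toy$x, toy$id)
toy$x_lag_time <- lag_by_time(toy$x, toy$id, toy$year)
print(toy)
```

For individual 1 in 2004, the row-wise lag picks up the 2002 value (20), whereas the time-aware lag is NA because 2003 was never observed. The NA drops the row from the estimation sample, which would explain why the time-aware fit uses fewer observations than the row-wise one.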

Btw1: Better use pdata.frame() instead of plm.data().

Btw2: You can check for gaps in your data with plm's is.pconsecutive():

is.pconsecutive(pnlswork) 
all(is.pconsecutive(pnlswork)) 
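
As a rough illustration of what such a gap check looks at, the following base-R sketch flags individuals whose time periods are not consecutive. The data frame `toy` and the name `has_gap` are made up for this example, and the sketch assumes the time variable is an integer-valued year with a step of one; it is not plm's implementation.

```r
# Toy panel (made up): individual 1 skips the year 2003.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  year = c(2001, 2002, 2004, 2001, 2002, 2003)
)

# For each individual, sort the years and check whether any step
# between neighbouring observations differs from 1.
has_gap <- tapply(toy$year, toy$id, function(y) any(diff(sort(y)) != 1))

print(has_gap)        # one logical value per individual
print(any(has_gap))   # TRUE if at least one individual has a gap
```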

You could also make the data consecutive first and then use lag(), like this:

pnlswork2 <- make.pconsecutive(pnlswork) 
pnlswork2$wks_work_lag <- lag(pnlswork2$wks_work) 
ffe2 <- plm(ln_wage ~ ttl_exp + wks_work_lag 
      , model = 'within' 
      , data = pnlswork2) 
summary(ffe2) 

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + wks_work_lag, data = pnlswork2, 
    model = "within") 

Unbalanced Panel: n=3671, T=1-8, N=10680 

Residuals : 
    Min. 1st Qu. Median 3rd Qu. Max. 
-1.5900 -0.0859 0.0000 0.0957 2.5600 

Coefficients : 
       Estimate Std. Error t-value Pr(>|t|)  
ttl_exp  0.01925775 0.00122330 15.7425 < 2.2e-16 *** 
wks_work_lag 0.00158907 0.00019573 8.1186 5.525e-16 *** 
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 651.49 
Residual Sum of Squares: 554.26 
R-Squared:  0.14924 
Adj. R-Squared: -0.29659 
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16 

Or simply:

ffe3 <- plm(ln_wage ~ ttl_exp + lag(wks_work) 
      , model = 'within' 
      , data = pnlswork2) # note: it is the consecutive panel data set here 
summary(ffe3) 

Oneway (individual) effect Within Model 

Call: 
plm(formula = ln_wage ~ ttl_exp + lag(wks_work), data = pnlswork2, 
    model = "within") 

Unbalanced Panel: n=3671, T=1-8, N=10680 

Residuals : 
    Min. 1st Qu. Median 3rd Qu. Max. 
-1.5900 -0.0859 0.0000 0.0957 2.5600 

Coefficients : 
       Estimate Std. Error t-value Pr(>|t|)  
ttl_exp  0.01925775 0.00122330 15.7425 < 2.2e-16 *** 
lag(wks_work) 0.00158907 0.00019573 8.1186 5.525e-16 *** 
--- 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Total Sum of Squares: 651.49 
Residual Sum of Squares: 554.26 
R-Squared:  0.14924 
Adj. R-Squared: -0.29659 
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16 
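
The idea behind making the panel consecutive before lagging can be sketched in base R as follows. This is only an illustration on made-up data, not what make.pconsecutive() does internally; `toy`, `full_years`, `padded` and `result` are hypothetical names, and the final filtering step assumes the original `x` contains no missing values.

```r
# Toy panel (made up): individual 1 skips the year 2003.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  year = c(2001, 2002, 2004, 2001, 2002, 2003),
  x    = c(10, 20, 40, 1, 2, 3)
)

# For each individual, list every year between its first and last observation.
full_years <- do.call(rbind, lapply(split(toy, toy$id), function(d) {
  data.frame(id = d$id[1], year = seq(min(d$year), max(d$year)))
}))

# Left-join the observed data; the inserted years get x = NA.
padded <- merge(full_years, toy, by = c("id", "year"), all.x = TRUE)
padded <- padded[order(padded$id, padded$year), ]

# With the gaps filled by NA rows, a plain row-wise lag within each
# individual now lines up with the calendar.
padded$x_lag <- ave(padded$x, padded$id, FUN = function(v) c(NA, head(v, -1)))

# Keep only the originally observed rows (assumes x had no NAs to begin with).
result <- padded[!is.na(padded$x), ]
print(result)
```

After padding, the 2004 row of individual 1 gets an NA lag instead of the 2002 value, so the row-wise shift gives the same result as a lag that consults the time variable.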
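
Thanks, Helix123! Is there a way to get Stata's 'sigma_u' using plm/lfe? I read your comment: [link] https://stats.stackexchange.com/a/228806/, but I could not replicate the result; I got this error: 'non-conformable arguments'. Also, what about R-sq: within, between, overall?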

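
That is a lot of questions at once... To replicate my answer from the link for the within model, you need to drop the intercept from the data, i.e. insert X <- X[, -1] after the first line. – Helix123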