2014-10-07

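12-month lag in unbalanced panel data

I am trying to add a new variable to my panel data: a 12-month lag of Plasma_mean. The Plasma_mean series starts 12 months before the other observations, which is why the other variables are NA at the head of the dataset.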

ProdGrp timeperiod Plasma_mean Mark.Invest_mean Reps_mean repcost_mean Sales_sum  Pcs_vol_sum 
    1:    1/1/2003  948881    NA  NA   NA  NA    NA 
    2:    2/1/2003  787974    NA  NA   NA  NA   NA 
    3:    3/1/2003  872733    NA  NA   NA  NA   NA 
    4:    4/1/2003  932405    NA  NA   NA  NA   NA 
    5:    5/1/2003  922127    NA  NA   NA  NA   NA 
---                         
155: Product A 4/1/2010  1325862   36362.49  1.33  14436.66 168874.9    718 
156: Product B 5/1/2010  1253672   53821.38  8.17  14336.67 1989798.9  4549 
157: Product A 5/1/2010  1253672   37146.27  1.33  14436.66 152519.5   596 
158: Product B 6/1/2010  1334744   69749.48  8.17  14336.67 1978877.4  4612 
159: Product A 6/1/2010  1334744   38093.63  1.33  14436.66 164404.0   689 

gProt_vol_sum pckg_price_mean g_Prot_price_mean TotalpharmaBiosales_mean  dollarized_reps_mean  dates 
    1:   NA    NA    NA      NA     NA 2003-01-01 
    2:   NA    NA    NA      NA     NA 2003-02-01 
    3:   NA    NA    NA      NA     NA 2003-03-01 
    4:   NA    NA    NA      NA     NA 2003-04-01 
    5:   NA    NA    NA      NA     NA 2003-05-01 
---                           
    155:  2378.5  191.0250   76.88328     6023500    19200.76 2010-04-01 
    156:  40109.5  288.6149   49.80379     6135394   30.59 2010-05-01 
    157:  2204.0  187.4431   76.11616     6135394    19200.76 2010-05-01 
    158:  41776.0  298.1715   55.74162     8673498   117130.59 2010-06-01 
    159:  2305.5  190.6980   76.77850     8673498    19200.76 2010-06-01 
      plasma_lagged 
     1:   NA 
     2:   NA 
     3:   NA 
     4:   NA 
     5:   NA 
    ---    
    155:   NA 
    156:   NA 
    157:   NA 
    158:   NA 
    159:   NA 

Using the data.table package, this is what I did:

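lag <- function(Plasma_mean, n = 12L, along_with){
  # For each row, find the position of the row whose date equals (date - n); NA if there is none
  index <- match(along_with - n, along_with, incomparables = NA)
  # Pull the lagged values and keep the attributes of the input
  out <- Plasma_mean[index]
  attributes(out) <- attributes(Plasma_mean)
  out
}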

and then applied it by product group:

DT[, plasma_lagged := lag(Plasma_mean, 12, along_with = dates), by = ProdGrp] 

This attaches to my dataset, and I get the plasma_lagged variable as the last column. But it seems to contain no data (see the rows from observation 155 onward).

Any hints on how to solve this would be great.

H


What debugging have you done? Does your `lag` function work on a subset of the data for a single `ProdGrp`? – Gregor 2014-10-07 18:41:46
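A minimal sketch of the kind of check this comment suggests, run on made-up monthly dates rather than the asker's data. The `dates` vector here is hypothetical and assumed to be of class `Date`; the code only reproduces the index lookup that `lag` performs for a single group:

```r
# Hypothetical first-of-month dates standing in for one ProdGrp's `dates` column
dates <- seq(as.Date("2009-01-01"), by = "month", length.out = 18)

# Subtracting a bare number from a Date shifts it by that many days
shifted <- dates[13] - 12
print(shifted)

# The same lookup the question's function does, with n = 12
idx <- match(dates - 12, dates)
print(idx)
```

If `idx` comes back all `NA` on the real data as well, the lookup never finds a matching date, which would be consistent with an empty `plasma_lagged` column.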

Answer


You are lagging by 12 days, not 12 months: subtracting the number 12 from a date moves it back 12 days. Try

library(lubridate) 
DT[, plasma_lagged := lag(Plasma_mean, months(12), along_with = dates), by = ProdGrp] 

(And please provide a reproducible example with your code; otherwise I cannot guarantee that this works.)
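For illustration, here is a small self-contained example of the suggestion above. The panel is made up, `lag_by` is a hypothetical name (a stripped-down copy of the question's function, renamed so it does not mask `stats::lag`), and `dates` is assumed to be a `Date` column holding first-of-month dates. It is a sketch under those assumptions, not a tested fix for the original dataset:

```r
library(data.table)
library(lubridate)

# Made-up unbalanced panel: Product A has 18 months, Product B only 14
DT <- rbind(
  data.table(ProdGrp = "Product A",
             dates = seq(as.Date("2009-01-01"), by = "month", length.out = 18)),
  data.table(ProdGrp = "Product B",
             dates = seq(as.Date("2009-05-01"), by = "month", length.out = 14))
)
# Dummy values 1, 2, 3, ... within each group so the lag is easy to read
DT[, Plasma_mean := as.numeric(seq_len(.N)), by = ProdGrp]

# Hypothetical helper: look up the row whose date is n earlier, NA if none
lag_by <- function(x, n, along_with) {
  index <- match(along_with - n, along_with)
  x[index]
}

# months(12) is a lubridate Period, so the subtraction moves by calendar months
DT[, plasma_lagged := lag_by(Plasma_mean, months(12), along_with = dates),
   by = ProdGrp]

print(DT[!is.na(plasma_lagged)])
```

Only rows that have an observation exactly 12 months earlier within the same `ProdGrp` receive a value; all others stay `NA`, which is the expected behaviour for an unbalanced panel.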
