2014-02-21

I have a data frame, shown below, and runSD fails on it when there are null values:

> dput(head(dt[,c("IBC","FYEAR","GVKEY")],20)) 
structure(list(IBC = c(1.138, 2.576, NA, 0.236, 0.793, -0.525, -7.838, -2.554, 9.071, 11.506, 15.361, 21.233, 24.814, 25.655, NA, 10.02, 0.283, 9.484, 10.463, 16.012), 
       FYEAR = c(1984L, 1985L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L), GVKEY = c(1001L, 1001L, 1003L, 1003L, 1003L, 1003L, 1003L, 1003L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L)), 
       .Names = c("IBC", "FYEAR", "GVKEY"), 
       row.names = c(NA, 20L), class = "data.frame") 

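The code below computes the standard deviation of every consecutive run of 4 elements in the column IBC. However, it throws an error if there are NULL values. How can I modify the code below to accommodate NULL values?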

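library(TTR)  # provides runSD()

dt <- dt[order(dt$GVKEY, dt$FYEAR), ]
# Per GVKEY: 4-period running SD of IBC, shifted down one row
dt$STDEARN <- ave(dt$IBC, dt$GVKEY, FUN = function(x) {
  if (length(x) > 3) c(NA, head(runSD(x, 4), -1))
  else rep(NA, length(x))  # too few rows for a 4-period window
})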

An atomic vector can't contain 'NULL' values. Please provide a reproducible example. –


A reproducible example has been provided. – Sumit


If the data contains 'NA', what output do you expect from 'runSD'? –

Answer


I suggest using rollapply instead of runSD, because it lets you pass na.rm = TRUE to sd:

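dt <- dt[order(dt$GVKEY, dt$FYEAR), ]

library(xts)  # attaches zoo, which provides rollapply()
# Per GVKEY: SD of the previous 4 IBC values, ignoring NAs inside the
# window; groups too short for a full lagged window get NA.
transform(dt, STDEARN = ave(IBC, GVKEY, FUN = function(x)
  if (length(x) > 3)
    c(rep(NA, 3), head(rollapply(x, 4, sd,
                                 na.rm = TRUE,
                                 fill = NA), -3))
  else NA))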

      IBC FYEAR GVKEY   STDEARN
1   1.138  1984  1001        NA
2   2.576  1985  1001        NA
3      NA  1984  1003        NA
4   0.236  1985  1003        NA
5   0.793  1986  1003        NA
6  -0.525  1987  1003        NA
7  -7.838  1988  1003  0.661626
8  -2.554  1989  1003  4.039287
9   9.071  1984  1004        NA
10 11.506  1985  1004        NA
11 15.361  1986  1004        NA
12 21.233  1987  1004        NA
13 24.814  1988  1004  5.302228
14 25.655  1989  1004  5.938866
15     NA  1990  1004  4.680553
16 10.020  1991  1004  2.348224
17  0.283  1992  1004  8.794155
18  9.484  1993  1004 12.799745
19 10.463  1994  1004  5.473495
20 16.012  1995  1004  4.869479
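
To make explicit what the STDEARN column holds, here is a minimal base-R sketch of the same idea: for each row, the SD of the previous 4 values, with NAs dropped inside the window. The helper name lagged_roll_sd is hypothetical (it is not part of zoo or TTR), and the sketch assumes the lagged alignment seen in the output above, where row 7 is the SD of rows 3 to 6.

```r
# Hypothetical helper, not part of zoo or TTR: SD of the previous `width`
# observations, ignoring NAs inside each window.
lagged_roll_sd <- function(x, width = 4) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n > width) {
    for (i in (width + 1):n) {
      # window covers the `width` values strictly before position i
      out[i] <- sd(x[(i - width):(i - 1)], na.rm = TRUE)
    }
  }
  out
}

# IBC values for GVKEY 1003 from the example data
ibc_1003 <- c(NA, 0.236, 0.793, -0.525, -7.838, -2.554)
res <- lagged_roll_sd(ibc_1003)
print(round(res, 6))
```

Because ave calls FUN with a single vector, this should drop in as dt$STDEARN <- ave(dt$IBC, dt$GVKEY, FUN = lagged_roll_sd). An explicit loop is slower than rollapply on long series, but it makes the window alignment easy to check by hand.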