colwise eats column names inside ddply

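I am trying to go through a data frame in chunks, find the sub-data-frames that are unbalanced, and add 0 values for the factor levels that are missing. To do this, inside ddply, I quickly compare the chunk against a vector holding the set of levels the factor should have, create some new rows by replicating the first row of the sub-dataset and modifying their values, and then rbind them onto the old dataset.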

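I use colwise to do the replication.

This works great outside of ddply. Inside ddply, the identifying columns get eaten, and rbind then fails on me. It is curious behavior. See the code below, with a few debugging print statements thrown in, for the difference in results: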

library(plyr)

#a test data frame
g <- data.frame(a=letters[1:5], b=1:5)

#repeat rows using colwise 
rep.row <- function(r, n){ 
    colwise(function(x) rep(x, n))(r) 
} 

#if I want to do this with just one row, I get all of the columns 
rep.row(g[1,],5)

That works fine. It prints:

a b 
1 a 1 
2 a 1 
3 a 1 
4 a 1 
5 a 1 

#but, as soon as I use ddply to create some new data 
#and try and smoosh it to the old data, I get errors 
ddply(g, .(a), function(x) { 

    newrows <- rep.row(x[1,],5) 
    newrows$b<-0 
    rbind(x, newrows) 

})

This gives:

Error in rbind(deparse.level, ...) : 
    numbers of columns of arguments do not match

You can see the problem with this debugging version:

#So, what is going on here? 
ddply(g, .(a), function(x) { 
    newrows <- rep.row(x[1,],5) 
    newrows$b<-0 
    print(x) 
    print("\n\n") 
    print(newrows) 
    rbind(x, newrows) 

})

As you can see, x and newrows have different columns: newrows has lost a.

a b 
1 a 1 
[1] "\n\n" 
    b 
1 0 
2 0 
3 0 
4 0 
5 0 
Error in rbind(deparse.level, ...) : 
    numbers of columns of arguments do not match

What is going on here? Why do the identifying columns get eaten when I use colwise on the sub-data-frame?

Source

2013-05-31 jebyrnes

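This appears to be an interesting interaction between ddply and colwise. More specifically, the problem occurs when colwise calls strip_splits and finds the vars attribute that ddply attached to the piece.

As a workaround, try putting this as the first line of your function: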
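The mechanism can probably be reproduced without ddply at all. The sketch below tags a one-row data frame with a vars attribute by hand, which is only an assumption about how ddply marks its pieces (based on the explanation above), and then compares what strip_splits and colwise keep. The names piece, kept and repeated are made up for illustration.

```r
library(plyr)

g <- data.frame(a = letters[1:5], b = 1:5)

# Take one row and mark "a" as a splitting variable by hand.
# Assumption: this mimics the "vars" attribute ddply puts on each piece.
piece <- g[1, ]
attr(piece, "vars") <- "a"

# strip_splits() drops every column named in the "vars" attribute.
kept <- names(strip_splits(piece))

# colwise() goes through the same stripping before applying the function,
# so the tagged column should be missing from its result as well.
repeated <- names(colwise(function(x) rep(x, 2))(piece))

print(kept)
print(repeated)
```

If both calls report only b, that matches the behaviour seen in the debugging output from the question.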

    attr(x, "vars") <- NULL
    # your code follows
    newrows <- rep.row(x[1,],5)
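For context, here is a minimal sketch of the workaround placed inside the ddply call from the question. It reuses g and rep.row as defined there; the name res is added here only so the result can be inspected. Treat it as an illustration of the suggestion rather than a tested fix for every plyr version.

```r
library(plyr)

g <- data.frame(a = letters[1:5], b = 1:5)

# repeat rows using colwise, as in the question
rep.row <- function(r, n) {
  colwise(function(x) rep(x, n))(r)
}

res <- ddply(g, .(a), function(x) {
  # drop the attribute that makes colwise strip the splitting column
  attr(x, "vars") <- NULL

  newrows <- rep.row(x[1, ], 5)
  newrows$b <- 0
  rbind(x, newrows)
})

print(dim(res))
print(names(res))
```

With the attribute cleared, newrows should keep both a and b, so rbind sees matching columns and each group comes back as its original row plus five zero-valued copies.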

Source

2013-05-31 01:23:42 baptiste
