2011-11-10 17 views

Subtracting a small data.frame from a large data.frame by a grouping variable

I have a very large data set, like the following:

mdf <- data.frame (sn = 1:40, var = rep(1:10, 4), block = rep(1:4, each = 10), 
yld = c(1:40)) 

I also have a small data set that holds the block means:

blockdf <- data.frame(block = 1:4, yld = c(10, 20, 30, 40)) # block means 

All variables except yld are factors.

I want to subtract the block means (blockdf$yld) from mdf$yld, so that each block mean is applied to the rows of the matching block in the mdf data frame.

For example: the value 10 will be subtracted from the yld of every var
    in the first block of mdf,
        20 from the yld of every var in the second block of mdf,

and so on.

Please note that the number of replicates within var may be unbalanced, so I want to write this in a way that can handle the unbalanced case.

Answers


This should do the trick:

block_match <- match(mdf$block, blockdf$block) 
transform(mdf, yld = yld - blockdf[block_match, 'yld']) 
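
To see how the match() approach behaves with unbalanced replication, here is a minimal sketch. The data frame mdf_unbal and its values are made up for illustration and are not from the question; only the match()/transform() pattern comes from the answer above.

```r
# Hypothetical unbalanced data: block 1 has 3 rows, block 2 has 1, block 3 has 2,
# and block 4 does not occur at all.
mdf_unbal <- data.frame(sn    = 1:6,
                        var   = c(1, 2, 3, 1, 1, 2),
                        block = c(1, 1, 1, 2, 3, 3),
                        yld   = c(12, 15, 11, 25, 33, 38))
blockdf <- data.frame(block = 1:4, yld = c(10, 20, 30, 40))

# For each row of mdf_unbal, match() gives the row of blockdf with the same block,
# so the lookup does not depend on how many rows each block has.
idx <- match(mdf_unbal$block, blockdf$block)
adjusted <- transform(mdf_unbal, yld = yld - blockdf$yld[idx])
print(adjusted)
```

If a block in the large data frame has no entry in blockdf, match() returns NA for those rows and the adjusted yld becomes NA, which makes missing block means easy to spot.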

This should work:

newdf <- merge(x=mdf, y=blockdf, by="block", suffixes = c("",".blockmean")) 
newdf$newvr <- newdf$yld-newdf$yld.blockmean 
print(newdf, row.names=FALSE) 
 block sn var yld yld.blockmean newvr
     1  1   1   1            10    -9
     1  2   2   2            10    -8
     1  3   3   3            10    -7
     1  4   4   4            10    -6
     1  5   5   5            10    -5
     1  6   6   6            10    -4
     1  7   7   7            10    -3
     1  8   8   8            10    -2
     1  9   9   9            10    -1
     1 10  10  10            10     0
     2 11   1  11            20    -9
     2 12   2  12            20    -8
...........................
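
One thing to keep in mind with merge() is that it does not promise to keep the rows in their original order. As a rough sketch (the names via_match and same are introduced here for illustration only), the result can be re-sorted by sn and checked against the match() lookup:

```r
mdf <- data.frame(sn = 1:40, var = rep(1:10, 4),
                  block = rep(1:4, each = 10), yld = 1:40)
blockdf <- data.frame(block = 1:4, yld = c(10, 20, 30, 40))

# Join the block means onto the large data frame, then subtract.
newdf <- merge(mdf, blockdf, by = "block", suffixes = c("", ".blockmean"))
newdf$newvr <- newdf$yld - newdf$yld.blockmean

# Restore the original row order using the serial number column.
newdf <- newdf[order(newdf$sn), ]

# Same subtraction done with a match() lookup, for comparison.
via_match <- mdf$yld - blockdf$yld[match(mdf$block, blockdf$block)]
same <- all.equal(newdf$newvr, via_match)
print(same)
```

By default merge() performs an inner join, so rows of mdf whose block is missing from blockdf are dropped; passing all.x = TRUE keeps them, with NA as the block mean.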