2014-02-25

Subsetting data and running a series of calculations in a for loop

C1  C2  C3  C4  C5 C6 C7 C8 
ATOM 1 -4.794 -7.29 6.756 C 12 1 
ATOM 1 -4.357 -6.181 6.473 O 16 1 
ATOM 2 -5.279 -7.475 5.986 C 12 1 
ATOM 2 -7.564 -8.809 6.984 C 12 1 
ATOM 2 -5.822 -7.105 7.238 C 12 1 
ATOM 1 -7.515 -10.402 -0.621 C 12 2 
ATOM 1 -7.26 -11.716 -0.22 O 16 2 
ATOM 1 -8.163 -9.682 0.566 C 12 2 
ATOM 2 -6.347 -9.475 -1.255 C 12 2 
ATOM 1 -7.302 -8.048 7.702 C 12 3 
ATOM 1 -7.676 -8.93 6.667 C 12 3 
ATOM 2 -6.864 -9.118 5.529 C 12 3 

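My goal is to subset the data by the value in column C8 and run a series of calculations on each subset using a loop. At the moment I am doing this by hand: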

sub.1 <- subset(data, C8 == 1)
result.1 <- within(sub.1, {
    multiply.z <- C5 * C7
    multiply.y <- C4 * C7
    multiply.x <- C3 * C7
    Center.z <- sum(multiply.z)/sum(C7)
    Center.y <- sum(multiply.y)/sum(C7)
    Center.x <- sum(multiply.x)/sum(C7)
    #rm(multiply.z,multiply.y,multiply.x)
})

sub.2 <- subset(data, C8 == 2)
# result.2 <- same within() block as above, applied to sub.2
sub.3 <- subset(data, C8 == 3)
# result.3 <- same within() block as above, applied to sub.3

I tried to automate the steps above with a for loop, but it does not work. Here is my code:

for (i in 1:max(data$C8)) {
    sub.i <- subset(data, C8 == i)
    result.i <- within(sub.i, {
        multiply.z <- C5 * C7
        multiply.y <- C4 * C7
        multiply.x <- C3 * C7
        Center.z <- sum(multiply.z)/sum(C7)
        Center.y <- sum(multiply.y)/sum(C7)
        Center.x <- sum(multiply.x)/sum(C7)
        #rm(multiply.z,multiply.y,multiply.x)
    })
}

I would appreciate any help or suggestions on how to solve this. Thanks in advance!


You are overwriting `result.i` on every iteration of the loop. – josliber
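
One way to act on this comment is to store each iteration's result in a list instead of reusing a single variable name. The following is only a sketch: the names `groups`, `results` and `result.all` are made up for illustration, and the data frame is rebuilt from the sample rows in the question so that the snippet runs on its own.

```r
# Sample data copied from the question.
data <- read.table(header = TRUE, text = "
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
")

# One list slot per distinct value of C8 (hypothetical helper names).
groups <- sort(unique(data$C8))
results <- vector("list", length(groups))
names(results) <- groups

for (i in seq_along(groups)) {
    sub.i <- subset(data, C8 == groups[i])
    # Store the result under its own index so nothing is overwritten.
    results[[i]] <- within(sub.i, {
        Center.z <- sum(C5 * C7) / sum(C7)
        Center.y <- sum(C4 * C7) / sum(C7)
        Center.x <- sum(C3 * C7) / sum(C7)
    })
}

# Optionally stack the per-group data frames back into one.
result.all <- do.call(rbind, results)
```

After the loop, `results[["2"]]` holds what the question calls `result.2`, and `result.all` contains every original row with the three center columns appended.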

Answers


You should definitely take a close look at library(plyr), which contains the tools you want for this kind of task. It lets you do this with the function ddply:

library(plyr) 

# 'data' is the data.frame from the question
mySubs <- ddply(data, .(C8), .fun = function(x) {
    # Weight each coordinate by C7 within the current subset
    multiply.z <- x$C5 * x$C7
    multiply.y <- x$C4 * x$C7
    multiply.x <- x$C3 * x$C7
    Center.z <- sum(multiply.z)/sum(x$C7)
    Center.y <- sum(multiply.y)/sum(x$C7)
    Center.x <- sum(multiply.x)/sum(x$C7)
    # One summary row per value of C8
    data.frame(z = Center.z, y = Center.y, x = Center.x)
})

This returns a data.frame (ddply takes a data.frame, splits it into subsets, and returns a data.frame, as the "dd" in its name indicates; this is the naming convention used by plyr functions).

> mySubs
  C8         z          y         x
1  1  6.674000  -7.297562 -5.487813
2  2 -0.370000 -10.426231 -7.316538
3  3  6.632667  -8.698667 -7.280667

And so on.
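
For readers who want to see what ddply is doing here, the same split-apply-combine steps can be sketched in base R with `split`, `lapply` and `rbind`. This is an illustrative equivalent, not how plyr is implemented; the names `pieces`, `centers` and `centers.df` are invented for the example, and `weighted.mean(x, w)` is used as shorthand for `sum(x * w)/sum(w)`.

```r
# Sample data copied from the question.
data <- read.table(header = TRUE, text = "
C1 C2 C3 C4 C5 C6 C7 C8
ATOM 1 -4.794 -7.29 6.756 C 12 1
ATOM 1 -4.357 -6.181 6.473 O 16 1
ATOM 2 -5.279 -7.475 5.986 C 12 1
ATOM 2 -7.564 -8.809 6.984 C 12 1
ATOM 2 -5.822 -7.105 7.238 C 12 1
ATOM 1 -7.515 -10.402 -0.621 C 12 2
ATOM 1 -7.26 -11.716 -0.22 O 16 2
ATOM 1 -8.163 -9.682 0.566 C 12 2
ATOM 2 -6.347 -9.475 -1.255 C 12 2
ATOM 1 -7.302 -8.048 7.702 C 12 3
ATOM 1 -7.676 -8.93 6.667 C 12 3
ATOM 2 -6.864 -9.118 5.529 C 12 3
")

# Split: one data.frame per distinct value of C8.
pieces <- split(data, data$C8)

# Apply: one summary row per piece, weighting each coordinate by C7.
centers <- lapply(pieces, function(x) {
    data.frame(C8 = x$C8[1],
               z = weighted.mean(x$C5, x$C7),
               y = weighted.mean(x$C4, x$C7),
               x = weighted.mean(x$C3, x$C7))
})

# Combine: stack the summary rows into a single data.frame.
centers.df <- do.call(rbind, centers)
```

`centers.df` has one row per value of C8 and the columns C8, z, y and x.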


Here is a data.table solution:

library(data.table) 

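# 'data' is the data.frame from the question
DT <- data.table(data)

# C7-weighted mean of each coordinate column, computed per value of C8
DT[,
    structure(
        lapply(list(C5, C4, C3), function(x) sum(x * C7)/sum(C7)),
        names = c("z", "y", "x")
    ),
    by = C8]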

##    C8         z          y         x
## 1:  1  6.674000  -7.297562 -5.487813
## 2:  2 -0.370000 -10.426231 -7.316538
## 3:  3  6.632667  -8.698667 -7.280667