2016-10-28 71 views
1

How do I format a describeBy table in R?

I have a data set like this:

Defects.I Defects.D  Treatment 
    1   2    A 
    1   3    B 

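I want descriptive statistics for the detected and isolated defects, grouped by treatment. After searching for a while, I found a nice function called describeBy() in the psych package. With the following code: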

describeBy(myData[1:2],myData$Treatment) 

I get output like this:

Treatment A   
        Mean. Median. Trimed. 
    Defects.I  x  x   x 
    Defects.D  x  x   x 

Treatment B   
        Mean. Median. Trimed. 
    Defects.I  x  x   x 
    Defects.D  x  x   x 

But what I was actually looking for is something like this:

    Mean. Median. Trimed. 
        A B  A B  A B 
    Defects.I  x x  x x  x x 
    Defects.D  x x  x x  x x 

Data

myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3, Treatment = c("A", 
"B")), .Names = c("Defects.I", "Defects.D", "Treatment"), class = "data.frame", row.names = c(NA, 
-2L)) 
+1

'l <- psych::describeBy(myData[1:2], myData$Treatment); do.call('cbind', l)[, order(sequence(lengths(l)))]' – rawr

+0

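@rawr That is exactly what I was looking for! Could you post it as an answer? It would be great if you added one or two comments explaining it :) –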

Answers

2

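Since describeBy returns a list of data frames, one per group, we can simply cbind them, but that does not give the column order you want. Instead, we can interleave the columns.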

myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3, 
         Treatment = c("A", "B")), 
        .Names = c("Defects.I", "Defects.D", "Treatment"), 
        class = "data.frame", row.names = c(NA, -2L)) 

l <- psych::describeBy(myData[1:2], myData$Treatment) 

So interleave the columns using this index

order(sequence(c(ncol(l$A), ncol(l$B)))) 
# [1] 1 14 2 15 3 16 4 17 5 18 6 19 7 20 8 21 9 22 10 23 11 24 12 25 13 26 

instead of what cbind alone would do, which keeps each group's 13 columns together:

c(1:13, 1:13) 
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 1 2 3 4 5 6 7 8 9 10 11 12 13 
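To see why order(sequence(...)) interleaves, here is a minimal sketch on a smaller, made-up case: two groups with three columns each instead of 13. The sizes and the variable names (n_cols, pos, idx) are assumptions for illustration only.

```r
# Toy case: two groups with three columns each (sizes are made up)
n_cols <- c(3, 3)

# Position of every column within its own group
pos <- sequence(n_cols)
pos
# [1] 1 2 3 1 2 3

# order() is stable, so ties keep their original order: the first column
# of every group comes first, then the second column of every group, etc.
idx <- order(pos)
idx
# [1] 1 4 2 5 3 6
```

Indexing the cbind result with idx therefore places column 1 of A next to column 1 of B, and so on.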

Putting it together:

do.call('cbind', l)[, order(sequence(lengths(l)))] 
#   A.vars B.vars A.n B.n A.mean B.mean A.sd B.sd A.median B.median A.trimmed B.trimmed A.mad B.mad 
# Defects.I  1  1 1 1  1  1 NA NA  1  1   1   1  0  0 
# Defects.D  2  2 1 1  2  3 NA NA  2  3   2   3  0  0 
#   A.min B.min A.max B.max A.range B.range A.skew B.skew A.kurtosis B.kurtosis A.se B.se 
# Defects.I  1  1  1  1  0  0  NA  NA   NA   NA NA NA 
# Defects.D  2  3  2  3  0  0  NA  NA   NA   NA NA NA 

Or as a function

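interleave <- function(l, how = c('cbind', 'rbind')) { 
    how <- match.arg(how) 
    if (how %in% 'rbind') 
        # stack the data frames, then alternate rows across groups 
        do.call(how, l)[order(sequence(sapply(l, nrow))), ] 
    else 
        # bind side by side, then alternate columns across groups 
        do.call(how, l)[, order(sequence(sapply(l, ncol)))] 
} 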

interleave(l) 
#   A.vars B.vars A.n B.n 
# Defects.I  1  1 1 1 
# Defects.D  2  2 1 1 ... 
# ... 

interleave(l, 'r') 
#    vars n mean sd median trimmed mad min max range skew kurtosis se 
# A.Defects.I 1 1 1 NA  1  1 0 1 1  0 NA  NA NA 
# B.Defects.I 1 1 1 NA  1  1 0 1 1  0 NA  NA NA 
# A.Defects.D 2 1 2 NA  2  2 0 2 2  0 NA  NA NA 
# B.Defects.D 2 1 3 NA  3  3 0 3 3  0 NA  NA NA 
+0

Thanks for your answer! Just one small question: is it possible to choose which statistics are displayed? –

+1

@p3rand0r It looks like some options are passed from 'describeBy' on to 'describe', so you could do 'describeBy(..., skew = FALSE)' to drop skew/kurtosis, but I would probably just do 'il <- interleave(l); il[, grep('me|tr', names(il))]' – rawr

+0

Great! Those are actually the ones I wanted to remove, thanks! –
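The grep idea from the comment above can be made a little stricter by matching whole statistic names rather than fragments such as 'me'. This is only a sketch: il here is a hand-built stand-in with made-up values, not real psych output, and keep, pattern, and sel are hypothetical names.

```r
# Toy stand-in for the interleaved table (values are made up, not psych output)
il <- data.frame(
  A.mean = c(1, 2),    B.mean = c(1, 3),
  A.sd = c(NA, NA),    B.sd = c(NA, NA),
  A.median = c(1, 2),  B.median = c(1, 3),
  A.trimmed = c(1, 2), B.trimmed = c(1, 3),
  row.names = c("Defects.I", "Defects.D")
)

# Statistics to keep; the pattern matches "<group>.<statistic>" at the end of the name
keep <- c("mean", "median", "trimmed")
pattern <- paste0("\\.(", paste(keep, collapse = "|"), ")$")

# drop = FALSE keeps a data frame even if only one column matches
sel <- il[, grepl(pattern, names(il)), drop = FALSE]
names(sel)
# [1] "A.mean"    "B.mean"    "A.median"  "B.median"  "A.trimmed" "B.trimmed"
```

Anchoring on the full name avoids picking up unrelated columns that merely contain the same letters.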

1

You could try the mat = TRUE argument. It is not exactly what you are looking for, but it is closer:

library(psych) 
mydata = data.frame(Defects.I = c(1,1), Defects.D = c(2,3), Treatment = c('A','B')) 

    describeBy(mydata[1:2], mydata$Treatment, mat = TRUE) 

  item group1 vars n mean sd median trimmed mad min max range skew kurtosis se 
Defects.I1 1  A 1 1 1 NA  1  1 0 1 1  0 NA  NA NA 
Defects.I2 2  B 1 1 1 NA  1  1 0 1 1  0 NA  NA NA 
Defects.D1 3  A 2 1 2 NA  2  2 0 2 2  0 NA  NA NA 
Defects.D2 4  B 2 1 3 NA  3  3 0 3 3  0 NA  NA NA 
+0

Thanks for your help, but as you can see, the format is still different; if possible, I would like the treatments to be at the top –