2016-11-11

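Shiny - Generating a cross table with the column totals and row totals as the first row/column

I want to generate a cross table with row and column totals. I tried generating it with the gmodels package, whose output looks better than that of the ordinary table function. The appearance of the table matters, because in the end it has to be displayed using Shiny. The problem is that I get the column totals and row totals at the end of the rows and columns. How can I get the totals as the first row and first column of the table?

Below is a sample of my data.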

library(data.table)

Location <- sample(c("location A","location B","location C","location D","location E"),20,replace = TRUE)
Brand <- sample(c("Brand A","Brand B","Brand C"),20,replace = TRUE)
Year <- rep(c("Year 2014","Year 2015"),10)
Q1 <- sample(1:5,20,replace = TRUE)
Q2 <- sample(1:5,20,replace = TRUE)

mydata <- as.data.table(cbind(Location,Brand,Year,Q1,Q2))

The data is huge, which is why it is a data.table. The code I use to generate the cross table is:

library("gmodels") 

mydata[,CrossTable(Location,Brand,prop.c = T,prop.r = F,prop.t = F,prop.chisq = F,chisq = F,format = "SPSS")] 

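This produces output, but the totals appear at the end of the rows and at the end of the columns. The column % is also missing for the total column. How can I get the totals as the first row and first column, with the % included as well?

Please suggest a way out.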


You probably don't want `cbind` here. Look at `str(mydata)` and notice that all the columns have been coerced to string/character type. Maybe you want `reshape2::dcast(mydata, Location ~ Brand, margins = TRUE)` here? – Frank
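To illustrate the coercion described in this comment, here is a minimal sketch comparing `cbind` with building the table column by column. The `set.seed` call and the names `via_cbind` and `direct` are assumptions added for illustration; they are not part of the original post.

```r
library(data.table)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
Year <- rep(c("Year 2014", "Year 2015"), 10)
Q1 <- sample(1:5, 20, replace = TRUE)

# cbind() builds a character matrix first, so Q1 ends up as text
via_cbind <- as.data.table(cbind(Year, Q1))

# data.table() keeps each column's own type
direct <- data.table(Year = Year, Q1 = Q1)

class(via_cbind$Q1)  # "character"
class(direct$Q1)     # "integer"
```

With the column-wise version, numeric summaries such as `mean(direct$Q1)` work without converting back from character.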


Since `CrossTable` returns NULL, your only option is to modify its source code to suit your needs. –

Answers

0

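Have you tried the sjPlot package? It has a very nice function, sjt.xtab, which produces cross tables (contingency tables) similar to what you are looking for. It has a lot of options to explore; I have used a few of them below. You can check ?sjt.xtab to see the other available options. The code below generates table output with column percentages and with a total column and row.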

library(sjPlot)

sjt.xtab(mydata$Location, mydata$Brand,
         show.col.prc = TRUE,
         show.summary = FALSE,
         show.na = FALSE,
         wrap.labels = 50,
         tdcol.col = "#f90470",
         emph.total = TRUE,
         emph.color = "#3aaee5",
         use.viewer = TRUE,
         CSS = list(css.table = "border: 1px solid;",
                    css.tdata = "border: 1px solid;"))

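I had already found the sjPlot package and used it for this case. Yes, it is quite useful and meets the requirements; the only thing is that it does not put the totals in the first row and first column. Still, it is better than the other table outputs. I forgot to post it as an answer, so thank you for posting it. – user1412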

0

Maybe something like this would do?

library(data.table)

myCT <- function(mydata) {
    # Counts of Location x Brand, plus a row total in column "all"
    mydata_ct_n <- dcast.data.table(mydata, Location ~ Brand, margins = TRUE)
    mydata_ct_n[, all := rowSums(.SD), by = Location]
    # Prepend a row of column totals and label it "all"
    mydata_ct_n <- rbind(mydata_ct_n[, lapply(.SD, sum), .SDcols = 2:ncol(mydata_ct_n)], mydata_ct_n, fill = TRUE)
    mydata_ct_n$Location[1] <- "all"
    # Move the totals column and the label column to the front
    foocols <- c("all", "Location")
    setcolorder(mydata_ct_n, c(foocols, setdiff(colnames(mydata_ct_n), foocols)))

    # Column percentages: each cell divided by its column total (row 1)
    mydata_ct_p <- copy(mydata_ct_n)
    for (j in 3:ncol(mydata_ct_p)) {
        set(mydata_ct_p, j = j, value = as.numeric(mydata_ct_p[[j]]))
        set(mydata_ct_p, i = 2:nrow(mydata_ct_p), j = j, value = round(100 * mydata_ct_p[2:nrow(mydata_ct_p), j, with = FALSE]/mydata_ct_p[[j]][1], 0))
    }
    # Totals row: each column total as a share of the grand total
    set(mydata_ct_p, 1L, 3L:ncol(mydata_ct_p), round(100 * mydata_ct_p[1L, 3L:ncol(mydata_ct_p), with = FALSE]/mydata_ct_p[["all"]][1], 0))

    # Combine into "percent% (count)" strings
    for (j in 3:ncol(mydata_ct_p)) {
        set(mydata_ct_p, j = j, value = as.character(mydata_ct_p[[j]]))
        set(mydata_ct_n, j = j, value = as.character(mydata_ct_n[[j]]))
        set(mydata_ct_p, j = j,
            value = paste0(mydata_ct_p[[j]], "% (", mydata_ct_n[[j]], ")"))
    }
    return(mydata_ct_p)
}

Location <- sample(c("location A","location B","location C","location D","location E"),20,replace = T) 
Brand <- sample(c("Brand A","Brand B","Brand C"),20,replace = T) 
Year <- rep(c("Year 2014","Year 2015"),10) 
Q1 <- sample(1:5,20,replace = T) 
Q2 <- sample(1:5,20,replace = T) 
mydata <- as.data.table(cbind(Location,Brand,Year,Q1,Q2)) 

out <- myCT(mydata) 
print(out) 
#    all   Location Brand A Brand B Brand C
# 1:  20        all 30% (6) 35% (7) 35% (7)
# 2:   3 location A  0% (0) 43% (3)  0% (0)
# 3:   5 location B 33% (2) 14% (1) 29% (2)
# 4:   5 location C 50% (3)  0% (0) 29% (2)
# 5:   4 location D 17% (1) 29% (2) 14% (1)
# 6:   3 location E  0% (0) 14% (1) 29% (2)

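This is interesting. However, I need to generate this kind of cross-table output for quite a few variables, around 40-50 such outputs. That would mean a lot of duplicated code, which is why I am looking for a package, or for a solution within the gmodels package. – user1412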


Not if you make it a function ;), let me edit :) –
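As a rough sketch of the "make it a function" idea, one table-building call can be applied to a vector of column names with `lapply`, so 40-50 tables do not need 40-50 copies of the code. The data frame `survey`, the vector `col_vars`, and the seed are hypothetical names introduced for illustration; base R's `addmargins` stands in for whatever table function is actually used.

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible
survey <- data.frame(
  Location = sample(c("location A", "location B", "location C"), 20, replace = TRUE),
  Brand    = sample(c("Brand A", "Brand B", "Brand C"), 20, replace = TRUE),
  Year     = rep(c("Year 2014", "Year 2015"), 10),
  stringsAsFactors = FALSE
)

# Hypothetical list of variables to cross against Location
col_vars <- c("Brand", "Year")

# One cross table (with "Sum" margins) per variable, stored in a named list
tables <- lapply(setNames(col_vars, col_vars), function(v) {
  addmargins(table(survey$Location, survey[[v]]))
})

names(tables)
tables$Year
```

Each element of `tables` can then be printed or rendered on its own, for example by looking it up by name.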


As mentioned, I want to generate this in Shiny. This code does not produce the required output in Shiny. – user1412
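One possible way to get the totals into the first row and first column and display the result in Shiny is to reorder the output of base R's `addmargins` and pass it to `renderTable`. This is only a sketch under stated assumptions: the helper name `totals_first`, the output id `crosstab`, and the seed are hypothetical, and it shows counts only, not percentages.

```r
library(shiny)

# Hypothetical helper: cross table with the totals moved to row 1 / column 1
totals_first <- function(rows, cols) {
  m <- unclass(addmargins(table(rows, cols)))  # "Sum" margins are appended last
  nr <- nrow(m)
  nc <- ncol(m)
  # Reorder so the last row and last column come first
  m <- m[c(nr, seq_len(nr - 1)), c(nc, seq_len(nc - 1)), drop = FALSE]
  rownames(m)[1] <- "Total"
  colnames(m)[1] <- "Total"
  as.data.frame(m)
}

set.seed(123)  # assumption: seed added only so the sketch is reproducible
Location <- sample(c("location A", "location B", "location C"), 20, replace = TRUE)
Brand <- sample(c("Brand A", "Brand B", "Brand C"), 20, replace = TRUE)

ui <- fluidPage(tableOutput("crosstab"))

server <- function(input, output, session) {
  # rownames = TRUE shows the Location labels; digits = 0 prints whole counts
  output$crosstab <- renderTable(totals_first(Location, Brand),
                                 rownames = TRUE, digits = 0)
}

# Launch only when run interactively
if (interactive()) shinyApp(ui, server)
```

Because `totals_first` returns an ordinary data frame, it can be tested in the console before it is wired into the app.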