R: plotting multiple-choice questionnaire data with ggplot

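I have a Qualtrics multiple-choice question that I want to use to create a chart in R. My data is organized so that each participant can select several answers per question. For example, participant 1 selected answer options 1 (Q1_1) and 3 (Q1_3). I want to collapse all answer options into a single bar chart with one bar per answer option (Q1_1:Q1_3), where each bar is the count for that option divided by the number of respondents who answered the question (3 in this example).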

df <- structure(list(Participant = 1:3, A = c("a", "a", ""), B = c("", "b", "b"), C = c("c", "c", "c")), .Names = c("Participant", "Q1_1", "Q1_2", "Q1_3"), row.names = c(NA, -3L), class = "data.frame")

I would like to use ggplot2, perhaps with some kind of loop over Q1_1:Q1_3?

Source

2016-10-04 lmcshane

Maybe this is what you want:

f <- 
    structure(
    list(
     Participant = 1:3, 
     A = c("a", "a", ""), 
     B = c("", "b", "b"), 
     C = c("c", "c", "c")), 
    .Names = c("Participant", "Q1_1", "Q1_2", "Q1_3"), 
    row.names = c(NA, -3L), 
    class = "data.frame" 
) 


library(tidyr) 
library(dplyr) 
library(ggplot2) 

nparticipant <- nrow(f) 
f %>% 
    ## Reformat the data 
    gather(question, response, starts_with("Q")) %>% 
    filter(response != "") %>% 

    ## calculate the height of the bars 
    group_by(question) %>% 
    summarise(score = length(response)/nparticipant) %>% 

    ## Plot 
    ggplot(aes(x=question, y=score)) + 
    geom_bar(stat = "identity")
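
As a variation on the pipeline above, here is a minimal sketch that uses `pivot_longer()` (the newer replacement for `gather()`, assuming tidyr 1.0.0 or later) and divides by the number of respondents who ticked at least one option, rather than by `nrow(f)`. Treating "answered the question" as "selected at least one option" is an assumption of this sketch, and the names `long`, `n_answered`, and `scores` are only illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same example data as in the question
f <- data.frame(
  Participant = 1:3,
  Q1_1 = c("a", "a", ""),
  Q1_2 = c("", "b", "b"),
  Q1_3 = c("c", "c", "c"),
  stringsAsFactors = FALSE
)

# One row per participant/option pair; drop options that were not selected
long <- f %>%
  pivot_longer(starts_with("Q"), names_to = "question", values_to = "response") %>%
  filter(response != "")

# Assumption: a respondent "answered" if they selected at least one option
n_answered <- n_distinct(long$Participant)

# Share of answering respondents who selected each option
scores <- long %>%
  count(question, name = "n_selected") %>%
  mutate(score = n_selected / n_answered)

p <- ggplot(scores, aes(x = question, y = score)) +
  geom_col()
print(p)
```

With the example data every participant selected something, so this gives the same bar heights as dividing by `nrow(f)`; the two only differ when some participants skipped the question entirely.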

Source

2016-10-04 16:59:53 csiu

Here is a solution using ddply from the plyr package.

library(plyr)     # provides ddply() and summarize()
library(ggplot2)

# I needed to increase the number of participants to ensure it works in every case
df = data.frame(Participant = 1:100,
Q1_1 = sample(c("a", ""), 100, replace = TRUE, prob = c(1/2, 1/2)),
Q1_2 = sample(c("b", ""), 100, replace = TRUE, prob = c(2/3, 1/3)),
Q1_3 = sample(c("c", ""), 100, replace = TRUE, prob = c(1/3, 2/3)))

# Combine the selected options into one string per participant, e.g. "ac"
df$answer = paste0(df$Q1_1, df$Q1_2, df$Q1_3)

# Share of participants for each combination of answers
summ = ddply(df, c("answer"), summarize, freq = length(answer)/nrow(df))

## Re-ordering of factor levels of summ$answer
summ$answer <- factor(summ$answer, levels=c("", "a", "b", "c", "ab", "ac", "bc", "abc"))

# Plot
ggplot(summ, aes(answer, freq, fill = answer)) + geom_bar(stat = "identity") + theme_bw()

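Note: if you have more columns for other questions ("Q2_1", "Q2_2", ...), this may get more complicated. In that case, handling the data for each question separately may be a solution.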
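
One possible way to handle several questions at once is to reshape first and then group by question. The following is only a sketch under assumptions: the data frame `survey` and its `Q2_*` columns are made up for illustration, column names are assumed to follow the `Q<question>_<option>` pattern, and it relies on tidyr 1.0.0 or later and dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical data with two multi-select questions (Q1 and Q2)
survey <- data.frame(
  Participant = 1:3,
  Q1_1 = c("a", "a", ""),
  Q1_2 = c("", "b", "b"),
  Q2_1 = c("x", "", ""),
  Q2_2 = c("y", "y", ""),
  stringsAsFactors = FALSE
)

# Split "Q1_2" into question = "Q1" and option = "2"; keep selected options only
long <- survey %>%
  pivot_longer(
    -Participant,
    names_to = c("question", "option"),
    names_sep = "_",
    values_to = "response"
  ) %>%
  filter(response != "")

# Per question: selections of each option divided by the number of
# participants who selected at least one option of that question
props <- long %>%
  group_by(question) %>%
  mutate(n_answered = n_distinct(Participant)) %>%
  group_by(question, option) %>%
  summarise(prop = n() / first(n_answered), .groups = "drop")

# One panel per question
p <- ggplot(props, aes(x = option, y = prop)) +
  geom_col() +
  facet_wrap(~ question)
print(p)
```

Here participant 3 skipped Q2, so the Q2 proportions are computed out of 2 respondents while the Q1 proportions are computed out of 3.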

Source

2016-10-04 15:50:07 bVa

Thanks, corrected. – bVa

I think you want something like this (proportions as a stacked bar chart):

  Participant Q1_1 Q1_2 Q1_3
1           1    a         c
2           2    a    a    c
3           3    c    b    c
4           4         b    d
library(reshape2) # provides melt()
library(ggplot2)

# df is the data frame printed above
# ensure that all question columns have the same factor levels, ignore blanks
for (i in 2:4) { 
    df[,i] <- factor(df[,i], levels = c(letters[1:4])) 
} 

tdf <- as.data.frame(sapply(df[2:4], function(x)table(x)/sum(table(x)))) 
tdf$choice <- rownames(tdf) 
tdf <- melt(tdf, id='choice') 

ggplot(tdf, aes(variable, value, fill=choice)) + 
     geom_bar(stat='identity') + 
     xlab('Questions') + 
     ylab('Proportion of Choice')
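
To see why the blanks are ignored in the code above, here is a small base R sketch on a made-up vector `x`: a value that is not among the declared factor levels becomes `NA`, and `table()` leaves `NA` out by default. `prop.table()` is used here as an equivalent of `table(x)/sum(table(x))`.

```r
# Hypothetical answers for one question; the last respondent left it blank
x <- factor(c("a", "a", "c", ""), levels = letters[1:4])

# "" is not a declared level, so it is stored as NA
is.na(x)

# table() drops NA by default but keeps unused levels (b and d) with count 0
counts <- table(x)

# Proportions among non-blank answers; same as counts / sum(counts)
props <- prop.table(counts)
props
```

Because the proportions are taken over non-blank answers only, each question's stacked bar sums to 1 regardless of how many respondents left it blank.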

Source

2016-10-04 17:46:52
