Chi-square test on a data frame with wide-format data

I have data that looks like this:

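ID  gamesAlone  gamesWithOthers  gamesRemotely  tvAlone  tvWithOthers  tvRemotely
1   1           NA               NA             NA       1             NA
2   NA          NA               1              NA       1             NA
3   NA          NA               1              1        NA            NA
4   NA          NA               1              1        NA            NA
5   NA          NA               1              NA       1             NA
6   NA          NA               1              1        NA            NA
7   NA          NA               1              1        NA            NA
8   NA          1                NA             NA       1             NA
9   1           NA               NA             NA       NA            1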

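I would like code that does the following two things:

First, transform this into a tidy contingency table like this:

       Alone  WithOthers  Remotely
games  2      1           6
tv     4      4           1

Second, use a chi-square test to see whether these activities (games vs. TV) differ in their social context.

Here is the code to generate the data frame: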

data<-data.frame(ID=c(1,2,3,4,5,6,7,8,9), 
      gamesAlone=c(1,NA,NA,NA,NA,NA,NA,NA,1), 
      gamesWithOthers=c(NA,NA,NA,NA,NA,NA,NA,1,NA), 
      gamesRemotely=c(NA,1,1,1,1,1,1,NA,NA), 
      tvAlone=c(NA,NA,1,1,NA,1,1,NA,NA), 
      tvWithOthers=c(1,1,NA,NA,1,NA,NA,1,NA), 
      tvRemotely=c(NA,NA,NA,NA,NA,NA,NA,NA,1))

Source

2017-08-08 mob

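Drop the first column, ID ([-1]), then take the sum of each column (colSums) while removing NA values (na.rm=TRUE), and put the resulting length-6 vector into a matrix with 2 rows, filled row by row (byrow=TRUE). If needed, you can also label the matrix dimensions accordingly (the dimnames argument):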

m <- matrix(
    colSums(data[-1], na.rm = TRUE),   # column totals, ignoring NA
    nrow = 2, byrow = TRUE,            # row 1 = games, row 2 = tv
    dimnames = list(c("games", "tv"), c("alone", "withOthers", "remotely"))
)
m
#       alone withOthers remotely
# games     2          1        6
# tv        4          4        1
chisq.test(m)
#
#   Pearson's Chi-squared test
#
# data:  m
# X-squared = 6.0381, df = 2, p-value = 0.04885
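
One caveat worth checking with counts this small: every expected cell count here is below 5 (3, 2.5 and 3.5 in each row), so chisq.test() warns that the chi-squared approximation may be incorrect. The following is a minimal sketch, not part of the original answer, that rebuilds the same matrix, inspects the expected counts, and runs Fisher's exact test as one possible alternative. Whether Fisher's test suits your real data is an assumption you would need to check.

```r
# Same matrix as above, rebuilt by hand so the snippet runs on its own.
m <- matrix(
    c(2, 1, 6,
      4, 4, 1),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("games", "tv"), c("alone", "withOthers", "remotely"))
)

# Expected counts under independence. The warning is suppressed here only
# because we inspect the small expected counts explicitly.
res <- suppressWarnings(chisq.test(m))
res$expected

# Fisher's exact test does not rely on the large-sample approximation.
ft <- fisher.test(m)
ft$p.value
```

If the expected counts in your real data are comfortably larger, the plain chisq.test(m) call is the simpler choice.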

Source

2017-08-08 07:45:30 lukeA

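This will get you the contingency table in the form you gave. A suggestion: call the data frame data1 rather than data, to avoid confusion with the built-in data() function.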

library(dplyr) 
library(tidyr) 
data1_table <- data1 %>% 
    gather(key, value, -ID) %>% 
    mutate(activity = ifelse(grepl("^tv", key), substring(key, 1, 2), substring(key, 1, 5)), 
     context = ifelse(grepl("^tv", key), substring(key, 3), substring(key, 6))) %>% 
    group_by(activity, context) %>% 
    summarise(n = sum(value, na.rm = TRUE)) %>% 
    ungroup() %>% 
    spread(context, n) 

# A tibble: 2 x 4
  activity Alone Remotely WithOthers
* <chr>    <dbl>    <dbl>      <dbl>
1 games        2        6          1
2 tv           4        1          4
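
gather() and spread() still work but are marked as superseded in current tidyr. As a rough sketch, assuming tidyr 1.0.0 or later, the same reshaping could be written with pivot_longer() and pivot_wider(). The names_pattern regular expression below is my own assumption about how to split the column names into activity and context; it is not from the original answer.

```r
library(dplyr)
library(tidyr)

# The question's data frame, named data1 as suggested above.
data1 <- data.frame(
    ID              = 1:9,
    gamesAlone      = c(1, NA, NA, NA, NA, NA, NA, NA, 1),
    gamesWithOthers = c(NA, NA, NA, NA, NA, NA, NA, 1, NA),
    gamesRemotely   = c(NA, 1, 1, 1, 1, 1, 1, NA, NA),
    tvAlone         = c(NA, NA, 1, 1, NA, 1, 1, NA, NA),
    tvWithOthers    = c(1, 1, NA, NA, 1, NA, NA, 1, NA),
    tvRemotely      = c(NA, NA, NA, NA, NA, NA, NA, NA, 1)
)

data1_table <- data1 %>%
    # Split each column name into an activity prefix and a context suffix,
    # dropping the NA cells so that each remaining row is one observation.
    pivot_longer(
        -ID,
        names_to = c("activity", "context"),
        names_pattern = "^(games|tv)(.*)$",
        values_drop_na = TRUE
    ) %>%
    # Count observations per activity/context pair.
    count(activity, context) %>%
    # One column per context; fill combinations that never occur with 0.
    pivot_wider(names_from = context, values_from = n, values_fill = 0L)

data1_table
```

Counting rows works here only because every non-missing value is 1; if the cells held larger counts, summing the values would be the safer choice.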

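As for the chi-square test: it depends on what you want to compare, and I assume your real data has higher counts. You can pipe the whole table into chisq.test like this, but I don't think the result is very informative: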

data1_table %>% 
    select(2:4) %>% 
    chisq.test()

Source

2017-08-08 06:02:53 neilfws
