2016-05-06

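R binomial test of preference in a data frame

This is a Coursera course that expects us to program in R without any prior R experience. I am really struggling to understand it and have no leads. I even went through basic R tutorials, but I am still lost.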

We have a CSV file with the following contents:

  • Subject: 30 subjects
  • Disability: 0, 1
  • Pref: trackball, touchpad

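For people without disabilities, perform a binomial test to see whether their preference for touchpads differs significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: run a binomial test comparing the number of people without disabilities who prefer touchpads to the total number of rows for people without disabilities. There are two possible preferences, touchpad and trackball, so the chance probability is 1/2. Do not correct for multiple comparisons; treat this as a single test on a subset of the data.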

The solution is supposed to be:

  • First, gain some intuition by plotting the preferences of the people without disabilities:

    plot(df[df$Disability == "0",]$Pref) 
    
  • Second, test the preference for touchpads over trackballs against chance, which would mean no preference:

    binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"), 
          nrow(df[df$Disability == "0",]), p=1/2) 
    plot(df[df$Disability == "0",]$Pref) 
    

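I understand that this should give us a visual representation of the preferences for Disability = 0, but I get an error involving df and I don't know how to fix it. Can anyone help?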

+2

It would be better if you provided the data you are working with, so that we can reproduce your code. Try 'dput', or upload the CSV somewhere and post a link.

+3

Please also add the error message to the question.

+0

Thanks for your help! I just figured out that I needed to replace "df" with the name of the xtab I had built. Files: https://www.dropbox.com/s/rd796wor7by5uky/DesignExperiments_R.Rproj?dl=0 https://www.dropbox.com/s/cig2u4d5vpkjma1/deviceprefs.csv?dl=0 – testimo

Answer

0

I simulated a random dataset with the given characteristics, and everything worked just fine:

df <- data.frame(Subject = c("Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8", "Sub9", "Sub10", "Sub11", "Sub12", "Sub13", "Sub14", "Sub15", "Sub16", "Sub17", "Sub18", "Sub19", "Sub20", "Sub21", "Sub22",  "Sub23", "Sub24", "Sub25", "Sub26", "Sub27", "Sub28", "Sub29", "Sub30"), 
       Disability = c("0", "0", "1", "1", "1", "1", "0", "0", "0", "1", "1", "0", "0", "0", "0", "1", "0", "0", "1", "0", "0", "0", "0", "1", "1", "1", "0", "0", "1", "0"), 
       Pref = c("touchpad", "touchpad", "touchpad", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "touchpad", "trackball", "trackball", "touchpad", "touchpad", "trackball", "touchpad", "trackball", "touchpad", "touchpad", "trackball", "touchpad", "touchpad", "touchpad", "touchpad", "touchpad", "trackball", "trackball")) 

The result of the given command is the following:

binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"), 
      nrow(df[df$Disability == "0",]), p=1/2) 

    Exact binomial test 

data: sum(df[df$Disability == "0", ]$Pref == "touchpad") and nrow(df[df$Disability == "0", ]) 
number of successes = 8, number of trials = 18, p-value = 0.8145 
alternative hypothesis: true probability of success is not equal to 0.5 
95 percent confidence interval: 
0.2153015 0.6924283 
sample estimates: 
probability of success 
      0.4444444 
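
Since the question asks for the p-value to four digits, it may be easier to pull the value out of the test result than to read it off the printed output. A minimal sketch, using the counts from the simulated data above (8 touchpad preferences among 18 subjects without disabilities); the variable names are my own and the counts are not those of the real course data:

```r
# Counts taken from the simulated example above, not from the real course file.
n_touchpad <- 8
n_rows <- 18

# binom.test() returns a list-like "htest" object; p.value is one of its elements.
result <- binom.test(n_touchpad, n_rows, p = 1/2)

# Round to four digits, as the assignment asks.
p_rounded <- round(result$p.value, 4)
print(p_rounded)
```

For these counts the rounded value agrees with the 0.8145 shown in the printed output.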

Edit

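To apply the same test to the real data (the files linked in the comments), the first step should be replaced by a command that reads the values stored in the actual file into a data frame: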

df <- read.csv("deviceprefs-1.csv") 

Beyond that, the given command for the binomial test works just fine with the real dataset.
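
Here is a sketch of the whole sequence on a file, assuming the CSV has the columns Subject, Disability and Pref described in the question. The tiny file written here is a made-up stand-in so that the snippet runs on its own; with the real file, only the path passed to read.csv would change. The names csv_path, nondisabled and pref_counts are my own.

```r
# Made-up stand-in for the course file; the real data will differ.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Subject,Disability,Pref",
             "1,0,touchpad",
             "2,0,trackball",
             "3,1,trackball",
             "4,0,touchpad",
             "5,1,touchpad",
             "6,0,touchpad"), csv_path)

df <- read.csv(csv_path)

# Subset once; comparing with 0 works whether Disability was read as a number or as text.
nondisabled <- df[df$Disability == 0, ]

# Tabulate the preferences; table() handles character and factor columns alike.
pref_counts <- table(nondisabled$Pref)
print(pref_counts)

# Binomial test of touchpad preference against a chance probability of 1/2.
test <- binom.test(sum(nondisabled$Pref == "touchpad"), nrow(nondisabled), p = 1/2)
print(round(test$p.value, 4))
```

Subsetting once keeps the plot and the test on the same rows, and barplot(pref_counts) would draw the counts if plot() complains about a character column.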

+0

Thanks for trying, @vincent-guillemot. I submitted p-value = 0.8145 as the answer, and the test says it is incorrect. – testimo

+1

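I think you misunderstood my answer: when I said "simulated", I meant that I randomly generated some data, so the p-value does not match the one for your data.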
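
To illustrate why a p-value from simulated data should not be submitted, here is a rough sketch that generates several random datasets with the same shape (30 subjects, two disability levels, two preferences) and runs the same test on each. The helper simulate_p and the choice of seeds are hypothetical; this is not how the data in the answer were produced.

```r
# Hypothetical helper: build one random dataset and return the test's p-value.
simulate_p <- function(seed) {
  set.seed(seed)
  sim <- data.frame(
    Subject = paste0("Sub", 1:30),
    Disability = sample(c("0", "1"), 30, replace = TRUE),
    Pref = sample(c("touchpad", "trackball"), 30, replace = TRUE)
  )
  nd <- sim[sim$Disability == "0", ]
  binom.test(sum(nd$Pref == "touchpad"), nrow(nd), p = 1/2)$p.value
}

# Each seed yields a different random dataset, and so its own p-value.
p_values <- sapply(1:5, simulate_p)
print(round(p_values, 4))
```

Only the p-value computed from the actual deviceprefs.csv is meaningful for the assignment.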