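I want to simulate some missing data in R but am having trouble. I have created two variables ("pre" and "post") that represent measurements on the same person before and after treatment (i.e. paired data). I have been able to do this for data missing completely at random (MCAR), see below, but I can't figure out how to code it for missing at random (MAR). For the MAR missing data, I want to create 3 categories based on the pre-treatment observations, which determine what proportion of post-treatment observations are missing. In short, how do I simulate MAR missing data in R so that:
For pre > 25, 40% of post values are missing
For pre > 21 and ≤ 25, 30% of post values are missing
For pre ≤ 21, 20% of post values are missing
Can anyone help? (I would be very grateful!)
Thanks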
library(mvtnorm) # Provides rmvnorm()
set.seed(80122)
n <- 1000
# Simulate 1000 people with high pre-treatment (mean 28, sd 3) and normal (mean 18, sd 3) post-treatment. Correlation between paired data = 0.7.
data <- rmvnorm(n,mean=c(28,18),sigma=matrix(c(9,0.7*sqrt(81),0.7*sqrt(81),9),2,2)) # Covariance matrix
# Split into pre and post treatment and check correlation is what was specified
pre <- data[, 1]
post <- data[, 2]
cor.test(pre,post)
# Simulate MCAR
mcar <- 1 - rbinom(n, 1, 0.2) # Creates ~20% zeros, which flag the post values to set to NA
post_mcar <- post
post_mcar[mcar == 0] <- NA # Set the flagged post values to NA
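
For the MAR case described above, one possible approach is to give each person a missingness probability that depends only on their observed pre value, then draw one Bernoulli indicator per person. The following is a minimal sketch under some assumptions, not a definitive implementation: the names `p_miss`, `miss`, `post_mar` and `pre_cat` are made up for illustration, and `pre` and `post` are rebuilt with base R so the block runs on its own (the vectors from the `rmvnorm()` code above should work just as well).

```r
set.seed(80122)
n <- 1000

# Stand-in paired data built with base R only (assumption: same means, SDs and
# correlation of 0.7 as in the question)
pre  <- rnorm(n, mean = 28, sd = 3)
post <- 18 + 0.7 * (pre - 28) + rnorm(n, mean = 0, sd = 3 * sqrt(1 - 0.7^2))

# Missingness probability depends only on the observed pre value (MAR)
p_miss <- ifelse(pre > 25, 0.4, ifelse(pre > 21, 0.3, 0.2))

# One Bernoulli draw per person; 1 means "post is missing"
miss <- rbinom(n, 1, p_miss)

post_mar <- post
post_mar[miss == 1] <- NA

# Check the realised missingness rate within each pre category
pre_cat <- cut(pre, breaks = c(-Inf, 21, 25, Inf),
               labels = c("<=21", "21-25", ">25"))
tapply(is.na(post_mar), pre_cat, mean)
```

Because the indicators are random draws, the realised proportions will only be approximately 40%, 30% and 20%. With a pre-treatment mean of 28, most people fall in the "> 25" category, so the lower two categories contain few observations and their rates will be noisier.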