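I am having trouble creating an MDS plot in edgeR that uses color to distinguish the experimental (leukemia) and control (healthy donor) groups.
I use htseq-count files as input to edgeR. Each file has two columns: gene_ID and read count. "A" stands for leukemia patients and "H" stands for healthy donors.
Here is my code:
Create a sample table: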
samples <- matrix(c("A18.txt","experiment","blood_exp",
"A19.txt","experiment","blood_exp",
"A20.txt","experiment","blood_exp",
"A23.txt","experiment","blood_exp",
"A24.txt","experiment","blood_exp",
"A26.txt","experiment","blood_exp",
"A30.txt","experiment","blood_exp",
"A37.txt","experiment","blood_exp",
"H11.txt","control","blood_control",
"H12.txt","control","blood_control",
"H13.txt","control","blood_control",
"H15.txt","control","blood_control",
"H16.txt","control","blood_control",
"H17.txt","control","blood_control",
"H18.txt","control","blood_control",
"H19.txt","control","blood_control"),
nrow = 16, ncol = 3, byrow = TRUE, dimnames = list(c(1:16), c("library_name","condition","group_ALL_vs_control")))
# The argument is spelled stringsAsFactors; keep the file names as plain character strings
samples <- as.data.frame(samples, stringsAsFactors = FALSE)
Use the edgeR function readDGE to read in the count files created by htseq-count:
counts <- readDGE(samples$library_name, path = 'C:/Users/okbm4/Desktop/htseq_files', columns=c(1,2), group = samples$group_ALL_vs_control, header = FALSE)
colnames(counts) <- samples$library_name
Filter out weakly expressed and non-informative (e.g. ambiguous) features:
noint <- rownames(counts) %in% c('__no_feature','__ambiguous','__too_low_aQual','__not_aligned','__alignment_not_unique')
cpms <- cpm(counts)
keep <- rowSums (cpms > 1) >= 4 & !noint
counts <- counts[keep,]
Create a DGEList object:
counts <- DGEList(counts=counts,group = samples$group_ALL_vs_control)
Estimate the normalization factors, which scale the library sizes:
counts <- calcNormFactors(counts)
Inspect the relationships between samples after normalization using an MDS plot:
pdf(file = 'HCB_ALL.pdf', width = 9, height = 6)
# Set text sizes before plotting; par() called after the plot does not change it
par(cex.axis = 0.6, cex.lab = 0.6, cex.main = 1)
plotMDS(counts, labels = c('A18.txt','A19.txt','A20.txt','A23.txt','A24.txt','A26.txt','A30.txt','A37.txt','H11.txt','H12.txt','H13.txt','H15.txt','H16.txt','H17.txt','H18.txt','H19.txt'),
xlab = 'Dimension 1',
ylab = 'Dimension 2',
asp = 6/9,
cex = 0.8,
main = 'Multidimensional scaling plot')
# Close the PDF device so the file is actually written
dev.off()
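I would be glad to hear any suggestions.
Please consider whether all of this code is really necessary to demonstrate what you want to achieve (coloring some of the points).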
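A reduced sketch of the coloring goal might look like the following. It assumes the samples are ordered as in the question (8 leukemia samples followed by 8 healthy donors) and that plotMDS forwards a `col` argument to the underlying plotting calls. The simulated counts, the sample names, the color choices, and the output file name are placeholders for illustration only, not the real htseq data.

```r
# Group labels in the same order as the sample columns
# (8 leukemia samples, then 8 healthy donors, as in the question).
group <- factor(rep(c("blood_exp", "blood_control"), each = 8),
                levels = c("blood_exp", "blood_control"))

# One color per group; the specific colors are an arbitrary choice.
palette_by_group <- c(blood_exp = "red", blood_control = "blue")
sample_colors <- unname(palette_by_group[as.character(group)])

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  set.seed(1)
  # Simulated counts stand in for the htseq files (hypothetical data).
  sim <- matrix(rnbinom(1000 * 16, mu = 50, size = 5), ncol = 16)
  rownames(sim) <- paste0("gene", 1:1000)
  colnames(sim) <- c(paste0("A", 1:8), paste0("H", 1:8))

  y <- DGEList(counts = sim, group = group)
  y <- calcNormFactors(y)

  # Write to a temporary file; the file name is a placeholder.
  pdf(file.path(tempdir(), "mds_colored.pdf"), width = 9, height = 6)
  # One color per sample, in column order.
  plotMDS(y, labels = colnames(y), col = sample_colors,
          main = "Multidimensional scaling plot")
  legend("topright", legend = levels(group),
         fill = palette_by_group[levels(group)])
  dev.off()
}
```

The key point is that the color vector has one entry per sample, in the same order as the columns of the DGEList, so it can be built directly from the group column of the sample table instead of being typed by hand.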