2014-04-17

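makeContrasts between two different data sets

I need to find differentially expressed genes (from microarrays) among 35 lines. The names of 30 lines start with RAL and the names of 5 lines start with ZI. I want to contrast the 30 RAL lines against the 5 ZI lines. Since I don't want to type all 150 contrasts by hand, I would like to use makeContrasts.

My data look like this: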

dput(sampletype) 

structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 
5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 
10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 
14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L, 18L, 18L, 18L, 
19L, 19L, 19L, 20L, 20L, 20L, 21L, 21L, 21L, 22L, 22L, 22L, 23L, 
23L, 23L, 24L, 24L, 24L, 25L, 25L, 25L, 26L, 26L, 26L, 27L, 27L, 
27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 30L, 31L, 31L, 32L, 
32L, 32L, 33L, 33L, 33L, 34L, 34L, 34L, 35L, 35L, 35L), .Label = c("RAL307", 
"RAL820", "RAL705", "RAL765", "RAL852", "RAL799", "RAL301", "RAL427", 
"RAL437", "RAL315", "RAL357", "RAL304", "RAL391", "RAL313", "RAL486", 
"RAL380", "RAL859", "RAL786", "RAL399", "RAL358", "RAL360", "RAL517", 
"RAL639", "RAL732", "RAL379", "RAL555", "RAL324", "RAL774", "RAL42", 
"RAL181", "ZI50N", "ZI186N", "ZI357N", "ZI31N", "ZI197N"), class = "factor") 

design.matrix <- model.matrix(~ 0 + sampletype) 

How can I get contrasts such as "RAL517-ZI50", "RAL852-ZI50", "RAL517-ZI42", "RAL852-ZI42"?

Is there any way I can do this?

This is from my sessionInfo():

> sessionInfo() 
R version 3.0.2 (2013-09-25) 
Platform: x86_64-apple-darwin10.8.0 (64-bit) 

locale: 
[1] C 

attached base packages: 
[1] parallel stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] gplots_2.12.1  reshape2_1.2.2  ggplot2_0.9.3.1 affy_1.38.1  vsn_3.28.0   Biobase_2.20.1  
[7] BiocGenerics_0.6.0 limma_3.16.8  

loaded via a namespace (and not attached): 
[1] BiocInstaller_1.10.4 KernSmooth_2.23-10 MASS_7.3-29   RColorBrewer_1.0-5 affyio_1.28.0   
[6] bitops_1.0-6   caTools_1.14   colorspace_1.2-4  dichromat_2.0-0  digest_0.6.3   
[11] gdata_2.13.2   grid_3.0.2   gtable_0.1.2   gtools_3.1.0   labeling_0.2   
[16] lattice_0.20-23  munsell_0.4.2   plyr_1.8    preprocessCore_1.22.0 proto_0.3-10   
[21] scales_0.2.3   stringr_0.6.2   tools_3.0.2   zlibbioc_1.6.0  

Thanks

Answers


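You have a class comparison problem between two classes. I suggest you read the user's guide of the Bioconductor limma package, a popular package for identifying differentially expressed genes (http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf). If you are using single-channel arrays, you can focus on Section 9.2.

In any case, you have to create a two-level factor for the comparison: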

# build the design matrix 

library(limma) 

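# one label per array, derived from the question's sampletype factor
# (c(rep("RAL", 30), rep("ZI", 5)) gives one label per line, which does not match replicated arrays)
yourfactor <- factor(ifelse(grepl("^RAL", as.character(sampletype)), "RAL", "ZI"))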
design <- model.matrix(~ 0 + yourfactor) 
colnames(design) <- gsub("yourfactor", "", colnames(design)) # to simplify the colnames of design 

# perform the comparison 


fit <- lmFit(data, design) # data is your gene expression matrix 
contrast.matrix <- makeContrasts(RAL-ZI, levels=design) 
fit2 <- contrasts.fit(fit, contrast.matrix) 
fit2 <- eBayes(fit2) 

# summarize the results of the linear model 
results <- topTable(fit2, number=nrow(data), adjust.method="BH") 
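If you really do want every individual RAL-versus-ZI contrast (the 150 pairs mentioned in the question) rather than one group comparison, the contrast names can be generated instead of typed. The following is only a sketch under assumptions: `line.names` is a shortened, illustrative subset of the levels of `sampletype`, three arrays per line is assumed, and the design has one column per line. The limma call is guarded so the string-building part runs even without limma installed.

```r
# Illustrative subset of the line names from the question (assumption: the real
# vector would be levels(sampletype), giving 30 x 5 = 150 pairs)
line.names <- c("RAL307", "RAL820", "RAL705", "ZI50N", "ZI186N")

ral <- grep("^RAL", line.names, value = TRUE)
zi  <- grep("^ZI",  line.names, value = TRUE)

# Every RAL-vs-ZI pair as a contrast string, e.g. "RAL307-ZI50N"
pairwise <- as.vector(outer(ral, zi, paste, sep = "-"))

# Toy per-line design: three arrays per line (assumption), one column per line
sampletype <- factor(rep(line.names, each = 3), levels = line.names)
design <- model.matrix(~ 0 + sampletype)
colnames(design) <- levels(sampletype)  # column names must match the contrast strings

# makeContrasts() accepts the strings through its 'contrasts' argument
if (requireNamespace("limma", quietly = TRUE)) {
  contrast.matrix <- limma::makeContrasts(contrasts = pairwise, levels = design)
}
```

The resulting `contrast.matrix` would then go into `contrasts.fit()` in the same way as the single RAL-ZI contrast, with the fit based on the per-line design rather than the two-level one.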

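Be careful that the samples in your expression matrix and the sample labels in your factor are in the same order. To avoid this kind of problem, I suggest creating an ExpressionSet object (http://www.bioconductor.org/packages/release/bioc/html/Biobase.html), which is very useful for handling gene expression data.

I hope this helps,

Best,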
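A quick way to check the ordering in plain R, before building the factor, might look like the sketch below. Everything in it is hypothetical toy data: the array names, the `pheno` sample sheet and its `array` and `line` columns are assumptions for illustration, not objects from the question.

```r
# Hypothetical toy expression matrix: 4 genes x 6 arrays
set.seed(1)
data <- matrix(rnorm(24), nrow = 4,
               dimnames = list(paste0("gene", 1:4),
                               c("RAL307_1", "RAL307_2", "ZI50N_1",
                                 "ZI50N_2", "RAL820_1", "RAL820_2")))

# Hypothetical sample sheet, deliberately in a different order than the columns
pheno <- data.frame(array = c("ZI50N_1", "ZI50N_2", "RAL307_1",
                              "RAL307_2", "RAL820_1", "RAL820_2"),
                    line  = c("ZI50N", "ZI50N", "RAL307",
                              "RAL307", "RAL820", "RAL820"),
                    stringsAsFactors = FALSE)

# Reorder the sample sheet so its rows follow the matrix columns
pheno <- pheno[match(colnames(data), pheno$array), ]

# Stop early if any array is missing from the sheet or the order still differs
stopifnot(!anyNA(pheno$array), identical(pheno$array, colnames(data)))

# Only now derive the two-level factor, one label per array
yourfactor <- factor(ifelse(grepl("^RAL", pheno$line), "RAL", "ZI"))
```

An ExpressionSet does this kind of bookkeeping for you by keeping the expression values and the sample annotation in one object.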

Matteo
