2012-07-03
3

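Reshaping data into CSV

I am working on a script that creates CSV files for analysis. When the script runs, it produces one CSV file for OPA 5701 and one for 6561. That is the only difference between the two parts of the script.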

##Samplesheet for GS0005701 
rows<-unique(samples$Sample_Name) 
samplesheet<-rows 
opa.panels<-sort(unique(samples$Pool_ID)) 
for (i in 1:length(opa.panels)){ 
    samps<-samples[samples$Pool_ID == opa.panels[i],] 
    idx<-match(samps$Sample_Name,rows) 
    samplesheet<-cbind(samplesheet,samps$Sentrix_ID[idx],samps$Sentrix_Position[idx]) 
} 
colnames(samplesheet)[2:(length(opa.panels)*2+1)]<-c("SentrixBarcode_A","SentrixPosition_A","SentrixBarcode_B","SentrixPosition_B","SentrixBarcode_C","SentrixPosition_C","SentrixBarcode_D","SentrixPosition_D")[1:(length(opa.panels)*2)] 
colnames(samplesheet)[1]<-"Sample_Name" 
idx<-match(rows,samples$Sample_Name) 
samplesheet<-cbind(samplesheet,samples[idx,c("Sample_Group","NorTum","Sample")]) 
ss_header<-c("[Header]","Investigator Name,Sander","Project Name,HNPCC_NA_MYH","Experiment Name,OPA1+2+3+4","Date,5062012","[Manifests]") 
for (i in 1:length(opa.panels)) ss_header<-c(ss_header,paste(LETTERS[i],opa.panels[i],sep=",")) 
ss_header<-c(ss_header,"[Data]") 
writeLines(ss_header,"Samplesheet5701.csv") 
write.table(samplesheet,file="Samplesheet5701.csv",sep=",",row.names=FALSE,quote=FALSE,append=TRUE,na="") 

##Samplesheet for GS0006561-OPA 
rows2<-unique(samples2$Sample_Name) 
samplesheet2<-rows2 
opa.panels2<-sort(unique(samples2$Pool_ID)) 
for (j in 1:length(opa.panels2)){ 
    samps2<-samples2[samples2$Pool_ID == opa.panels2[j],] 
    idx2<-match(samps2$Sample_Name,rows2) 
    samplesheet2<-cbind(samplesheet2,samps2$Sentrix_ID[idx2],samps2$Sentrix_Position[idx2]) 
} 
colnames(samplesheet2)[2:(length(opa.panels2)*2+1)]<-c("SentrixBarcode_A","SentrixPosition_A","SentrixBarcode_B","SentrixPosition_B","SentrixBarcode_C","SentrixPosition_C","SentrixBarcode_D","SentrixPosition_D")[1:(length(opa.panels2)*2)] 
colnames(samplesheet2)[1]<-"Sample_Name" 
idx2<-match(rows2,samples2$Sample_Name) 
samplesheet2<-cbind(samplesheet2,samples2[idx2,c("Sample_Group","NorTum","Sample")]) 
ss_header<-c("[Header]","Investigator Name,Sander","Project Name,HNPCC_NA_MYH","Experiment Name,OPA1+2+3+4","Date,5062012","[Manifests]") 
for (j in 1:length(opa.panels2)) ss_header<-c(ss_header,paste(LETTERS[j],opa.panels2[j],sep=",")) 
ss_header<-c(ss_header,"[Data]") 
writeLines(ss_header,"Samplesheet6561.csv") 
write.table(samplesheet2,file="Samplesheet6561.csv",sep=",",row.names=FALSE,quote=FALSE,append=TRUE,na="") 

The ##Samplesheet for GS0005701 part creates a data.frame, while the ##Samplesheet for GS0006561-OPA part creates a matrix, even though both use the same code and the same input data.

The input data looks like this:

Sample Data

For copy and paste:

    Sample Sample_Name Sample_Group NorTum Sentrix_ID Sentrix_Position       Pool_ID Folderdate
1 00-04193   00-04193N     HNPCC_UV      N    1495421        R007_C012 GS0006564-OPA  Exp060410
2 00-04193   00-04193N     HNPCC_UV      N    1495447        R007_C012 GS0006562-OPA  Exp060410
3 00-04193   00-04193N     HNPCC_UV      N    1495447        R007_C006 GS0006561-OPA  Exp060410
4 00-04193   00-04193N     HNPCC_UV      N    1495421        R007_C006 GS0006563-OPA  Exp060410
5 00-04193   00-04193N     HNPCC_UV      N    1460498        R007_C005 GS0006561-OPA  Exp060516
6 00-04193   00-04193N     HNPCC_UV      N    1460498        R007_C012 GS0006564-OPA  Exp060516

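I know this is a hard question to answer, but I really hope someone can give me a hint about how one part could produce a data.frame and the other a matrix.

Thanks a lot in advance!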

+1

You could replace the picture of the data (and the broken *copy and paste* section) with the output of dput(head(samples)) – mnel
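For reference, a sketch of what that suggestion looks like in practice. The data frame below is a made-up stand-in with a few of the columns from the question, not the original data. `dput` prints R code that recreates the object, so others can paste it straight into their own session.

```r
# Hypothetical stand-in for the question's `samples` data frame
samples <- data.frame(
  Sample_Name = c("00-04193N", "00-04193N"),
  Sentrix_ID = c(1495421, 1495447),
  Pool_ID = c("GS0006564-OPA", "GS0006562-OPA"),
  stringsAsFactors = FALSE
)

# Print a copy-pasteable text representation of the first rows
dput(head(samples))

# Round trip: evaluating the dput text rebuilds an equivalent object
dput_text <- paste(capture.output(dput(head(samples))), collapse = "\n")
restored <- eval(parse(text = dput_text))
```

Pasting the printed `structure(...)` call into a post gives readers exactly the same column types as the original object, which a screenshot or a whitespace-separated listing cannot guarantee.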

+2

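This question is too messy; you should aim for a small reproducible example. As for the data.frame vs. matrix question: it is most likely because one output contains a mix of numeric and character values, while the other contains only numeric or only character values (the latter is more likely, given the input) – Hansi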
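To illustrate the general matrix vs. data.frame behaviour of `cbind`, here is a small sketch with made-up values (not the original data). Binding plain vectors gives a matrix, and mixing numbers with strings coerces everything to character; as soon as one argument is a data.frame, the result is a data.frame.

```r
# Hypothetical values for illustration only
ids <- c(1495421, 1495447)
positions <- c("R007_C012", "R007_C006")

# Only atomic vectors: cbind returns a (character) matrix
m <- cbind(ids, positions)
is.matrix(m)       # TRUE
is.character(m)    # TRUE, the numeric ids were coerced to strings

# One data.frame among the arguments: cbind returns a data.frame
df <- cbind(m, data.frame(NorTum = c("N", "N")))
is.data.frame(df)  # TRUE
```

Checking `class()` or `str()` on the intermediate objects after each `cbind` call is a quick way to see at which step the two parts of a script start to diverge.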

+0

Thanks for the answers. I've found what caused the problem: it has to do with the `match` statement and swapping the arguments, as in `idx<-match(rows,samples$Sample_Code)` – Sanshine

Answers

1

The correct answer to this question is to swap the arguments in the indexing step.

idx<-match(samps$Sample_Name,rows) 

should be changed to:

idx<-match(rows,samps$Sample_Code) 

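so that idx has the same length as rows (match returns one position per element of its first argument) and the selected values line up with rows.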
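The effect of the argument order can be seen in a minimal sketch. The vectors below are made up for illustration and are not the original data; the only assumption is that a sample name can occur more than once within a pool, as it does in the listing in the question.

```r
# Hypothetical toy data: one unique sample name, two rows for it in one pool
rows <- c("00-04193N")
sample_names <- c("00-04193N", "00-04193N")

# Original order: one index per element of sample_names
idx_old <- match(sample_names, rows)

# Swapped order: one index per element of rows
idx_new <- match(rows, sample_names)

length(idx_old)  # 2, longer than rows, so cbind has to recycle
length(idx_new)  # 1, same length as rows
```

With the swapped order, `match` picks the first matching row for each unique name, so any additional rows for the same sample in that pool are ignored rather than changing the number of rows in the sheet.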