2017-07-06

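I have a table "inputdf" with sample names in the column "SampleFileName", in random order. How do I stop tidyr spread from sorting the columns alphabetically?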

> colnames(inputdf) 
[1] "Dye/SamplePeak" "SampleFileName" "Marker"   "Allele"   "Size"   "Height"   
[7] "Area"   "DataPoint"  "flank"   "correction"  "start"   "end"   
[13] "control"  "iithreshold" "CAG"   

I am using tidyr to spread the values in the "Height" column into separate columns, with each column named after a value in "SampleFileName".

library(tidyr) 
height <- spread(inputdf, key=SampleFileName, value=Height, fill = 0, convert = FALSE) #Extract heights into separate columns for each sample 

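My samples are not in alphabetical order in the "SampleFileName" column, and I would like to keep their order. However, spread automatically sorts the new columns alphabetically. I would appreciate your help!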

> colnames(height) 
[1] "Dye/SamplePeak"       "Marker"         
[3] "Allele"         "Size"         
[5] "Area"         "DataPoint"        
[7] "flank"         "correction"        
[9] "start"         "end"         
[11] "control"        "iithreshold"       
[13] "CAG"         "A01_MF20170522_FA_A01_2017-05-22_1.fsa" 
[15] "A01_MF20170623_FA_A01_2017-06-23_1.fsa" "A02_MF20170623_FA_A02_2017-06-23_1.fsa" 
[17] "A03_MF20170623_FA_A03_2017-06-23_1.fsa" "A05_MF20170623_FA_A05_2017-06-23_1.fsa" 
[19] "A06_MF20170623_FA_A06_2017-06-23_1.fsa" "A07_MF20170623_FA_A07_2017-06-23_1.fsa" 
[21] "A08_MF20170623_FA_A08_2017-06-23_1.fsa" "A09_MF20170623_FA_A09_2017-06-23_1.fsa" 
[23] "A10_MF20170623_FA_A10_2017-06-23_1.fsa" "A11_MF20170623_FA_A11_2017-06-23_1.fsa" 
[25] "A12_MF20170623_FA_A12_2017-06-23_1.fsa" "B01_MF20170623_FA_B01_2017-06-23_1.fsa" 
[27] "B02_MF20170623_FA_B02_2017-06-23_1.fsa" "B03_MF20170623_FA_B03_2017-06-23_1.fsa" 
[29] "B04_MF20170623_FA_B04_2017-06-23_1.fsa" "B05_MF20170623_FA_B05_2017-06-23_1.fsa" 
[31] "B06_MF20170623_FA_B06_2017-06-23_1.fsa" "B07_MF20170623_FA_B07_2017-06-23_1.fsa" 
[33] "B08_MF20170522_FA_B08_2017-05-22_1.fsa" "B08_MF20170623_FA_B08_2017-06-23_1.fsa" 
[35] "C01_MF20170623_FA_C01_2017-06-23_1.fsa" "C02_MF20170529_FA_C02_2017-05-30_1.fsa" 
[37] "C02_MF20170623_FA_C02_2017-06-23_1.fsa" "C05_MF20170623_FA_C05_2017-06-23_1.fsa" 
[39] "C07_MF20170623_FA_C07_2017-06-23_1.fsa" "C08_MF20170623_FA_C08_2017-06-23_1.fsa" 
[41] "C09_MF20170623_FA_C09_2017-06-23_1.fsa" "C10_MF20170623_FA_C10_2017-06-23_1.fsa" 
[43] "C11_MF20170623_FA_C11_2017-06-23_1.fsa" "C12_MF20170623_FA_C12_2017-06-23_1.fsa" 
[45] "D02_MF20170623_FA_D02_2017-06-23_1.fsa" "D03_MF20170623_FA_D03_2017-06-23_1.fsa" 
[47] "D04_MF20170623_FA_D04_2017-06-23_1.fsa" "D05_MF20170623_FA_D05_2017-06-23_1.fsa" 
[49] "D06_MF20170623_FA_D06_2017-06-23_1.fsa" "D08_MF20170623_FA_D08_2017-06-23_1.fsa" 
[51] "D10_MF20170623_FA_D10_2017-06-23_1.fsa" "D11_MF20170623_FA_D11_2017-06-23_1.fsa" 
[53] "D12_MF20170623_FA_D12_2017-06-23_1.fsa" 

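The 'tidyverse' generally cares little about column order, since variables are usually referred to by name. You can reorder afterwards with 'select' (or not use 'spread'). – Axeman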


Thank you. Would you mind showing how to do this with select()? – Mike
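
A minimal sketch of the reordering with select() suggested in the comment above, assuming dplyr 1.0.0 or later (for all_of()). The toy data frame and the names toy, sample_order and id_cols are made up for illustration and do not come from the question.

```r
library(tidyr)
library(dplyr)

# Hypothetical toy data standing in for inputdf; the sample names are
# deliberately not in alphabetical order.
toy <- data.frame(
  Marker = c("m1", "m1", "m2", "m2"),
  SampleFileName = c("C01.fsa", "A01.fsa", "C01.fsa", "A01.fsa"),
  Height = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Record the sample names in order of first appearance before spreading.
sample_order <- unique(toy$SampleFileName)

# spread() puts the new sample columns in alphabetical order.
wide <- spread(toy, key = SampleFileName, value = Height, fill = 0)

# Columns that are not samples keep their current order.
id_cols <- setdiff(colnames(wide), sample_order)

# Reorder: non-sample columns first, then samples in their original order.
wide_ordered <- select(wide, all_of(c(id_cols, sample_order)))
colnames(wide_ordered)
```

The same idea should carry over to the real data by replacing toy with inputdf.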

Answer


You could try something like this.

library(tidyr) 

# Get vector of current column names (excluding "SampleFileName" and "Height" as they will not exist in final dataset) and all of the SampleFileName values. 
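# as.character() guards against SampleFileName being a factor, in which case c() could return integer codes instead of names
cols <- c(colnames(inputdf)[!(colnames(inputdf) %in% c("SampleFileName","Height"))], as.character(unique(inputdf$SampleFileName)))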

# Spread the SampleFileName column 
height <- spread(inputdf, key=SampleFileName, value=Height, fill = 0, convert = FALSE) 

# Select the columns in the order they are listed in the cols vector 
height <- height[,cols] 
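
An alternative that avoids reordering afterwards is to turn the key column into a factor whose levels follow the order of first appearance, since spread() lays out the new columns in level order for a factor key. This is only a sketch on made-up data; toy is a hypothetical stand-in for inputdf.

```r
library(tidyr)

# Hypothetical toy data standing in for inputdf; the sample names are
# deliberately not in alphabetical order.
toy <- data.frame(
  Marker = c("m1", "m1", "m2", "m2"),
  SampleFileName = c("C01.fsa", "A01.fsa", "C01.fsa", "A01.fsa"),
  Height = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Make the key a factor whose levels are in order of first appearance.
toy$SampleFileName <- factor(toy$SampleFileName,
                             levels = unique(toy$SampleFileName))

# The new columns now follow the factor levels rather than the alphabet.
wide <- spread(toy, key = SampleFileName, value = Height, fill = 0)
colnames(wide)
```

This changes the type of the key column before spreading, so it is best done on a copy if the original character column is needed elsewhere.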

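Thanks for the code. I believe the append line should be: cols <- append(cols, colnames(height)[!(colnames(height) %in% cols)]) Unfortunately I get the following error: Error in '[.data.frame'(height, cols): undefined columns selected. Also, the cols vector seems to list the columns in alphabetical order rather than in the original column order. – Mike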


I think this new update should get you what you are looking for –


Thank you very much – Mike
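
For anyone reading this with tidyr 1.0.0 or later: spread() has been superseded by pivot_wider(), which by default names the new columns in order of first appearance rather than alphabetically. A minimal sketch on made-up data (toy is a hypothetical stand-in for inputdf, not the data from the question):

```r
library(tidyr)

# Hypothetical toy data standing in for inputdf; the sample names are
# deliberately not in alphabetical order.
toy <- data.frame(
  Marker = c("m1", "m1", "m2", "m2"),
  SampleFileName = c("C01.fsa", "A01.fsa", "C01.fsa", "A01.fsa"),
  Height = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# New columns are created in the order the sample names first appear.
# values_fill is given as a named list, which plays the role of fill = 0.
wide <- pivot_wider(
  toy,
  names_from = SampleFileName,
  values_from = Height,
  values_fill = list(Height = 0)
)
colnames(wide)
```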