Arranging data rows in R

I have a list of OMIM genes with their corresponding diseases (about 15,000 genes) that looks like this:

SLC6A8,CRTR,CCDS1 Cerebral creatine deficiency syndrome 1, 300352 (3) 
BCAP31,BAP31,DXS1357E,DDCH Deafness, dystonia, and cerebral hypomyelination 
ABCD1,ALD,AMN Adrenoleukodystrophy, 300100 (3), X-linked recessive 
PLXNB3,PLXN6 NA

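For some diseases, several gene names are associated with the same disease. I would like to reorganize the list so that each row holds a single gene name and its associated disease: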

SLC6A8 Cerebral creatine deficiency syndrome 1, 300352 (3) 
CRTR Cerebral creatine deficiency syndrome 1, 300352 (3) 
CCDS1 Cerebral creatine deficiency syndrome 1, 300352 (3)

Is there a way to do this in R?

Asked 2017-04-13 by VasoGene

*"Is there a way to do this in R?"* Most likely, but what have you tried so far? This is not a code-writing service. – nrussell

I have used R for a few things, but in this case I don't know what exactly I should be looking for. I only want a hint, not necessarily code! – VasoGene

I'm not entirely sure what kind of data structure you have. Here is a quick solution that will hopefully get you what you are looking for:

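library(plyr); library(magrittr)  # ldply() comes from plyr, %>% from magrittr
# For row x: split the comma-separated gene names in "a" and pair each one with the disease in "b"
splitFn <- function(x) expand.grid(df[x,"a"] %>% as.character %>% strsplit(., ",") %>% unlist, df[x, "b"])
ldply(1:nrow(df), splitFn)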

       Var1 Var2
 1   SLC6A8 Cerebral creatine deficiency syndrome 1, 300352(3)
 2     CRTR Cerebral creatine deficiency syndrome 1, 300352(3)
 3    CCDS1 Cerebral creatine deficiency syndrome 1, 300352(3)
 4   BCAP31 Deafness, dystonia, and cerebral hypomyelination
 5    BAP31 Deafness, dystonia, and cerebral hypomyelination
 6 DXS1357E Deafness, dystonia, and cerebral hypomyelination
 7     DDCH Deafness, dystonia, and cerebral hypomyelination
 8    ABCD1 Adrenoleukodystrophy, 300100(3), X-linked recessive
 9      ALD Adrenoleukodystrophy, 300100(3), X-linked recessive
10      AMN Adrenoleukodystrophy, 300100(3), X-linked recessive
11   PLXNB3 <NA>
12    PLXN6 <NA>
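To see what the helper does for one row, here is a minimal sketch on a single gene list. The variable names `genes` and `one_row` are only illustrative. `strsplit()` breaks the comma-separated string into individual names, and `expand.grid()` pairs every name with the one disease string.

```r
# One comma-separated gene list, split into individual gene names
genes <- unlist(strsplit("SLC6A8,CRTR,CCDS1", ","))

# expand.grid() builds every combination of its arguments; with a single
# disease string this gives one row per gene name (columns Var1 and Var2)
one_row <- expand.grid(
  genes,
  "Cerebral creatine deficiency syndrome 1, 300352(3)",
  stringsAsFactors = FALSE
)
print(one_row)
```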

The data frame df used above was created with:

df <- structure(list(a = structure(c(4L, 2L, 1L, 3L), .Label = c("ABCD1,ALD,AMN", 
"BCAP31,BAP31,DXS1357E,DDCH", "PLXNB3,PLXN6", "SLC6A8,CRTR,CCDS1" 
), class = "factor"), b = structure(c(1L, 3L, 2L, NA), .Label = c(" Cerebral creatine deficiency syndrome 1, 300352(3)", 
"Adrenoleukodystrophy, 300100(3), X-linked recessive", "Deafness, dystonia, and cerebral hypomyelination" 
), class = "factor")), .Names = c("a", "b"), row.names = c(NA, 
-4L), class = "data.frame")
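If the data is still plain text lines like those in the question, the same reshaping can be sketched in base R without extra packages. This is only a rough sketch under two assumptions: gene names never contain spaces, and the first space on each line separates the gene list from the disease text. The names `lines`, `genes`, `disease` and `long` are made up for illustration.

```r
# Raw lines as in the question: gene names, a space, then the disease text
lines <- c(
  "SLC6A8,CRTR,CCDS1 Cerebral creatine deficiency syndrome 1, 300352 (3)",
  "PLXNB3,PLXN6 NA"
)

# Everything before the first space is the gene list; the rest is the disease
genes   <- strsplit(sub(" .*$", "", lines), ",", fixed = TRUE)
disease <- sub("^[^ ]* ", "", lines)
disease[disease == "NA"] <- NA  # treat the literal text "NA" as missing

# Repeat each disease once per gene name, then flatten the list of gene names
long <- data.frame(
  gene    = unlist(genes),
  disease = rep(disease, times = lengths(genes)),
  stringsAsFactors = FALSE
)
print(long)
```

Repeating the disease with `rep(..., times = lengths(genes))` avoids looping over rows, which may matter with roughly 15,000 entries.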

Answered 2017-04-13 10:20:28
