I have a data set like the one shown below. I want to compare rows and find the intersections between their intervals.
pos1<-c(5,15,25,40,80,5,18,22,38,84,5,16,50,92,31,50,20,30,50,70,27,50,60,50,90,20,40)
pos2<-c(10,17,30,42,90,10,20,24,42,87,10,19,52,100,40,70,25,32,60,90,30,60,71,60,100,25,50)
chr<-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2)
n<-c(25,65,78,56,35,78,58,98,14,25,65,85,98,74,20,36,48,98,52,69,21,47,53,10,12,37,82)
pop<-c("A","A","A","A","A","B","B","B","B","C","C","C","C","C","D","D","A","A","A","A","B","B","B","C","C","D","D")
data<-data.frame(pos1,pos2,chr,pop,n)
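pos1 and pos2 give the start and end points of an interval for each chr and pop. My goal is to find which intervals intersect among pops A, B and C (not D), and which intervals are unique to each pop.
So, for the unique intervals, I would expect a result data.frame like the following: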
pos1.u<-c(25,50,92,20,30,27,90)
pos2.u<-c(30,52,100,25,32,30,100)
chr.u<-c(1,1,1,2,2,2,2)
pop.u<-c("A","B","C","A","A","B","C")
n.u<-c(78,98,74,48,98,21,12)
data.u<-data.frame(pos1.u,pos2.u,chr.u,pop.u,n.u)
And for the intervals that intersect among the 3 pops, a data.frame like this:
pos1.c<-c(5,15,40,80,5,38,85,5,16,50,70,50,60,50)
pos2.c<-c(10,17,42,90,10,42,87,10,19,60,90,60,71,60)
chr.c<-c(1,1,1,1,1,1,1,1,1,2,2,2,2,2)
pop.c<-c("A","A","A","A","B","B","B","C","C","A","A","B","B","C")
n.c<-c(25,65,56,35,78,14,25,65,85,52,69,47,53,10)
data.c<-data.frame(pos1.c,pos2.c,chr.c,pop.c,n.c)
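I don't know how to write a script to do this. Can you help me?
What do you mean by "intersect among the 3 pops"? As best I can tell, only two combinations of 'pos1', 'pos2' and 'chr' appear in all of A, B and C: 5, 10 and 1, and 50, 60 and 2. – ulfelder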
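The exact-match reading in this comment can be checked with a short base R sketch. It assumes the `data` frame from the question; the names `abc`, `key`, `n_pops` and `exact` are introduced here for illustration only and are not from the original post.

```r
# Rebuild the question's data so the snippet runs on its own.
pos1 <- c(5,15,25,40,80,5,18,22,38,84,5,16,50,92,31,50,20,30,50,70,27,50,60,50,90,20,40)
pos2 <- c(10,17,30,42,90,10,20,24,42,87,10,19,52,100,40,70,25,32,60,90,30,60,71,60,100,25,50)
chr  <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2)
n    <- c(25,65,78,56,35,78,58,98,14,25,65,85,98,74,20,36,48,98,52,69,21,47,53,10,12,37,82)
pop  <- c("A","A","A","A","A","B","B","B","B","C","C","C","C","C","D","D",
          "A","A","A","A","B","B","B","C","C","D","D")
data <- data.frame(pos1, pos2, chr, pop, n, stringsAsFactors = FALSE)

# Keep pops A, B and C only (D is excluded in the question).
abc <- data[data$pop %in% c("A", "B", "C"), ]

# One key per (pos1, pos2, chr) combination (hypothetical helper).
key <- paste(abc$pos1, abc$pos2, abc$chr, sep = "_")

# Count the distinct pops that share each key.
n_pops <- tapply(abc$pop, key, function(p) length(unique(p)))

# Keys present in all three pops are the exact matches.
exact <- names(n_pops)[n_pops == 3]
print(exact)
```

On the question's data, `exact` holds only the two combinations named in the comment (5, 10 on chr 1 and 50, 60 on chr 2).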
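Those are the segments that intersect completely. But I am interested in every segment that overlaps. Maybe I should have said overlap instead of intersect... sorry. So I want to find every segment that overlaps and every segment that does not! Thanks for your question! I hope you can help me further... – Cisco
And by "overlap", you mean that some part of the sequence running from 'pos1' to 'pos2' for a particular combination of 'chr' and 'pop' also appears in at least one sequence with the same 'chr' value but a different 'pop' value, right? – ulfelder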
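The definition of overlap in the last comment could be sketched in base R roughly as follows. This is a minimal sketch, not a confirmed answer. It assumes that intervals which only touch at an endpoint (for example 27-30 and 30-32) do not count as overlapping, which seems to match the expected unique frame in the question. The names `abc`, `overlaps_other`, `overlapping` and `unique_rows` are hypothetical.

```r
# Rebuild the question's data so the snippet runs on its own.
pos1 <- c(5,15,25,40,80,5,18,22,38,84,5,16,50,92,31,50,20,30,50,70,27,50,60,50,90,20,40)
pos2 <- c(10,17,30,42,90,10,20,24,42,87,10,19,52,100,40,70,25,32,60,90,30,60,71,60,100,25,50)
chr  <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2)
n    <- c(25,65,78,56,35,78,58,98,14,25,65,85,98,74,20,36,48,98,52,69,21,47,53,10,12,37,82)
pop  <- c("A","A","A","A","A","B","B","B","B","C","C","C","C","C","D","D",
          "A","A","A","A","B","B","B","C","C","D","D")
data <- data.frame(pos1, pos2, chr, pop, n, stringsAsFactors = FALSE)

# Keep pops A, B and C only (D is excluded in the question).
abc <- data[data$pop %in% c("A", "B", "C"), ]

# For each row, test whether any row with the same chr and a different pop
# shares part of its interval.
overlaps_other <- vapply(seq_len(nrow(abc)), function(i) {
  same_chr  <- abc$chr == abc$chr[i]
  other_pop <- abc$pop != abc$pop[i]
  # Strict inequalities: touching endpoints are NOT an overlap (assumption).
  shares    <- abc$pos1 < abc$pos2[i] & abc$pos2 > abc$pos1[i]
  any(same_chr & other_pop & shares)
}, logical(1))

overlapping <- abc[overlaps_other, ]   # overlap at least one other pop
unique_rows <- abc[!overlaps_other, ]  # overlap no other pop
```

Switching to `<=` and `>=` would treat touching intervals as overlapping. The result keeps the original column names and may differ in a few rows from the hand-built frames in the question; for example, the pop B interval 22-24 on chr 1 comes out as unique under this reading.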