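Average signal over several intervals in R

I have a genome-wide ChIP-seq signal imported from a bedGraph file into a GRanges object. I want to plot the average signal over fixed-width intervals covering all peaks. How can I extract the signal into numeric vectors so that I can average them?
Consider the following example: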
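library(GenomicRanges)
set.seed(1)
# Toy signal: ten 6-bp ranges on chr1, each with a random score
signal <- GRanges(
  seqnames = Rle(c("chr1"), c(10)),
  ranges = IRanges(1:10*10, end = 1:10*10+5),
  score = runif(10))
# Five 10-bp intervals whose start positions are jittered by 0 to 3 bp
intervals <- GRanges(
  seqnames = Rle(c("chr1"), c(5)),
  ranges = IRanges(1:5*20 + floor(runif(5)*4), width = 10))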
So the signal looks like this:
GRanges with 10 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
[1] chr1 [ 10, 15] * | 0.2655086631421
[2] chr1 [ 20, 25] * | 0.37212389963679
[3] chr1 [ 30, 35] * | 0.572853363351896
[4] chr1 [ 40, 45] * | 0.908207789994776
[5] chr1 [ 50, 55] * | 0.201681931037456
[6] chr1 [ 60, 65] * | 0.898389684967697
[7] chr1 [ 70, 75] * | 0.944675268605351
[8] chr1 [ 80, 85] * | 0.660797792486846
[9] chr1 [ 90, 95] * | 0.62911404389888
[10] chr1 [100, 105] * | 0.0617862704675645
---
seqlengths:
chr1
NA
and the intervals look like this:
GRanges with 5 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 [ 20, 29] *
[2] chr1 [ 40, 49] *
[3] chr1 [ 62, 71] *
[4] chr1 [ 81, 90] *
[5] chr1 [103, 112] *
---
seqlengths:
chr1
NA
So the vectors I want to average are:
Rle(c(0.372, 0), c(6, 4)) # [ 20, 29]
Rle(c(0.908, 0), c(6, 4)) # [ 40, 49]
Rle(c(0.898, 0, 0.945), c(4, 4, 2)) # [ 62, 71]
Rle(c(0.661, 0, 0.629), c(5, 4, 1)) # [ 81, 90]
Rle(c(0.062, 0), c(3, 7)) # [103, 112]
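How can I do this without for loops and a lot of tedious, error-prone interval arithmetic? I was hoping the GenomicRanges package would include this kind of functionality, but I can't find it in the manual. I have been trying subsetByOverlaps, but it does not seem to carry the signal scores through to the result, nor does it help extract the Rle vectors above.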
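
One possible direction, sketched here rather than offered as a definitive answer, is to turn the scored ranges into a per-base coverage vector with coverage(signal, weight = "score") and then cut it with Views, one view per interval. The sketch makes two assumptions that are not part of the example above: a made-up chromosome length of 120, so that the last interval [103, 112] stays inside the coverage vector, and a single chromosome, so that only cov[["chr1"]] is used. With real data the lengths would come from the genome and the same steps would be repeated per chromosome.

```r
library(GenomicRanges)

# Rebuild the example data from the question.
set.seed(1)
signal <- GRanges(
  seqnames = Rle("chr1", 10),
  ranges   = IRanges(1:10 * 10, end = 1:10 * 10 + 5),
  score    = runif(10))
intervals <- GRanges(
  seqnames = Rle("chr1", 5),
  ranges   = IRanges(1:5 * 20 + floor(runif(5) * 4), width = 10))

# Assumption: 120 is an invented chromosome length, chosen only so that
# the interval [103, 112] does not run past the end of the coverage.
seqlengths(signal) <- c(chr1 = 120)

# Per-base signal: positions inside a range carry that range's score,
# uncovered positions are 0.
cov <- coverage(signal, weight = "score")

# One view per interval on the chr1 coverage (single-chromosome case).
v <- Views(cov[["chr1"]], start = start(intervals), end = end(intervals))

# Expand each view to a numeric vector and stack them: 5 intervals x 10 bases.
mat <- t(sapply(seq_along(v), function(i) as.numeric(v[[i]])))

# Position-wise average profile across the intervals.
profile <- colMeans(mat)

# Mean signal within each interval, if a single number per interval is enough.
per_interval <- viewMeans(v)
```

Here mat holds one row per interval, so colMeans(mat) gives the averaged profile to plot, while viewMeans(v) gives one mean per interval without expanding the views at all. The sapply call is still a loop in disguise, but it involves no interval arithmetic. It only works as written because all intervals share the same width.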