How do I combine two ordered series of different resolutions in R?

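I have some borehole geological data, ordered by depth from the surface down to some total depth. There are several datasets that I want to merge into one, each with a different resolution. The highest-resolution dataset has the desired output resolution (it also has evenly spaced depths, which the other datasets do not). I have many of these to manage, so editing spreadsheets by hand would take too long.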

For example, here is some of the high-resolution data for a selected depth range (approximately 151 to 152):

data <- 
structure(list(DEPTH = c(150.876, 151.0284, 151.1808, 151.3332, 
151.4856, 151.638, 151.7904, 151.9428, 152.0952, 152.2476), DT = c(435.6977, 
437.6732, 441.4934, 444.6542, 445.771, 444.4603, 443.5679, 444.5042, 
447.3567, 450.4373), GR = c(13.8393, 14.549, 15.7866, 16.9114, 
18.4841, 18.8695, 17.7494, 16.7178, 12.8839, 11.7309)), .Names = c("DEPTH", 
"DT", "GR"), row.names = c(NA, -10L), class = "data.frame")

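(The full log data file is much larger and I don't know how to make it available here, so instead I took a subset that matches an interval in the next dataset, analyses.)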

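There is also some lower-resolution discrete numeric data, where depth is given as a range that does not line up with the depths of the logs data above. Each row represents a sampling interval of a given length over a particular depth range, and the values do not vary within that range: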

analyses <- 
structure(list(from = c(151L, 198L, 284L, 480L), to = c(151.1, 
198.1, 284.1, 480.1), TC = c(1.276476312, 1.383553608, 1.46771308, 
1.125049954), DEN = c(1.842555733, 1.911724824, 1.997592565, 
NA), PORO = c(50.21947697, 44.26392579, 39.31309757, NA)), .Names = c("from", 
"to", "TC", "DEN", "PORO"), class = "data.frame", row.names = c(NA, 
-4L))

And some lower-resolution categorical data, also given over unequal depth ranges:

units <- 
structure(list(from = c(0, 100, 450, 535, 617.89), to = c(100, 
450, 535, 617.89, 619.25), strat = structure(c(5L, 1L, 2L, 3L, 
4L), .Label = c("Formation A", "Formation B", 
"Group C", "Group D", "Unassigned"), class = "factor")), .Names = c("from", 
"to", "strat"), class = "data.frame", row.names = c(NA, -5L))

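The expected result is the data at the resolution of the first dataset (the logs, stored as data above), with the second and third datasets merged in. In this case, that would give this data frame: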

DEPTH     DT      GR     TC    DEN   PORO   Unit
150.8760  435.69  13.83  NA    NA    NA     Formation A
151.0284  437.67  14.54  1.27  1.84  50.21  Formation A
151.1808  441.49  15.78  NA    NA    NA     Formation A
151.3332  444.65  16.91  NA    NA    NA     Formation A
151.4856  445.77  18.48  NA    NA    NA     Formation A
151.6380  444.46  18.86  NA    NA    NA     Formation A
151.7904  443.56  17.74  NA    NA    NA     Formation A
151.9428  444.50  16.71  NA    NA    NA     Formation A
152.0952  447.35  12.88  NA    NA    NA     Formation A
152.2476  450.43  11.73  NA    NA    NA     Formation A

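I tried merging the data frames and then using na.approx to fill the gaps, but the problem is that many of the variables in logs contain NaN or NA values that I do not want interpolated. They need to stay NA.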

Source

2013-06-13 a different ben

Please include the expected result. – Roland

Yes, I should have included that in the first place. I have added it now. –

You can use merge or sqldf to join your data frames.

library(sqldf) 

# If you know that each depth (in the first data.frame) 
# is in exactly one interval (in the second and third data.frames) 
sqldf(" 
    SELECT * 
    FROM data A, analyses B, units C 
    WHERE B.[from] <= A.DEPTH AND A.DEPTH < B.[to] -- Need to quote some of the column names 
    AND C.[from] <= A.DEPTH AND A.DEPTH < C.[to] 
") 

# If each depth (in the first data.frame) 
# is in at most one interval (in the second and third data.frames) 
sqldf(" 
    SELECT * 
    FROM data A 
    LEFT JOIN analyses B ON B.[from] <= A.DEPTH AND A.DEPTH < B.[to] 
    LEFT JOIN units C ON C.[from] <= A.DEPTH AND A.DEPTH < C.[to] 
    ORDER BY DEPTH 
")
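Because SELECT * also returns the from and to columns of both lookup tables, a variant that names its columns may be closer to the expected result in the question. The following is only a sketch: the sample data is rebuilt compactly so the block runs on its own, the result name merged is an illustrative choice, and the rename of strat to Unit is taken from the expected output rather than from the answer itself.

```r
library(sqldf)

# Sample data from the question, rebuilt compactly so this block is self-contained.
data <- data.frame(
  DEPTH = c(150.876, 151.0284, 151.1808, 151.3332, 151.4856,
            151.638, 151.7904, 151.9428, 152.0952, 152.2476),
  DT = c(435.6977, 437.6732, 441.4934, 444.6542, 445.771,
         444.4603, 443.5679, 444.5042, 447.3567, 450.4373),
  GR = c(13.8393, 14.549, 15.7866, 16.9114, 18.4841,
         18.8695, 17.7494, 16.7178, 12.8839, 11.7309)
)
analyses <- data.frame(
  from = c(151, 198, 284, 480),
  to   = c(151.1, 198.1, 284.1, 480.1),
  TC   = c(1.276476312, 1.383553608, 1.46771308, 1.125049954),
  DEN  = c(1.842555733, 1.911724824, 1.997592565, NA),
  PORO = c(50.21947697, 44.26392579, 39.31309757, NA)
)
units <- data.frame(
  from  = c(0, 100, 450, 535, 617.89),
  to    = c(100, 450, 535, 617.89, 619.25),
  strat = c("Unassigned", "Formation A", "Formation B", "Group C", "Group D"),
  stringsAsFactors = FALSE
)

# Same LEFT JOIN as the second query, but with an explicit column list
# so the interval bounds are dropped and strat is returned as Unit.
merged <- sqldf("
    SELECT A.DEPTH, A.DT, A.GR, B.TC, B.DEN, B.PORO, C.strat AS Unit
    FROM data A
    LEFT JOIN analyses B ON B.[from] <= A.DEPTH AND A.DEPTH < B.[to]
    LEFT JOIN units C ON C.[from] <= A.DEPTH AND A.DEPTH < C.[to]
    ORDER BY A.DEPTH
")
```

Depths that fall in no analyses interval get NA for TC, DEN and PORO, and nothing is interpolated.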

Source

2013-06-13 17:12:31

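This looks very useful, but I am on R 2.14 and a dependency requires 2.15, so I have not tried it yet. I will try upgrading and let you know. –

Yes, that second query did it. Very nice, thanks. I had never joined on a range before. –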
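For readers who cannot install sqldf, the same range lookup can probably be done in base R. The sketch below is not part of the answer above. The helper match_interval and the result name merged are hypothetical names introduced for illustration. It assumes half-open intervals [from, to), as in the queries, and stops if a depth falls in more than one interval.

```r
# Sample data from the question, rebuilt compactly so this block is self-contained.
data <- data.frame(
  DEPTH = c(150.876, 151.0284, 151.1808, 151.3332, 151.4856,
            151.638, 151.7904, 151.9428, 152.0952, 152.2476),
  DT = c(435.6977, 437.6732, 441.4934, 444.6542, 445.771,
         444.4603, 443.5679, 444.5042, 447.3567, 450.4373),
  GR = c(13.8393, 14.549, 15.7866, 16.9114, 18.4841,
         18.8695, 17.7494, 16.7178, 12.8839, 11.7309)
)
analyses <- data.frame(
  from = c(151, 198, 284, 480),
  to   = c(151.1, 198.1, 284.1, 480.1),
  TC   = c(1.276476312, 1.383553608, 1.46771308, 1.125049954),
  DEN  = c(1.842555733, 1.911724824, 1.997592565, NA),
  PORO = c(50.21947697, 44.26392579, 39.31309757, NA)
)
units <- data.frame(
  from  = c(0, 100, 450, 535, 617.89),
  to    = c(100, 450, 535, 617.89, 619.25),
  strat = c("Unassigned", "Formation A", "Formation B", "Group C", "Group D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each depth, return the row index of the interval
# [from, to) that contains it, or NA when no interval does.
match_interval <- function(depth, from, to) {
  vapply(depth, function(d) {
    hit <- which(!is.na(d) & from <= d & d < to)
    if (length(hit) > 1) stop("depth ", d, " falls in more than one interval")
    if (length(hit) == 0) NA_integer_ else hit
  }, integer(1))
}

a_idx <- match_interval(data$DEPTH, analyses$from, analyses$to)
u_idx <- match_interval(data$DEPTH, units$from, units$to)

# Indexing with NA yields NA rows, so unmatched depths stay NA.
merged <- cbind(
  data,
  analyses[a_idx, c("TC", "DEN", "PORO")],
  Unit = units$strat[u_idx]
)
rownames(merged) <- NULL
```

The log columns are copied through unchanged, so any NA or NaN already present in them is left as it is. The lookup compares every depth against every interval, which should be fine for a few thousand rows but may be slow for very large logs.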
