Join data based on a date/time range in R

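I have a file (location) that has x, y coordinates and a date/time stamp. I want to pull information from a second table (weather) that has a "similar" date/time variable plus the covariates (temperature and wind speed). The catch is that the date/time values are not exactly the same in the two tables. For each location record I want to select the weather record that is closest in time. I know I need to do some looping, and that's about it.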

Example location:

x y date/time
1 3 01/02/2003 18:00
2 3 01/02/2003 19:00
3 4 01/03/2003 23:00
2 5 01/04/2003 02:00

Example weather:

date/time        temp wind
01/01/2003 13:00 12   15
01/02/2003 16:34 10   16
01/02/2003 20:55 14   22
01/02/2003 21:33 14   22
01/03/2003 00:22 13   19
01/03/2003 14:55 12   12
01/03/2003 18:00 10   12
01/03/2003 23:44 2    33
01/04/2003 01:55 6    22

So the final output would be a table with the location data alongside the correct "best" matched weather data:

x y datetime              datetime         temp wind
1 3 01/02/2003 18:00 ---- 01/02/2003 16:34 10   16
2 3 01/02/2003 19:00 ---- 01/02/2003 20:55 14   22
3 4 01/03/2003 23:00 ---- 01/03/2003 00:22 13   19
2 5 01/04/2003 02:00 ---- 01/04/2003 01:55 6    22

Any suggestions on where to start? I am trying to do this in R.

2011-03-24 Kerry

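I would hope you can solve this without traditional loops. A function from the apply family plus approx() might do the job. As for where to start: give us some proper data to work with. Rather than pasting text, construct your data in R and then paste the result of dput() here, so that we can easily rebuild the data and write code that can be tested. – Andrie 2011-03-24 21:47:35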

Why does row 3 of location join to row 5 of weather? Isn't row 8 of weather the closest to 01/03/2003 23:00? – 2011-03-24 21:57:23

@Matthew - You are right, that was a mistake made while producing the data on the fly. – Kerry 2011-03-24 22:52:47

I needed to bring the data in with date and time as separate columns, then paste them together and format the result:

location$dt.time <- as.POSIXct(paste(location$date, location$time), 
           format="%m/%d/%Y %H:%M")

And the same for weather.
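For anyone who wants to reproduce this, here is a minimal sketch that builds both tables from the example in the question, with date and time held as separate character columns. The helper `parse_dt` and the `tz = "UTC"` argument are assumptions added for illustration (the code above relies on the session's default time zone).

```r
# Example data from the question, with date and time as separate columns
location <- data.frame(
  x    = c(1, 2, 3, 2),
  y    = c(3, 3, 4, 5),
  date = c("01/02/2003", "01/02/2003", "01/03/2003", "01/04/2003"),
  time = c("18:00", "19:00", "23:00", "02:00"),
  stringsAsFactors = FALSE
)
weather <- data.frame(
  date = c("01/01/2003", "01/02/2003", "01/02/2003", "01/02/2003", "01/03/2003",
           "01/03/2003", "01/03/2003", "01/03/2003", "01/04/2003"),
  time = c("13:00", "16:34", "20:55", "21:33", "00:22",
           "14:55", "18:00", "23:44", "01:55"),
  temp = c(12, 10, 14, 14, 13, 12, 10, 2, 6),
  wind = c(15, 16, 22, 22, 19, 12, 12, 33, 22),
  stringsAsFactors = FALSE
)

# Hypothetical helper: paste date and time, then parse to POSIXct.
# tz = "UTC" is an assumption so results do not depend on the local time zone.
parse_dt <- function(d) {
  as.POSIXct(paste(d$date, d$time), format = "%m/%d/%Y %H:%M", tz = "UTC")
}
location$dt.time <- parse_dt(location)
weather$dt.time  <- parse_dt(weather)
```

Pasting the output of `dput(location)` and `dput(weather)` into a question gives others the same objects to test against.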

Then, for each value of dt.time in location, find the entry in weather with the smallest absolute time difference:

sapply(location$dt.time, function(x) which.min(abs(difftime(x, weather$dt.time)))) 
# [1] 2 3 8 9 
cbind(location, weather[ sapply(location$dt.time, 
         function(x) which.min(abs(difftime(x, weather$dt.time)))), ]) 

  x y       date  time             dt.time       date  time temp wind             dt.time
2 1 3 01/02/2003 18:00 2003-01-02 18:00:00 01/02/2003 16:34   10   16 2003-01-02 16:34:00
3 2 3 01/02/2003 19:00 2003-01-02 19:00:00 01/02/2003 20:55   14   22 2003-01-02 20:55:00
8 3 4 01/03/2003 23:00 2003-01-03 23:00:00 01/03/2003 23:44    2   33 2003-01-03 23:44:00
9 2 5 01/04/2003 02:00 2003-01-04 02:00:00 01/04/2003 01:55    6   22 2003-01-04 01:55:00

cbind(location,
      weather[sapply(location$dt.time,
                     function(x) which.min(abs(difftime(x, weather$dt.time)))), ]
      )[c(1, 2, 5, 8, 9, 10)]  # pick columns

  x y             dt.time temp wind           dt.time.1
2 1 3 2003-01-02 18:00:00   10   16 2003-01-02 16:34:00
3 2 3 2003-01-02 19:00:00   14   22 2003-01-02 20:55:00
8 3 4 2003-01-03 23:00:00    2   33 2003-01-03 23:44:00
9 2 5 2003-01-04 02:00:00    6   22 2003-01-04 01:55:00
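The calls above recompute the same index each time. As a sketch of one possible tidy-up, the index can be computed once and the gap in minutes stored next to each match, which makes poor matches easy to spot. The names `nearest`, `matched`, `loc.time`, `weather.time` and `gap_mins` are illustrative, and `tz = "UTC"` is an assumption.

```r
fmt <- "%m/%d/%Y %H:%M"
location <- data.frame(
  x = c(1, 2, 3, 2),
  y = c(3, 3, 4, 5),
  dt.time = as.POSIXct(c("01/02/2003 18:00", "01/02/2003 19:00",
                         "01/03/2003 23:00", "01/04/2003 02:00"),
                       format = fmt, tz = "UTC")
)
weather <- data.frame(
  dt.time = as.POSIXct(c("01/01/2003 13:00", "01/02/2003 16:34", "01/02/2003 20:55",
                         "01/02/2003 21:33", "01/03/2003 00:22", "01/03/2003 14:55",
                         "01/03/2003 18:00", "01/03/2003 23:44", "01/04/2003 01:55"),
                       format = fmt, tz = "UTC"),
  temp = c(12, 10, 14, 14, 13, 12, 10, 2, 6),
  wind = c(15, 16, 22, 22, 19, 12, 12, 33, 22)
)

# For each location time, the row of weather with the smallest absolute gap
nearest <- vapply(seq_along(location$dt.time), function(i) {
  gaps <- difftime(location$dt.time[i], weather$dt.time, units = "secs")
  which.min(abs(as.numeric(gaps)))
}, integer(1))

# Combine the location rows with their matched weather rows
matched <- data.frame(
  location[c("x", "y")],
  loc.time     = location$dt.time,
  weather.time = weather$dt.time[nearest],
  weather[nearest, c("temp", "wind")],
  gap_mins     = abs(as.numeric(difftime(location$dt.time,
                                         weather$dt.time[nearest],
                                         units = "mins"))),
  row.names = NULL
)
```

Keeping `gap_mins` in the result lets you filter out matches that are too far apart, if a maximum tolerance matters for your data.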

My answer seems a bit different from yours, since another reader has already questioned your hand-matching ability.

2011-03-24 23:12:33

Ha ha ha! Human error! Hence the need for automated computer processes. – Kerry 2011-03-24 23:47:31

One fast and short way may be to use data.table. If you create two data.tables, X and Y, both with keys, then the syntax is:

X[Y,roll=TRUE]

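We call that a rolling join because we roll the prevailing observation in X forward to match the row in Y. See the examples in ?data.table and the introduction vignette.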
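Applied to the question's data, a rolling join might look like the sketch below, assuming the data.table package is installed. Here `weather` plays the role of X and `location` the role of Y; the column name `dt.time` and `tz = "UTC"` are assumptions. Note that `roll = TRUE` picks the most recent weather record at or before each location time, which is not always the nearest one.

```r
library(data.table)

fmt <- "%m/%d/%Y %H:%M"
weather <- data.table(
  dt.time = as.POSIXct(c("01/01/2003 13:00", "01/02/2003 16:34", "01/02/2003 20:55",
                         "01/02/2003 21:33", "01/03/2003 00:22", "01/03/2003 14:55",
                         "01/03/2003 18:00", "01/03/2003 23:44", "01/04/2003 01:55"),
                       format = fmt, tz = "UTC"),
  temp = c(12, 10, 14, 14, 13, 12, 10, 2, 6),
  wind = c(15, 16, 22, 22, 19, 12, 12, 33, 22)
)
location <- data.table(
  x = c(1, 2, 3, 2),
  y = c(3, 3, 4, 5),
  dt.time = as.POSIXct(c("01/02/2003 18:00", "01/02/2003 19:00",
                         "01/03/2003 23:00", "01/04/2003 02:00"),
                       format = fmt, tz = "UTC")
)

# Key both tables on the time column
setkey(weather, dt.time)
setkey(location, dt.time)

# For each location row, carry the prevailing (last earlier) weather row forward
rolled <- weather[location, roll = TRUE]
```

In `rolled`, the third location (01/03/2003 23:00) gets the 18:00 weather record rather than the closer 23:44 one, because the join only rolls forward.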

Another way to do this is the zoo package, which has locf (last observation carried forward); other packages may offer it too.

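I'm not sure whether you mean closest in location or in time. If location, and that location is x, y coordinates, then you will need some distance measure in 2D space. data.table only does univariate "closest", e.g. by time. On a second reading of your question, though, it does seem that you mean closest in time.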

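EDIT: I have now seen the example data. data.table won't do this in one step because, although it can roll forwards or backwards, it won't roll to the nearest. You could do it with an extra step: use which=TRUE, then test whether the row after the prevailing one is actually closer.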
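A minimal sketch of that extra step, assuming the data.table package is installed: `which = TRUE` returns the matched row numbers of `weather`, and the row after each is then compared by time gap. The names `prev`, `nxt` and `nearest` are illustrative, `tz = "UTC"` is an assumption, and the sketch assumes every location time falls at or after the first weather record.

```r
library(data.table)

fmt <- "%m/%d/%Y %H:%M"
weather <- data.table(
  dt.time = as.POSIXct(c("01/01/2003 13:00", "01/02/2003 16:34", "01/02/2003 20:55",
                         "01/02/2003 21:33", "01/03/2003 00:22", "01/03/2003 14:55",
                         "01/03/2003 18:00", "01/03/2003 23:44", "01/04/2003 01:55"),
                       format = fmt, tz = "UTC"),
  temp = c(12, 10, 14, 14, 13, 12, 10, 2, 6),
  wind = c(15, 16, 22, 22, 19, 12, 12, 33, 22)
)
location <- data.table(
  x = c(1, 2, 3, 2),
  y = c(3, 3, 4, 5),
  dt.time = as.POSIXct(c("01/02/2003 18:00", "01/02/2003 19:00",
                         "01/03/2003 23:00", "01/04/2003 02:00"),
                       format = fmt, tz = "UTC")
)
setkey(weather, dt.time)
setkey(location, dt.time)

# Row of weather prevailing at each location time (rolled forward)
prev <- weather[location, roll = TRUE, which = TRUE]
# The row after the prevailing one, capped at the last row
nxt <- pmin(prev + 1L, nrow(weather))

# Absolute gap in minutes between location times and a set of weather rows
gap <- function(rows) {
  abs(as.numeric(difftime(location$dt.time, weather$dt.time[rows], units = "mins")))
}

# Keep the following row only when it is strictly closer
nearest <- ifelse(gap(nxt) < gap(prev), nxt, prev)
result  <- cbind(location, weather[nearest, .(weather.time = dt.time, temp, wind)])
```

Newer data.table releases also accept `roll = "nearest"`; whether your installed version supports it is worth checking in ?data.table before relying on it.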

2011-03-24 21:49:27

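Thanks, I'll look into that and see whether it does the job better or faster. So far, this is what I have put together from things I have done and seen in MySQL scripts: **for (i in 1:nrow(loc)) { index = which.min(abs(loc$DateTime[i] - weather$DateTime)); loc$WndSp[i] = weather$WndSp[index] }** – Kerry 2011-03-24 22:47:01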
