2017-01-02 63 views
2

Fast data population in R based on timestamps

Happy New Year, everyone.

On to my question:

I have two datasets.

Dataset 1

Time   Name   Value
6/1/2016 9:39 ABCD IS Equity 11.01
6/1/2016 9:44 ABCD IS Equity 11.05
6/1/2016 9:46 ABCD IS Equity 11.01
6/1/2016 9:58 ABCD IS Equity 11.01
6/1/2016 10:10 ABCD IS Equity 11.01
6/1/2016 10:13 ABCD IS Equity 11.01
6/1/2016 10:33 ABCD IS Equity 11.02
6/1/2016 10:42 ABCD IS Equity 11.02
6/1/2016 10:52 ABCD IS Equity 11.02
6/1/2016 10:56 ABCD IS Equity 11.06
6/1/2016 11:14 ABCD IS Equity 11.02
6/1/2016 11:25 ABCD IS Equity 11.03
6/1/2016 11:26 ABCD IS Equity 11.03
6/1/2016 11:29 ABCD IS Equity 11.03
6/1/2016 11:30 ABCD IS Equity 11.03
6/1/2016 11:40 ABCD IS Equity 11.03
6/1/2016 11:40 ABCD IS Equity 11.01
6/1/2016 11:44 ABCD IS Equity 11.01
6/1/2016 12:04 ABCD IS Equity 11.01

Dataset 2

Time2   Name2   Value2 
6/1/2016 9:42 ABCD IS Equity 123 
6/1/2016 9:45 ABCD IS Equity 124 
6/1/2016 9:45 ABCD IS Equity 125 
6/1/2016 10:00 ABCD IS Equity 126 
6/1/2016 10:14 ABCD IS Equity 127 
6/1/2016 10:14 ABCD IS Equity 128 
6/1/2016 10:14 ABCD IS Equity 129 
6/1/2016 10:41 ABCD IS Equity 130 
6/1/2016 10:45 ABCD IS Equity 131 
6/1/2016 10:56 ABCD IS Equity 132 
6/1/2016 10:58 ABCD IS Equity 133 
6/1/2016 11:26 ABCD IS Equity 134 
6/1/2016 11:27 ABCD IS Equity 135 
6/1/2016 11:30 ABCD IS Equity 136 
6/1/2016 11:32 ABCD IS Equity 137 
6/1/2016 11:40 ABCD IS Equity 138 
6/1/2016 11:42 ABCD IS Equity 139 
6/1/2016 11:45 ABCD IS Equity 140 
6/1/2016 12:05 ABCD IS Equity 141 

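Now I want to create a New column in Dataset 1. For each row of Dataset 1, it should be populated with the Value2 from Dataset 2 at the first row satisfying the condition Dataset2$Time2 > Dataset1$Time.

Here is the sample output, with New populated from Value2: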

Time   Name   Value New 
6/1/2016 9:39 ABCD IS Equity 11.01 123 
6/1/2016 9:44 ABCD IS Equity 11.05 124 
6/1/2016 9:46 ABCD IS Equity 11.01 126 
6/1/2016 9:58 ABCD IS Equity 11.01 126 
6/1/2016 10:10 ABCD IS Equity 11.01 127 
6/1/2016 10:13 ABCD IS Equity 11.01 127 
6/1/2016 10:33 ABCD IS Equity 11.02 130 
6/1/2016 10:42 ABCD IS Equity 11.02 131 
6/1/2016 10:52 ABCD IS Equity 11.02 132 
6/1/2016 10:56 ABCD IS Equity 11.06 133 
6/1/2016 11:14 ABCD IS Equity 11.02 134 
6/1/2016 11:25 ABCD IS Equity 11.03 134 
6/1/2016 11:26 ABCD IS Equity 11.03 135 
6/1/2016 11:29 ABCD IS Equity 11.03 136 
6/1/2016 11:30 ABCD IS Equity 11.03 137 
6/1/2016 11:40 ABCD IS Equity 11.03 139 
6/1/2016 11:40 ABCD IS Equity 11.01 139 
6/1/2016 11:44 ABCD IS Equity 11.01 140 
6/1/2016 12:04 ABCD IS Equity 11.01 141 

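Depending on the matching condition, different rows of Dataset 1 may be populated with the same value.

Solution I have tried:

I tried a simple for loop over [1:nrow(Dataset1)] that matches each row against Dataset2. However, my dataset is large and the loop takes a very long time. I am looking for a faster way, ideally one that avoids the for loop.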
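The loop itself is not shown in the question. As a point of reference only, a row-by-row baseline of that kind might look like the sketch below. The function name fill_next_loop, the month/day/year timestamp format, and the UTC time zone are assumptions made for illustration, and the data is a small excerpt of the two datasets.

```r
# Hypothetical baseline: for each row of df1, scan df2 for the first later timestamp.
# Assumes df2 is already sorted by Time2 and both time columns are POSIXct.
fill_next_loop <- function(df1, df2) {
  new <- rep(NA_real_, nrow(df1))
  for (i in seq_len(nrow(df1))) {
    later <- which(df2$Time2 > df1$Time[i])            # all rows of df2 after this timestamp
    if (length(later) > 0) new[i] <- df2$Value2[later[1]]
  }
  new
}

fmt <- "%m/%d/%Y %H:%M"  # assumed layout: month/day/year
df1 <- data.frame(
  Time = as.POSIXct(c("6/1/2016 9:39", "6/1/2016 9:44", "6/1/2016 9:46"),
                    format = fmt, tz = "UTC")
)
df2 <- data.frame(
  Time2  = as.POSIXct(c("6/1/2016 9:42", "6/1/2016 9:45", "6/1/2016 9:45", "6/1/2016 10:00"),
                      format = fmt, tz = "UTC"),
  Value2 = c(123, 124, 125, 126)
)
df1$New <- fill_next_loop(df1, df2)
```

Each iteration compares one timestamp against every row of df2, so the cost grows with nrow(df1) * nrow(df2), which is why this gets slow on large inputs.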

Any help or suggestions would be greatly appreciated.

+0

We can use 'data.table', i.e. 'setDT(DF1)[df2, Value := Value2, on = .(Name, Time2 > Time1)]' – akrun

+0

Say I have another common column named 'Zone'. Would it then be 'on = .(Name, Zone, Time2 > Time1)'? – Zico

+0

Yes, you can do that – akrun
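For readers without data.table, the grouped case can also be sketched in base R. This is a different technique from the non-equi join in the comments above: it runs findInterval separately within each (Name, Zone) group. The Zone values and the small data frames below are made up for illustration, and the timestamp format is assumed to be month/day/year.

```r
fmt <- "%m/%d/%Y %H:%M"  # assumed layout: month/day/year
df1 <- data.frame(
  Name = "ABCD IS Equity",
  Zone = c("A", "B", "A"),                     # hypothetical Zone column
  Time = as.POSIXct(c("6/1/2016 9:39", "6/1/2016 9:44", "6/1/2016 9:46"),
                    format = fmt, tz = "UTC")
)
df2 <- data.frame(
  Name2  = "ABCD IS Equity",
  Zone   = c("A", "B", "A", "B"),              # hypothetical Zone column
  Time2  = as.POSIXct(c("6/1/2016 9:42", "6/1/2016 9:45", "6/1/2016 10:00", "6/1/2016 10:14"),
                      format = fmt, tz = "UTC"),
  Value2 = c(123, 124, 126, 127)
)

# Build one key per (Name, Zone) combination in each data frame.
key1 <- paste(df1$Name, df1$Zone, sep = "|")
key2 <- paste(df2$Name2, df2$Zone, sep = "|")

df1$New <- NA_real_
for (k in unique(key1)) {                      # loops over groups, not over rows
  rows1 <- which(key1 == k)
  rows2 <- which(key2 == k)
  rows2 <- rows2[order(df2$Time2[rows2])]      # findInterval needs sorted breakpoints
  idx <- findInterval(df1$Time[rows1], df2$Time2[rows2]) + 1
  df1$New[rows1] <- df2$Value2[rows2][idx]     # NA when no later Time2 exists in the group
}
```

The loop runs once per group rather than once per row, so it should stay fast as long as the number of distinct (Name, Zone) pairs is small relative to the row count.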

Answers

1

One possible option is findInterval from base R:

df1$New <- df2$Value2[findInterval(df1$Time, df2$Time2)+1]

Note: we assume 'Time' and 'Time2' are POSIXct.
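To make the indexing concrete, here is a small self-contained sketch on an excerpt of the data. The timestamp format (month/day/year) and the UTC time zone are assumptions, and df2 is deliberately cut short so that the last row of df1 has no later timestamp.

```r
fmt <- "%m/%d/%Y %H:%M"  # assumed layout: month/day/year
df1 <- data.frame(
  Time  = as.POSIXct(c("6/1/2016 10:52", "6/1/2016 10:56", "6/1/2016 11:14", "6/1/2016 12:04"),
                     format = fmt, tz = "UTC"),
  Value = c(11.02, 11.06, 11.02, 11.01)
)
df2 <- data.frame(
  Time2  = as.POSIXct(c("6/1/2016 10:56", "6/1/2016 10:58", "6/1/2016 11:26"),
                      format = fmt, tz = "UTC"),
  Value2 = c(132, 133, 134)
)

# findInterval() counts how many Time2 values are <= each Time;
# adding 1 therefore points at the first Time2 strictly greater than Time.
idx <- findInterval(df1$Time, df2$Time2) + 1

# An index past the end of df2 yields NA, i.e. "no later observation".
df1$New <- df2$Value2[idx]
```

Note that the 10:56 row picks up 133 rather than 132: a Time2 equal to Time is counted as "not later", which matches the strict Time2 > Time condition in the question.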

+1

The 'timestamp' needs to be sorted before applying 'findInterval'. Works fine. Indeed a gr8 solution. Thanks. – Zico
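The sorting step mentioned in this comment could be sketched as follows. findInterval requires its second argument to be sorted in non-decreasing order, while the values being looked up may come in any order. The data is a shuffled excerpt of Dataset 2, and the names times and new are illustrative only.

```r
fmt <- "%m/%d/%Y %H:%M"  # assumed layout: month/day/year
df2 <- data.frame(
  Time2  = as.POSIXct(c("6/1/2016 10:00", "6/1/2016 9:42", "6/1/2016 9:45"),
                      format = fmt, tz = "UTC"),
  Value2 = c(126, 123, 124)
)
# Lookup timestamps, intentionally not in chronological order.
times <- as.POSIXct(c("6/1/2016 9:46", "6/1/2016 9:39"), format = fmt, tz = "UTC")

# Sort df2 by Time2 so the breakpoints are non-decreasing.
df2 <- df2[order(df2$Time2), ]

# The lookup vector can stay unsorted; results come back in its original order.
new <- df2$Value2[findInterval(times, df2$Time2) + 1]
```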