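Using values in a data frame as array indices

I've looked through previous questions on StackOverflow but haven't found a solution that applies to the problem I'm facing.

Basically, I have a data frame, which we'll call df, that looks like this: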

source destination year ship count
     1        1415    1    6     0
     1        1415    2    6     0
     1        1415    3    6     0
     1        1415    4    6     0
     1        1415    5    6     0
     1        1415    6    6     0

Reproducible code, should you need it:

df <- structure(list(source = c(1L, 1L, 1L, 1L, 1L, 1L), destination = 
c(1415, 1415, 1415, 1415, 1415, 1415), year = 1:6, ship = c(6, 
6, 6, 6, 6, 6), count = c(0, 0, 0, 0, 0, 0)), .Names = c("source", 
"destination", "year", "ship", "count"), class = "data.frame", 
row.names = c(NA, 6L))

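I also have a four-dimensional array, which we'll call m1. Essentially, each of the first four columns of df corresponds to one of the four dimensions of m1, so they are basically indices. As you may have guessed by now, the fifth column of df corresponds to the value actually stored in m1.

So, for example, df$count[3] <- m1[1,1415,3,6].

Right now the whole count column is empty, and I want to fill it in. If this were a small task I would do it the slow, dumb way with a for loop, but the problem is that df has about 300,000,000 rows and m1 has dimensions of roughly 3900 x 3900 x 35 x 7. As a result, the following approach gets through only 5% of the rows after running for a whole day: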

for (line in 1:nrow(df)) {
    print(line / nrow(df))  # progress, as the fraction of rows processed
    df$count[line] <- m1[df$source[line], df$destination[line], df$year[line], df$ship[line]]
}

Any ideas on how to do this faster?

Source

2017-10-10 Anthony Sardain

Maybe you could use 'purrr::map()'? – Jeremy

I'm not familiar with the 'purrr' package, so I'll have to look into it and get back to you. –

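As I understand your question, you're just looking for matrix indexing.

Consider the following simplified example.

First, your array (with 4 dimensions).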

dim1 <- 2; dim2 <- 4; dim3 <- 2; dim4 <- 2 
x <- dim1 * dim2 * dim3 * dim4 

set.seed(1) 
M <- `dim<-`(sample(x), list(dim1, dim2, dim3, dim4)) 
M 
## , , 1, 1
##
##      [,1] [,2] [,3] [,4]
## [1,]    9   18    6   29
## [2,]   12   27   25   17
##
## , , 2, 1
##
##      [,1] [,2] [,3] [,4]
## [1,]   16    5   14   20
## [2,]    2    4    8   32
##
## , , 1, 2
##
##      [,1] [,2] [,3] [,4]
## [1,]   31   28   24    7
## [2,]   15   11    3   23
##
## , , 2, 2
##
##      [,1] [,2] [,3] [,4]
## [1,]   13    1   21   30
## [2,]   19   26   22   10
##

Second, your data.frame with the indices of interest.

mydf <- data.frame(source = c(1, 1, 2, 2),
                   destination = c(1, 1, 2, 3),
                   year = c(1, 2, 1, 2),
                   ship = c(1, 1, 2, 1),
                   count = 0)
mydf
##   source destination year ship count
## 1      1           1    1    1     0
## 2      1           1    2    1     0
## 3      2           2    1    2     0
## 4      2           3    2    1     0

Third, the extraction:

out <- M[as.matrix(mydf[1:4])] 
out 
# [1] 9 16 11 8
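The same one-liner should carry over to the question's df and m1. The sketch below is only an illustration: the real m1 is not shown in the question, so a small stand-in array (assumed dimensions 3 x 4 x 6 x 7, filled with its own linear positions) and a three-row stand-in df are used instead.

```r
# Stand-in for the question's 4-D array; the real m1 is roughly
# 3900 x 3900 x 35 x 7. Dimensions and contents here are assumptions.
m1 <- array(seq_len(3 * 4 * 6 * 7), dim = c(3, 4, 6, 7))

# Stand-in for the question's data frame, with the same column layout.
df <- data.frame(source      = c(1L, 1L, 2L),
                 destination = c(4, 4, 3),
                 year        = c(1L, 3L, 6L),
                 ship        = c(6, 6, 7),
                 count       = 0)

# Build a numeric matrix with one row per lookup and one column per
# array dimension, then index the array with it: each row picks one element.
idx <- as.matrix(df[c("source", "destination", "year", "ship")])
df$count <- m1[idx]

# Spot check against ordinary four-subscript indexing.
stopifnot(df$count[2] == m1[1, 4, 3, 6])
df
```

This relies on all four index columns being numeric. If one of them were a character or factor column, as.matrix() would return a character matrix, which R matches against dimnames rather than positions, so it is worth checking the column types first.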

Fourth, the comparison:

M[1, 1, 1, 1] 
# [1] 9 
M[1, 1, 2, 1] 
# [1] 16 
M[2, 2, 1, 2] 
# [1] 11 
M[2, 3, 2, 1] 
# [1] 8
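The spot checks above can also be automated. The sketch below uses the same index rows as mydf, but fills the array with 1:32 instead of a random sample (an assumption, made so the result does not depend on the random number generator). It compares matrix indexing with an element-by-element loop and with linear indices computed by hand, which is roughly what matrix indexing amounts to.

```r
# A 2 x 4 x 2 x 2 array filled with 1:32, so every element equals its own
# linear (column-major) position. This filling is an illustrative assumption.
M <- array(1:32, dim = c(2, 4, 2, 2))

# Same index rows as mydf[1:4] in the example.
idx <- cbind(source      = c(1, 1, 2, 2),
             destination = c(1, 1, 2, 3),
             year        = c(1, 2, 1, 2),
             ship        = c(1, 1, 2, 1))

# 1. Matrix indexing: one element per row of idx.
by_matrix <- M[idx]

# 2. Element-by-element lookup, as in the question's loop.
by_loop <- vapply(seq_len(nrow(idx)),
                  function(i) M[idx[i, 1], idx[i, 2], idx[i, 3], idx[i, 4]],
                  integer(1))

# 3. Linear index by hand: R stores arrays in column-major order,
#    so the strides are 1, d1, d1*d2, d1*d2*d3.
strides <- c(1, cumprod(dim(M))[-length(dim(M))])
by_linear <- M[as.vector((idx - 1) %*% strides) + 1]

# All three approaches should agree.
stopifnot(identical(by_matrix, by_loop), identical(by_matrix, by_linear))
by_matrix
```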

Source

2017-10-10 17:24:20 A5C1D2H2I1M1N2O1R2T1

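Oh man, is it really that simple? Hang on, let me check it on my data and I'll get back to you. –

Just checked - works perfectly, and it only took about a minute. –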
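With about 300,000,000 rows, as.matrix() on the four index columns creates a second full copy of them (around 9.6 GB if stored as doubles). That evidently fit in memory here, but where it does not, the same lookup could be done in blocks. The following is a sketch on small stand-in data; the helper name lookup_in_chunks and the chunk_size argument are hypothetical and not part of the original posts.

```r
# Hypothetical helper: look values up block by block so that only one
# block's index matrix exists at any time.
lookup_in_chunks <- function(arr, df, index_cols, chunk_size = 1e6) {
  out <- vector(typeof(arr), nrow(df))
  if (nrow(df) == 0) return(out)
  for (s in seq(1, nrow(df), by = chunk_size)) {
    rows <- s:min(s + chunk_size - 1, nrow(df))
    out[rows] <- arr[as.matrix(df[rows, index_cols])]
  }
  out
}

# Small stand-in data (assumed sizes; the real m1 and df are far larger).
m1 <- array(seq_len(3 * 4 * 6 * 7), dim = c(3, 4, 6, 7))
set.seed(42)
n <- 25
df <- data.frame(source      = sample(3, n, replace = TRUE),
                 destination = sample(4, n, replace = TRUE),
                 year        = sample(6, n, replace = TRUE),
                 ship        = sample(7, n, replace = TRUE),
                 count       = 0)
cols <- c("source", "destination", "year", "ship")

# A deliberately tiny chunk size so that several blocks are exercised.
df$count <- lookup_in_chunks(m1, df, cols, chunk_size = 10)

# The chunked result should match the single-shot lookup.
stopifnot(identical(df$count, m1[as.matrix(df[cols])]))
```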
