2015-09-04 99 views

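"Unlist" lists stored in a variable

I have a data frame with two odd variables. In one of them, each cell stores a list whose content is just a vector of two numbers. In the other, each cell stores a three-dimensional array of 8 numbers (even though only two dimensions are needed).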

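I want to simplify the dataset by breaking the odd variables up into separate variables. I figured out how to split all the data with a for loop, but it is very slow. I know that apply is generally supposed to be faster, but I can't work out how to translate this loop into an apply call. Is that possible, or is there a better way to do this?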

for (i in seq_len(nrow(df))) {
    # Point coordinates: a length-2 vector (lon, lat); skip empty cells
    if (length(df$coordinates.coordinates[[i]]) > 0) {
        df[i, "coordinates.lon"] <- df$coordinates.coordinates[[i]][1]
        df[i, "coordinates.lat"] <- df$coordinates.coordinates[[i]][2]
    }
    # Bounding box: a 1 x 4 x 2 array (corners a-d, then lon/lat); skip empty cells
    if (length(df$place.bounding_box.coordinates[[i]]) > 0) {
        df[i, "place.bounding_box.a.lon"] <- df$place.bounding_box.coordinates[[i]][1, 1, 1]
        df[i, "place.bounding_box.b.lon"] <- df$place.bounding_box.coordinates[[i]][1, 2, 1]
        df[i, "place.bounding_box.c.lon"] <- df$place.bounding_box.coordinates[[i]][1, 3, 1]
        df[i, "place.bounding_box.d.lon"] <- df$place.bounding_box.coordinates[[i]][1, 4, 1]
        df[i, "place.bounding_box.a.lat"] <- df$place.bounding_box.coordinates[[i]][1, 1, 2]
        df[i, "place.bounding_box.b.lat"] <- df$place.bounding_box.coordinates[[i]][1, 2, 2]
        df[i, "place.bounding_box.c.lat"] <- df$place.bounding_box.coordinates[[i]][1, 3, 2]
        df[i, "place.bounding_box.d.lat"] <- df$place.bounding_box.coordinates[[i]][1, 4, 2]
    }
}

EDIT: Here is an example data frame with a single case (via dput):

structure(list(coordinates.coordinates = list(c(112.088477, -7.227974 
)), place.bounding_box.coordinates = list(structure(c(112.044456, 
112.044456, 112.143242, 112.143242, -7.263067, -7.134563, -7.134563, 
-7.263067), .Dim = c(1L, 4L, 2L)))), .Names = c("coordinates.coordinates", 
"place.bounding_box.coordinates"), class = c("tbl_df", "data.frame" 
), row.names = c(NA, -1L)) 
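
To make the shape of the two columns concrete, here is a small base R sketch that rebuilds the single-row example as a plain data.frame (an assumption made only to avoid extra packages; the original carries the tbl_df class) and inspects the bounding box. The names box and corners are introduced here purely for illustration.

```r
# Rebuild the one-row example as a plain data.frame with two list columns
df <- structure(
  list(
    coordinates.coordinates = list(c(112.088477, -7.227974)),
    place.bounding_box.coordinates = list(
      array(c(112.044456, 112.044456, 112.143242, 112.143242,
              -7.263067, -7.134563, -7.134563, -7.263067),
            dim = c(1L, 4L, 2L))
    )
  ),
  class = "data.frame",
  row.names = c(NA, -1L)
)

# Each cell of the bounding-box column holds a 1 x 4 x 2 array
box <- df$place.bounding_box.coordinates[[1]]
dim(box)

# Indexing the first dimension drops it, leaving a 4 x 2 matrix:
# one row per corner, one column each for longitude and latitude
corners <- box[1, , ]
dimnames(corners) <- list(c("a", "b", "c", "d"), c("lon", "lat"))
corners
```

The leading dimension of length 1 is the one that is not really needed, which is why the loop above always indexes it with 1.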

In case it helps, this is the data format you get when you read Twitter stream data with jsonlite's stream_in function (using flatten = TRUE).


Could you provide some sample data? – dd3

Answers

library(dplyr)

# Toy data with the same shape as the question: a length-2 vector and
# a 1 x 4 x 2 array in each cell
df = data_frame(
    coordinates.coordinates =
        list(c(0, 1), c(2, 3)),
    place.bounding_box.coordinates =
        list(array(0, dim = c(1, 4, 2)),
             array(1, dim = c(1, 4, 2))))

# Work row by row: build a one-row data frame from the coordinate pair,
# then bind on the bounding-box array flattened into 8 named columns
df %>%
    rowwise %>%
    do(with(., data_frame(
        longitude = coordinates.coordinates[1],
        latitude = coordinates.coordinates[2]) %>% bind_cols(
            place.bounding_box.coordinates %>%
                as.data.frame %>%
                setNames(c(
                    "place.bounding_box.a.lon",
                    "place.bounding_box.b.lon",
                    "place.bounding_box.c.lon",
                    "place.bounding_box.d.lon",
                    "place.bounding_box.a.lat",
                    "place.bounding_box.b.lat",
                    "place.bounding_box.c.lat",
                    "place.bounding_box.d.lat")))))
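
As a possible alternative that stays in base R and is closer to the "apply" idea in the question, the sketch below flattens each list column with lapply and builds all new columns in one step instead of assigning cell by cell. This is not the answerer's method. The helper unpack_column is hypothetical, and the code assumes that empty cells are NULL or zero-length and that every non-empty cell has the expected number of values. It has not been benchmarked against the loop.

```r
# Hypothetical helper: turn a list column into a numeric matrix with one row
# per data frame row; cells without exactly length(col_names) values become NA
unpack_column <- function(column, col_names) {
  n <- length(col_names)
  rows <- lapply(column, function(item) {
    # as.numeric() drops the array dimensions and keeps column-major order
    if (length(item) == n) as.numeric(item) else rep(NA_real_, n)
  })
  matrix(as.numeric(unlist(rows)), ncol = n, byrow = TRUE,
         dimnames = list(NULL, col_names))
}

# Two-row example: the case from the question plus a row with empty cells
df <- structure(
  list(
    coordinates.coordinates = list(c(112.088477, -7.227974), NULL),
    place.bounding_box.coordinates = list(
      array(c(112.044456, 112.044456, 112.143242, 112.143242,
              -7.263067, -7.134563, -7.134563, -7.263067),
            dim = c(1L, 4L, 2L)),
      NULL
    )
  ),
  class = "data.frame",
  row.names = c(NA, -2L)
)

# Column names in the order the array is stored: a-d longitudes, then a-d latitudes
box_names <- paste0("place.bounding_box.",
                    rep(c("a", "b", "c", "d"), times = 2),
                    rep(c(".lon", ".lat"), each = 4))

result <- as.data.frame(cbind(
  unpack_column(df$coordinates.coordinates,
                c("coordinates.lon", "coordinates.lat")),
  unpack_column(df$place.bounding_box.coordinates, box_names)
))
result
```

Flattening with as.numeric relies on R storing arrays in column-major order, so the four longitudes of a 1 x 4 x 2 array come before the four latitudes, matching the indices used in the question's loop.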