2016-12-15 118 views
6

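Return a nested list from nesting levels and values

I want to visualize some deeply nested data with networkD3, but I can't figure out how to get the data into the right format before sending it to radialNetwork.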

Here is some sample data:

level <- c(1, 2, 3, 4, 4, 3, 4, 4, 1, 2, 3) 
value <- letters[1:11] 

Here level is the nesting depth of each node and value is the node's name. Using these two vectors, I need to get the data into the following format:

my_list <- list(
  name = "root",
  children = list(
    list(
      name = value[1], ## a
      children = list(list(
        name = value[2], ## b
        children = list(
          list(
            name = value[3], ## c
            children = list(
              list(name = value[4]), ## d
              list(name = value[5])  ## e
            )
          ),
          list(
            name = value[6], ## f
            children = list(
              list(name = value[7]), ## g
              list(name = value[8])  ## h
            )
          )
        )
      ))
    ),
    list(
      name = value[9], ## i
      children = list(list(
        name = value[10], ## j
        children = list(list(
          name = value[11] ## k
        ))
      ))
    )
  )
)

Here is the deparsed object:

> dput(my_list) 
# structure(list(name = "root", 
#    children = list(
#     structure(list(
#     name = "a", 
#     children = list(structure(
#      list(name = "b", 
#       children = list(
#        structure(list(
#        name = "c", children = list(
#         structure(list(name = "d"), .Names = "name"), 
#         structure(list(name = "e"), .Names = "name") 
#        ) 
#       ), .Names = c("name", 
#           "children")), structure(list(
#            name = "f", children = list(
#            structure(list(name = "g"), .Names = "name"), 
#            structure(list(name = "h"), .Names = "name") 
#           ) 
#           ), .Names = c("name", 
#               "children")) 
#       )), .Names = c("name", "children") 
#     )) 
#     ), .Names = c("name", 
#        "children")), structure(list(
#         name = "i", children = list(structure(
#         list(name = "j", children = list(structure(
#          list(name = "k"), .Names = "name" 
#         ))), .Names = c("name", 
#             "children") 
#         )) 
#        ), .Names = c("name", "children")) 
#    )), 
#   .Names = c("name", 
#      "children")) 

Then I can pass it to the final plotting function:

library(networkD3) 
radialNetwork(List = my_list) 

The output will look similar to this:

(image: the resulting radial network plot)


Question: How do I create the nested list?

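Note: As @zx8754 pointed out, this SO post already has a solution, but it requires a data.frame as input. Because my level values are inconsistent, I don't see a simple way to convert them into a data.frame.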

+0

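@zx8754 Added `dput(my_list)`. Also, the input data is not a `data.frame`, and putting it into one is not easy because the levels are inconsistent. That is why I tagged this `recursion` and thought that might be the direction. Please correct me if I'm wrong, though. – Boxuan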

+1

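We need a recursive function that takes a data frame and splits on the minimum value; sorry, no time to code it right now. For example: `df1 <- data.frame(level, value, stringsAsFactors = FALSE); split(df1, cumsum(df1$level == 1))`, then drop the minimum value, split on the next minimum, and so on. – zx8754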

+1

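I was thinking about that too, but I don't know how to attach each child to the correct parent. In other words, how do we keep the values under the second level-2 node from being attached to the first parent? – Boxuan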
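The split-on-the-minimum idea from the comments can be sketched roughly as below. This is only a sketch: `build_children` is a hypothetical helper name, and it assumes the input is well formed, meaning each run of rows starts at its shallowest level (as in the sample data). Because `cumsum(level == top)` starts a new group at every row on the shallowest level, each deeper row stays with the closest preceding row at that level, which is what keeps children with the right parent.

```r
# Hypothetical helper (not from any package): build the `children` list for a
# run of (level, value) pairs. Assumes the run starts at its shallowest level.
build_children <- function(level, value) {
  top <- min(level)
  # every row at the shallowest level opens a new group; deeper rows that
  # follow it fall into the same group
  groups <- cumsum(level == top)
  unname(lapply(split(seq_along(level), groups), function(idx) {
    node <- list(name = value[idx[1]])  # first row of the group is the node
    rest <- idx[-1]                     # remaining rows are its descendants
    if (length(rest) > 0) {
      node$children <- build_children(level[rest], value[rest])
    }
    node
  }))
}

level <- c(1, 2, 3, 4, 4, 3, 4, 4, 1, 2, 3)
value <- letters[1:11]

tree <- list(name = "root", children = build_children(level, value))
```

The resulting `tree` should have the same name/children shape as `my_list`, so it could be passed to `radialNetwork(List = tree)`.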

Answers

3

Using a data.table-style merge:

library(data.table) 
dt = data.table(idx=1:length(value), level, parent=value) 

dt = dt[dt[, .(i=idx, level=level-1, child=parent)], on=.(level, idx < i), mult='last'] 

dt[is.na(parent), parent:= 'root'][, c('idx','level'):= NULL] 

> dt 
#  parent child 
# 1: root  a 
# 2:  a  b 
# 3:  b  c 
# 4:  c  d 
# 5:  c  e 
# 6:  b  f 
# 7:  f  g 
# 8:  f  h 
# 9: root  i 
# 10:  i  j 
# 11:  j  k 

Now we can use the solution from the other post:

x = maketreelist(as.data.frame(dt)) 

> identical(x, my_list) 
# [1] TRUE 
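The source of `maketreelist` lives in the other post and is not reproduced here. For readers who want to see what such a step involves, here is a minimal sketch of turning a parent/child table into the nested name/children list. `edges_to_tree` is a hypothetical name, not the function from that post, and it assumes node names are unique and the edges contain no cycles.

```r
# Hypothetical stand-in for an edge-list-to-tree step; assumes unique node
# names and no cycles in the parent/child table.
edges_to_tree <- function(df, root = "root") {
  kids <- df$child[df$parent == root]   # direct children, in row order
  node <- list(name = root)
  if (length(kids) > 0) {
    node$children <- lapply(kids, function(k) edges_to_tree(df, k))
  }
  node
}

# the parent/child table printed above, rebuilt by hand
edges <- data.frame(
  parent = c("root", "a", "b", "c", "c", "b", "f", "f", "root", "i", "j"),
  child  = letters[1:11],
  stringsAsFactors = FALSE
)

tree <- edges_to_tree(edges)
```

Leaves get no `children` element at all, which matches the shape of `my_list` in the question.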
+0

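This is awesome. Thank you! I'm trying to understand the code, so a quick question for you: is your second line something like a cross join followed by a filter that keeps the last row for each level? – Boxuan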

+1

Np. The second line is a non-equi join, filtered to the last match. See https://channel9.msdn.com/Events/useR-international-R-User-conference/useR2016/Efficient-in-memory-non-equi-joins-using-datatable – sirallen
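To make the join condition concrete, here is a base R restatement of what the non-equi join with `mult='last'` appears to compute: for each row, look only at earlier rows (`idx < i`) whose level is exactly one less, and keep the last of them. This is an illustrative sketch of the logic, not the data.table implementation, and `edges` is just a name chosen for the example.

```r
level <- c(1, 2, 3, 4, 4, 3, 4, 4, 1, 2, 3)
value <- letters[1:11]

parent <- vapply(seq_along(level), function(i) {
  # earlier rows sitting exactly one level above row i
  earlier <- which(level[seq_len(i - 1)] == level[i] - 1)
  # no match means a top-level node; otherwise take the last (closest) match
  if (length(earlier) == 0) "root" else value[max(earlier)]
}, character(1))

edges <- data.frame(parent, child = value, stringsAsFactors = FALSE)
```

The join does the same matching in a single vectorised step instead of a per-row loop.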

1

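As a preamble, your data is hard to work with because key information is encoded in the order of the values in level. I don't know how you ended up with the values in that order, but consider that there may be a better way to structure this information, one that would make the next task easier.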

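Here is a base-y way to convert the data into a data frame with 2 columns, parent and child, and then pass that to data.tree functions, which can easily convert it to the JSON format you need... then pass that on to radialNetwork...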

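level <- c(1, 2, 3, 4, 4, 3, 4, 4, 1, 2, 3)
value <- letters[1:11]

library(data.tree)
library(networkD3)

# for each node, the parent is the closest preceding node with a smaller level
parent_idx <- sapply(seq_along(level), function(n) rev(which(level[1:n] < level[n]))[1])
df <- data.frame(parent = value[parent_idx], child = value, stringsAsFactors = FALSE)
# top-level nodes have no parent (NA); attach them to an unnamed root
df$parent[is.na(df$parent)] <- ""

# named tree_list rather than list, so base::list is not masked
tree_list <- ToListExplicit(FromDataFrameNetwork(df), unname = TRUE)
radialNetwork(tree_list)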

Here is a tidyverse way to achieve the same...

level <- c(1, 2, 3, 4, 4, 3, 4, 4, 1, 2, 3) 
value <- letters[1:11] 

library(tidyverse) 
library(data.tree) 
library(networkD3) 

data.frame(level, value, stringsAsFactors = F) %>% 
    mutate(row = row_number()) %>% 
    mutate(level2 = level, value2 = value) %>% 
    spread(level2, value2) %>% 
    mutate(`0` = "") %>% 
    arrange(row) %>% 
    fill(-level, -value, -row) %>% 
    gather(parent_level, parent, -level, -value, -row) %>% 
    filter(parent_level == level - 1) %>% 
    arrange(row) %>% 
    select(parent, child = value) %>% 
    data.tree::FromDataFrameNetwork() %>% 
    data.tree::ToListExplicit(unname = TRUE) %>% 
    radialNetwork() 

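And as a bonus, the current development version of networkD3 (v0.4.9000) has a new treeNetwork function that takes a data frame with nodeId and parentId columns/variables, which eliminates the need for the data.tree functions to convert to JSON, so something like this works...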

level <- c(1, 2, 3, 4, 4, 3, 4, 4, 1, 2, 3) 
value <- letters[1:11] 

library(tidyverse) 
library(networkD3) 

data.frame(level, value, stringsAsFactors = F) %>% 
    mutate(row = row_number()) %>% 
    mutate(level2 = level, value2 = value) %>% 
    spread(level2, value2) %>% 
    mutate(`0` = "root") %>% 
    arrange(row) %>% 
    fill(-level, -value, -row) %>% 
    gather(parent_level, parent, -level, -value, -row) %>% 
    filter(parent_level == level - 1) %>% 
    arrange(row) %>% 
    select(nodeId = value, parentId = parent) %>% 
    rbind(data.frame(nodeId = "root", parentId = NA)) %>% 
    mutate(name = nodeId) %>% 
    treeNetwork(direction = "radial")