2017-04-13

Hierarchical JSON in R using jsonlite?

My end game is to create a tree diagram with D3.js from a hierarchical JSON file.

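The hierarchy I need to represent is the diagram below, where A has children B, C, D; B has children E, F, G; C has children H, I; and D has no children. Nodes will have multiple key:value pairs. For simplicity, I list only 3.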

                                 |-- name:E
                                 |   type:dkBlue
                                 |   id:005
                                 |
            |-- name:B ----------|-- name:F
            |   type:blue        |   type:medBlue
            |   id:002           |   id:006
            |                    |
            |                    |-- name:G
            |                        type:ltBlue
name:A -----|                        id:007
type:colors |
id:001      |-- name:C ----------|-- name:H
            |   type:red         |   type:dkRed
            |   id:003           |   id:008
            |                    |
            |                    |-- name:I
            |                        type:medRed
            |                        id:009
            |
            |-- name:D
                type:green
                id:004

My source data looks like this:

nodes <-read.table(header = TRUE, text = " 
ID name type 
001 A colors 
002 B blue 
003 C red 
004 D green 
005 E dkBlue 
006 F medBlue 
007 G ltBlue 
008 H dkRed 
009 I medRed 
") 

links <- read.table(header = TRUE, text = " 
startID relation endID  
001  hasSubCat 002 
001  hasSubCat 003 
001  hasSubCat 004 
002  hasSubCat 005 
002  hasSubCat 006 
002  hasSubCat 007 
003  hasSubCat 008 
003  hasSubCat 009 
") 

I need to convert it into the following JSON:

{"name": "A",
 "type": "colors",
 "id": "001",
 "children": [
    {"name": "B",
     "type": "blue",
     "id": "002",
     "children": [
        {"name": "E",
         "type": "dkBlue",
         "id": "005"},
        {"name": "F",
         "type": "medBlue",
         "id": "006"},
        {"name": "G",
         "type": "ltBlue",
         "id": "007"}
     ]},
    {"name": "C",
     "type": "red",
     "id": "003",
     "children": [
        {"name": "H",
         "type": "dkRed",
         "id": "008"},
        {"name": "I",
         "type": "medRed",
         "id": "009"}
     ]},
    {"name": "D",
     "type": "green",
     "id": "004"}
]}

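Any help you can provide is appreciated!

[Update 2017-04-18]

Based on Ian's reference, I looked into R's data.tree. If I restructure my data, I can recreate my hierarchy as shown below. Note that I have lost the relation type (hasSubCat) between each pair of nodes, whose value in real life would differ for each link/edge. I am willing to let that go (for now) if I can get a workable hierarchy. The revised data for data.tree: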

df <-read.table(header = TRUE, text = " 
paths type  id 
A  colors 001 
A/B blue  002 
A/B/E dkBlue 005 
A/B/F medBlue 006 
A/B/G ltBlue 007 
A/C red  003 
A/C/H dkRed 008 
A/C/I medRed 009 
A/D green 004 
") 

library(data.tree)

myPaths <- as.Node(df, pathName = "paths")
myPaths$leafCount/(myPaths$totalCount - myPaths$leafCount) 
print(myPaths, "type", "id", limit = 25) 

The print shows the hierarchy I outlined in the original post, and even contains the key:value pairs for each node. Great!

  levelName    type id
1 A          colors  1
2  ¦--B        blue  2
3  ¦   ¦--E  dkBlue  5
4  ¦   ¦--F medBlue  6
5  ¦   °--G  ltBlue  7
6  ¦--C         red  3
7  ¦   ¦--H   dkRed  8
8  ¦   °--I  medRed  9
9  °--D       green  4

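Once again, I am at a loss as to how to convert this from a tree into nested JSON. The example here, https://ipub.com/data-tree-to-networkd3/, like most examples, uses key:value pairs only on leaf nodes and not on branch nodes. I think the answer is to create a nested list to feed to JSONIO or JSONLITE, but I don't know how to do that.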
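The nested list asked about here can be sketched in base R directly from the path-style data frame. This is only an illustration, not the approach any answer below takes: `build_node` is a hypothetical helper, the data is read with `colClasses = "character"` (an assumption, so that ids keep their leading zeros), and the jsonlite call at the end is optional.

```r
# Revised data from the question, read as character so ids keep leading zeros
df <- read.table(header = TRUE, colClasses = "character", text = "
paths type    id
A     colors  001
A/B   blue    002
A/B/E dkBlue  005
A/B/F medBlue 006
A/B/G ltBlue  007
A/C   red     003
A/C/H dkRed   008
A/C/I medRed  009
A/D   green   004
")

# Hypothetical helper: build the nested list for the node stored at `path`
build_node <- function(path, df) {
  row <- df[df$paths == path, ]
  node <- list(name = basename(path), type = row$type, id = row$id)
  # direct children are the rows whose parent path equals this path
  kids <- df$paths[dirname(df$paths) == path]
  if (length(kids) > 0) {
    # an unnamed list is serialized as a JSON array
    node$children <- lapply(kids, build_node, df = df)
  }
  node
}

# the root is the only path without a parent component
root_path <- df$paths[dirname(df$paths) == "."]
nested <- build_node(root_path, df)
str(nested, max.level = 2)

# optional: serialize if jsonlite is installed
if (requireNamespace("jsonlite", quietly = TRUE)) {
  print(jsonlite::toJSON(nested, pretty = TRUE, auto_unbox = TRUE))
}
```

Every node, branch or leaf, carries its own `name`, `type` and `id`, and only nodes with children get a `children` entry.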

+1

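You might want to take a look at this: http://stackoverflow.com/questions/12818864/how-to-write-to-json-with-children-from-r –

+0

Hi Ian, the example you cited gets me close, but I am struggling to adapt it to the point where I get the key:value pairs I need for every "node" in the tree. The recursive approach in that example only provides key:value pairs for the terminal nodes. – Tim

+0

Tim, your question is complex enough that I would need to hack at it for a while, and unfortunately I don't have the time right now. Someone who solves problems faster than I do may come along. If you are having trouble with the recursive approach, another option is to build a tree from the top down, which is easier to conceptualize. Here is the vignette for the data.tree package: https://cran.r-project.org/web/packages/data.tree/vignettes/data.tree.html. You can add each child and then add attributes to each child by name. You can then export these to JSON using the following: –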

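Answers

1

data.tree is very useful and is probably the better way to accomplish your goal. For fun, I will submit a more roundabout way to get nested JSON using igraph and d3r.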

nodes <-read.table(header = TRUE, text = " 
ID name type 
001 A colors 
002 B blue 
003 C red 
004 D green 
005 E dkBlue 
006 F medBlue 
007 G ltBlue 
008 H dkRed 
009 I medRed 
") 

links <- read.table(header = TRUE, text = " 
startID relation endID  
001  hasSubCat 002 
001  hasSubCat 003 
001  hasSubCat 004 
002  hasSubCat 005 
002  hasSubCat 006 
002  hasSubCat 007 
003  hasSubCat 008 
003  hasSubCat 009 
") 

library(d3r) 
library(dplyr) 
library(igraph) 

# make it an igraph 
gf <- graph_from_data_frame(links[,c(1,3,2)],vertices = nodes) 

# if we know that this is a tree with root as "A" 
# we can do something like this 
df_tree <- dplyr::bind_rows(
    lapply(
    all_shortest_paths(gf,from="A")$res, 
    function(x){data.frame(t(names(unclass(x))), stringsAsFactors=FALSE)} 
) 
) 

# we can discard the first column 
df_tree <- df_tree[,-1] 
# then make df_tree[1,1] as 1 (A) 
df_tree[1,1] <- "A" 

# now add node attributes to our data.frame 
df_tree <- df_tree %>% 
    # let's get the last non-NA in each row so we can join with nodes 
    mutate(
    last_non_na = apply(df_tree, MARGIN=1, function(x){tail(na.exclude(x),1)}) 
) %>% 
    # now join with nodes 
    left_join(
    nodes, 
    by = c("last_non_na" = "name") 
) %>% 
    # now remove last_non_na column 
    select(-last_non_na) 

# use d3r to nest as we would like 
nested <- df_tree %>% 
    d3_nest(value_cols = c("ID", "type")) 
+0

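This is very close. One small problem: d3_nest generates an unintended root node named "root", producing a tree that starts: root -> A -> ... If I specify the d3_nest argument root = "A", it only renames "root" to "A", producing: A -> A -> ... Is there a way to make A the root node? Everything looks fine in df_tree. – Tim

+0

Before '# now add node attributes', you can do 'df_tree <- df_tree[-1,]; df_tree <- df_tree[,-1]' and then use 'd3_nest(..., root = "A")', but you will lose the attributes of 'A'. – timelyportfolio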

+0

You could also do 'nested <- df_tree %>% d3_nest(value_cols = c("ID", "type"), json = FALSE)' and then 'd3_json(nested[1]$children[[1]], strip = TRUE)' – timelyportfolio

1

Consider walking the levels one at a time, converting the data frame rows into a multi-nested list:

library(jsonlite) 
... 
df2list <- function(i) as.vector(nodes[nodes$name == i,]) 

# GRANDPARENT LEVEL 
jsonlist <- as.list(nodes[nodes$name=='A',]) 
# PARENT LEVEL  
jsonlist$children <- lapply(c('B','C','D'), function(i) as.list(nodes[nodes$name == i,])) 
# CHILDREN LEVEL 
jsonlist$children[[1]]$children <- lapply(c('E','F','G'), df2list) 
jsonlist$children[[2]]$children <- lapply(c('H','I'), df2list) 

toJSON(jsonlist, pretty=TRUE) 

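However, with this approach you will notice that some children, being length-one elements, are enclosed in brackets. Because R cannot hold complex types inside a character vector, the entire object must be a list type, and lists are output in bracket form.

Therefore, consider a nested gsub to clean up the extra brackets while still rendering valid JSON: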

output <- toJSON(jsonlist, pretty=TRUE) 

gsub('"\\]\n', '"\n', gsub('"\\],\n', '",\n', gsub('": \\["', '": "', output))) 

Final output

{ 
    "ID": "001", 
    "name": "A", 
    "type": "colors", 
    "children": [ 
    { 
     "ID": "002", 
     "name": "B", 
     "type": "blue", 
     "children": [ 
     { 
      "ID": "005", 
      "name": "E", 
      "type": "dkBlue" 
     }, 
     { 
      "ID": "006", 
      "name": "F", 
      "type": "medBlue" 
     }, 
     { 
      "ID": "007", 
      "name": "G", 
      "type": "ltBlue" 
     } 
     ] 
    }, 
    { 
     "ID": "003", 
     "name": "C", 
     "type": "red", 
     "children": [ 
     { 
      "ID": "008", 
      "name": "H", 
      "type": "dkRed" 
     }, 
     { 
      "ID": "009", 
      "name": "I", 
      "type": "medRed" 
     } 
     ] 
    }, 
    { 
     "ID": "004", 
     "name": "D", 
     "type": "green" 
    } 
    ] 
} 
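As a possible alternative to the regex cleanup, jsonlite's `auto_unbox = TRUE` argument writes length-one atomic vectors as scalars. The sketch below uses small hand-made lists (illustrative values, not the `jsonlist` built above) to show the difference. A `children` list with a single child still serializes as an array, because it is a list rather than an atomic vector.

```r
library(jsonlite)

node <- list(ID = "004", name = "D", type = "green")

# default: every length-one vector becomes a one-element JSON array
boxed <- toJSON(node)

# auto_unbox = TRUE writes length-one atomic vectors as JSON scalars
unboxed <- toJSON(node, auto_unbox = TRUE)

# an unnamed list of children stays a JSON array, even with one child
parent <- toJSON(list(name = "C", children = list(list(name = "H"))),
                 auto_unbox = TRUE)

print(boxed)
print(unboxed)
print(parent)
```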
+0

Great solution, but with more complex hierarchies this becomes more difficult. – timelyportfolio

1

A nice way of doing this, if a bit difficult to wrap one's head around, is to use a self-referential (recursive) function like the following...

nodes <- read.table(header = TRUE, colClasses = "character", text = " 
ID name type 
001 A colors 
002 B blue 
003 C red 
004 D green 
005 E dkBlue 
006 F medBlue 
007 G ltBlue 
008 H dkRed 
009 I medRed 
") 

links <- read.table(header = TRUE, colClasses = "character", text = " 
startID relation endID  
001  hasSubCat 002 
001  hasSubCat 003 
001  hasSubCat 004 
002  hasSubCat 005 
002  hasSubCat 006 
002  hasSubCat 007 
003  hasSubCat 008 
003  hasSubCat 009 
") 

convert_hier <- function(linksDf, nodesDf, sourceId = "startID",
                         targetId = "endID", nodesID = "ID") {
  # recursively build the nested list for one node and its descendants
  makelist <- function(nodeid) {
    child_ids <- linksDf[[targetId]][which(linksDf[[sourceId]] == nodeid)]

    # leaf node: return only its own attributes
    if (length(child_ids) == 0)
      return(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]))

    # branch node: attributes plus a "children" list built by recursion
    c(as.list(nodesDf[nodesDf[[nodesID]] == nodeid, ]),
      children = list(lapply(child_ids, makelist)))
  }

  # the root is the id that never appears as a link target
  ids <- unique(c(linksDf[[sourceId]], linksDf[[targetId]]))
  rootid <- ids[! ids %in% linksDf[[targetId]]]
  jsonlite::toJSON(makelist(rootid), pretty = TRUE, auto_unbox = TRUE)
}

convert_hier(links, nodes) 

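A few notes...

  1. I added colClasses = "character" to the read.table commands so that the ID numbers are not coerced to integers without leading zeros, and so that the strings are not converted to factors.
  2. I wrapped everything in the convert_hier function to make it easier to adapt to other scenarios, but the real magic is in the makelist function.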
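The question mentions that the relation on each link would differ in real data. One way the same recursion might carry that value along is sketched below in base R. `make_node` is a hypothetical variant of makelist, the data is a trimmed subset of the tables above, and attaching `relation` to the child node is an assumption about where the value should live.

```r
nodes <- read.table(header = TRUE, colClasses = "character", text = "
ID  name type
001 A    colors
002 B    blue
003 C    red
004 D    green
005 E    dkBlue
")

links <- read.table(header = TRUE, colClasses = "character", text = "
startID relation  endID
001     hasSubCat 002
001     hasSubCat 003
001     hasSubCat 004
002     hasSubCat 005
")

# Hypothetical variant of makelist: `relation` is the label of the edge
# leading into this node (NULL for the root, which has no incoming edge)
make_node <- function(nodeid, relation = NULL) {
  node <- as.list(nodes[nodes$ID == nodeid, ])
  if (!is.null(relation)) node$relation <- relation
  out <- links[links$startID == nodeid, ]
  if (nrow(out) > 0) {
    # USE.NAMES = FALSE keeps the list unnamed, so it maps to a JSON array
    node$children <- mapply(make_node, out$endID, out$relation,
                            SIMPLIFY = FALSE, USE.NAMES = FALSE)
  }
  node
}

# the root is the only start id that is never an end id
root <- setdiff(links$startID, links$endID)
tree <- make_node(root)
str(tree, max.level = 2)
```

The resulting list can be passed to jsonlite::toJSON with auto_unbox = TRUE in the same way as in convert_hier.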