Problem spreading values with tidyjson

I want to convert the following multi-document JSON file to a data.frame.

library(tidyjson)
library(dplyr)

x = '[
    {"name": "Bob","groupIds": ["kwt6x61", "yiahf43"]},
    {"name": "Sally","groupIds": "yiahf43"}
]'

I am almost there using:

y = x %>% gather_array() %>%
    spread_values(
        name = jstring("name"),
        groupIds = jstring("groupIds")
    )
print(y)

which returns:

  document.id array.index  name                   groupIds
1           1           1   Bob list("kwt6x61", "yiahf43")
2           1           2 Sally                    yiahf43

Can someone help me spread groupIds into additional rows?

2016-03-10 Rob A

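This is an interesting question. The problem stems from the fact that an array of length 1 is stored as a plain string. Otherwise, enter_object('groupIds') %>% gather_array %>% append_values_string would work nicely. tidyjson does not seem to handle this situation well. The input is syntactically valid JSON, but its schema is inconsistent: groupIds is a string in one record and an array in the other.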
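To illustrate the point above, here is a minimal sketch of that pipeline on a variant of the input in which both groupIds values are arrays. The variable x_arrays and the column names id and groupId are chosen here for illustration only; the sketch assumes a tidyjson version in which gather_array() and append_values_string() take the new column name as their first argument after the data.

```r
library(tidyjson)
library(dplyr)

# Hypothetical variant of the input: Sally's single id is wrapped in an array
x_arrays <- '[
    {"name": "Bob", "groupIds": ["kwt6x61", "yiahf43"]},
    {"name": "Sally", "groupIds": ["yiahf43"]}
]'

result <- x_arrays %>%
    gather_array() %>%                        # one row per person
    spread_values(name = jstring("name")) %>%
    enter_object("groupIds") %>%              # descend into the groupIds array
    gather_array("id") %>%                    # one row per group id, indexed by id
    append_values_string("groupId")           # pull the string value into a column

print(result)
```

With consistent arrays, each group id ends up on its own row without any type checking.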

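In any case, although this is not an ideal solution, you can use json_types() to tell the two cases apart and then treat them conditionally. I then convert the result to a tbl_df (i.e. drop the JSON component) so that it is ready for further processing once the parsing is done.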

library(tidyjson) 
library(dplyr) 
library(tidyr) 

x = '[ 
    {"name": "Bob","groupIds": ["kwt6x61", "yiahf43"]}, 
    {"name": "Sally","groupIds": "yiahf43"} 
]' 

## Show the different types 
z <- x %>% gather_array() %>% spread_values(
    name=jstring('name') 
) %>% enter_object('groupIds') %>% json_types() 

## Conditionally treat each
final <- bind_rows(
    z[z$type=='array',] %>% gather_array('id') %>% append_values_string('groupId'),
    z[z$type=='string',] %>% append_values_string('groupId') %>% mutate(id=1)
) %>% tbl_df

## Spread them out, maybe? Depends on what you're looking for 
final %>% spread('id','groupId')
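
If tidyjson is not a hard requirement, the same long format can probably be built by hand. The sketch below swaps in a different package (jsonlite) and is not part of the approach above; records, rows and long are illustrative names. It relies on unlist() returning a character vector both for a list of strings and for a single string, so the two shapes of groupIds need no separate branches.

```r
library(jsonlite)

x <- '[
    {"name": "Bob","groupIds": ["kwt6x61", "yiahf43"]},
    {"name": "Sally","groupIds": "yiahf43"}
]'

# Parse into plain nested lists, without simplifying to vectors or data frames
records <- fromJSON(x, simplifyVector = FALSE)

# Build one small data.frame per record; name is recycled across its group ids
rows <- lapply(records, function(rec) {
    data.frame(
        name = rec[["name"]],
        groupId = unlist(rec[["groupIds"]]),
        stringsAsFactors = FALSE
    )
})

# Stack the per-record frames into one long data.frame
long <- do.call(rbind, rows)
print(long)
```

This gives one row per (name, groupId) pair, which is the "additional rows" layout the question asks for.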

2017-04-27 02:51:05 cole
