2016-11-12

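Converting a nested JSON object to a data frame in R

I am fetching data from the Twitter API, converting it from a JSON object to a data frame, and loading it into a data warehouse. Please find the input and code snippet below.

I am very new to R programming.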

stats_campaign.data <- content(stats_campaign.request) 
print(stats_campaign.data) 

Output:

`{ 
"data_type": [ "stats" ], 
"time_series_length": [ 1 ], 
"data": [ 
{ 
    "id": [ "XXXXX" ], 
    "id_data": [ 
    { 
     "segment": {}, 
     "metrics": { 
     "impressions": {}, 
     "tweets_send": {}, 
     "qualified_impressions": {}, 
     "follows": {}, 
     "app_clicks": {}, 
     "retweets": {}, 
     "likes": {}, 
     "engagements": {}, 
     "clicks": {}, 
     "card_engagements": {}, 
     "replies": {}, 
     "url_clicks": {}, 
     "carousel_swipes": {} 
     } 
    } 
    ] 
    }, 

    {  
    "id": [ "XXXX1" ], 
    "id_data": [ 
    { 
     "segment": {}, 
     "metrics": { 
     "impressions": {}, 
     "tweets_send": {}, 
     "qualified_impressions": {}, 
     "follows": {}, 
     "app_clicks": {}, 
     "retweets": {}, 
     "likes": {}, 
     "engagements": {}, 
     "clicks": {}, 
     "card_engagements": {}, 
     "replies": {}, 
     "url_clicks": {}, 
     "carousel_swipes": {} 
     } 
    } 
    ] 
    },` 

When I read this JSON file,

stats_json_file <- sprintf("P:/R Repos/R Applications/TwitterAPIData/stats_test_data-%s.json", TODAY) 
    jsonlite::fromJSON(stats_json_file) 

    **Result:** 
     id          id_data 
    1 5wcaz           NULL 
    2 5ub2u           NULL 
    3 5wb8x           NULL 
    4 5wb1j           NULL 
    5 5yqwj           NULL 
    6 5pq5i           NULL 
    7 5u197           NULL 
    8 5z2js           NULL 
    9 6fqh0 333250, 4, 9, 19, 111, 3189, 3156, 5, 1091 
    10 5tvr1           NULL 
    11 5yqw4           NULL 
    12 5qqps           NULL 
    13 5yqvw           NULL 
    14 5ygom           NULL 
    15 5nc88           NULL 
    16 5yg94           NULL 
    17 65t9e           NULL 
    18 5peck           NULL 
    19 63pg1 247283, 17, 22, 35, 297, 5514, 5450, 6, 2971 
    20 6cdvy  156705, 1, 2, 6, 112, 10933, 605, 170 

    From my JSON file, I want the id and the whole "metrics": { 
     "impressions": {}, 
     "tweets_send": {}, 
     "qualified_impressions": {}, 
     "follows": {}, 
     "app_clicks": {}, 
     "retweets": {}, 
     "likes": {}, 
     "engagements": {}, 
     "clicks": {}, 
     "card_engagements": {}, 
     "replies": {}, 
     "url_clicks": {}, 
     "carousel_swipes": {} 
     } 
     and convert them to a data frame to load into a database. Please help! 

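How can I parse this JSON object? I want to retrieve the id and the whole metrics object, then convert them into a data frame to load into a SQL table.

To read the ids and metric values for multiple entries, I used the following code: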

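`test <- list() 
len <- length(stats_campaign.data$data)  # number of campaign entries 
for (i in seq_len(len)) { 
  # note: this overwrites test on every iteration instead of appending 
  test <- unlist(stats_campaign.data$data[[i]]) 
  print(test) 
}` 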

**Output:** 
     id 
    "5wcaz" 
     id 
    "5ub2u" 
     id 
    "5wb8x" 
     id 
"5wb1j" 
     id 
"5yqwj" 
     id 
    "5pq5i" 
     id 
    "5u197" 
     id 
    "5z2js" 
     id 
    "5tvr1" 
     id 
    "5yqw4" 
     id 
    "5qqps" 
     id 
    "5yqvw" 
     id 
    "5ygom" 
     id 
    "5nc88" 
     id 
    "5yg94" 
     id 
    "65t9e" 
     id 
    "5peck" 
        id id_data.metrics.impressions 
        "63pg1"     "133227" 
         id_data.metrics.tweets_send  id_data.metrics.follows 
        "10"       "9" 
         id_data.metrics.retweets  id_data.metrics.likes 
        "17"      "96" 
        id_data.metrics.engagements  id_data.metrics.clicks 
       "2165"      "2134" 
        id_data.metrics.replies id_data.metrics.url_clicks 
        "5"      "1204" 
        id id_data.metrics.impressions 
       "6cdvy"     "176164" 
    id_data.metrics.tweets_send id_data.metrics.retweets 
        "2"      "10" 
    id_data.metrics.likes id_data.metrics.engagements 
        "121"      "9708" 
    id_data.metrics.clicks id_data.metrics.url_clicks 
        "620"      "160" 

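How can I append the values to a list (or something similar) on each iteration? Am I using the right approach? Is there another way to parse a nested JSON object and put it directly into a data frame?

Please help! Thanks in advance!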

If your JSON is syntactically valid, then in R you can run `jsonlite::fromJSON(your_text)`. However, your brackets seem to have some problems. – Gregor

This is my JSON format. –

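OK, your JSON is valid now. You can run `jsonlite::fromJSON(your_text)` on it and get a useful result. What do you want? Instead of showing the output you *don't* want, can you show the output you do want? – Gregor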
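
The validity check discussed in the comments can be sketched roughly as follows. This is only an illustration: the `broken` and `fixed` strings are made-up stand-ins for the posted JSON, and it assumes the jsonlite package is installed. `jsonlite::validate()` reports whether a string is well-formed JSON without throwing an error, so it can guard the call to `fromJSON()`.

```r
library(jsonlite)

# Truncated text in the spirit of the original post: the closing "]}" is missing.
broken <- '{"data": [{"id": ["XXXXX"]}, {"id": ["XXXX1"]},'
# The same text with the brackets closed.
fixed <- '{"data": [{"id": ["XXXXX"]}, {"id": ["XXXX1"]}]}'

# validate() returns TRUE or FALSE; it does not throw on malformed input.
is_broken_valid <- validate(broken)
is_fixed_valid <- validate(fixed)

# Parse only when the text is valid JSON.
if (is_fixed_valid) {
  parsed <- fromJSON(fixed)
  print(parsed$data)
}
```

When the check fails, the returned value carries an `err` attribute describing where parsing stopped, which can help locate a missing bracket.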

Answer


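As mentioned in the comments, more information about the output you are looking for would help. In any case, I hope the following provides some useful guidance. The tidyjson README gives a helpful overview.

Unfortunately, because the JSON objects in your sample are empty, it is hard to tell what the data might contain (what would appear inside the empty objects), and I had trouble working out which part of the Twitter API you are looking at. tidyjson can produce consistent data.frame output even when data is missing! The key verbs are `gather` and `spread`, much like tidyr, but with a JSON flavor.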

str <- "{\"data_type\":[\"stats\"],\"time_series_length\":[1],\"data\":[{\"id\":[\"XXXXX\"],\"id_data\":[{\"segment\":{},\"metrics\":{\"impressions\":{},\"tweets_send\":{},\"qualified_impressions\":{},\"follows\":{},\"app_clicks\":{},\"retweets\":{},\"likes\":{},\"engagements\":{},\"clicks\":{},\"card_engagements\":{},\"replies\":{},\"url_clicks\":{},\"carousel_swipes\":{}}}]},{\"id\":[\"XXXX1\"],\"id_data\":[{\"segment\":{},\"metrics\":{\"impressions\":{},\"tweets_send\":{},\"qualified_impressions\":{},\"follows\":{},\"app_clicks\":{},\"retweets\":{},\"likes\":{},\"engagements\":{},\"clicks\":{},\"card_engagements\":{},\"replies\":{},\"url_clicks\":{},\"carousel_swipes\":{}}}]}]} " 

library(dplyr) 
library(tidyjson) 

prep <- as.tbl_json(str) %>% enter_object("data") %>% gather_array("objid") 

p1 <- prep %>% enter_object("id") %>% 
    gather_array("idnum") %>% append_values_string("id") 

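p2 <- prep %>% enter_object("id_data") %>% gather_array("datanum") %>% 
    enter_object("metrics") %>% 
    spread_values(
        # "value" and "somekey" are placeholder keys: the sample objects are empty,
        # so the real key names inside each metric are unknown
        impressions = jstring("impressions", "value"),
        tweets_send = jnumber("tweets_send", "somekey")
    )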

p1 %>% tbl_df() %>% left_join(p2 %>% tbl_df(), by = c("document.id", "objid")) 
#> # A tibble: 2 x 7 
#> document.id objid idnum id datanum impressions tweets_send 
#>   <int> <int> <int> <chr> <int>  <chr>  <dbl> 
#> 1   1  1  1 XXXXX  1  <NA>   NA 
#> 2   1  2  1 XXXX1  1  <NA>   NA 
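
As for the question about appending values on each iteration, one possible alternative is to build one row per campaign with `lapply()` and stack the rows at the end. The sketch below uses only jsonlite and base R. The sample text, the metric values, and the `metric_names` subset are all made up for illustration; it assumes each metric arrives as a one-element array and that campaigns without data have an empty `metrics` object, which may not match the real API response.

```r
library(jsonlite)

# Small made-up sample in the same shape as the question's JSON.
# Assumption: metric values are arrays of length 1, and some campaigns
# have no metrics at all.
txt <- '{
  "data": [
    {"id": ["aaa"], "id_data": [{"segment": {}, "metrics": {"impressions": [100], "likes": [7]}}]},
    {"id": ["bbb"], "id_data": [{"segment": {}, "metrics": {}}]},
    {"id": ["ccc"], "id_data": [{"segment": {}, "metrics": {"impressions": [250]}}]}
  ]
}'

# Keep the nested list structure instead of letting jsonlite simplify it.
parsed <- fromJSON(txt, simplifyVector = FALSE)

# Metric columns wanted in the final table (hypothetical subset).
metric_names <- c("impressions", "likes", "retweets")

# Build one single-row data frame per campaign, filling absent metrics with NA.
rows <- lapply(parsed$data, function(campaign) {
  metrics <- campaign$id_data[[1]]$metrics
  values <- lapply(metric_names, function(name) {
    v <- unlist(metrics[[name]])
    if (length(v) == 0) NA_real_ else as.numeric(v[1])
  })
  names(values) <- metric_names
  data.frame(id = unlist(campaign$id), values, stringsAsFactors = FALSE)
})

# Stack the per-campaign rows into one data frame.
stats_df <- do.call(rbind, rows)
print(stats_df)
```

Fixing the set of columns up front keeps every row the same width, so `rbind` works even when a campaign reports no metrics, and the resulting data frame has a stable schema for loading into a SQL table.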