Extracting data from nested JSON objects in a specific format in Python

I have a dataset containing multiple nested JSON objects like the one shown below:

{ 
"coordinates": null, 
"acoustic_features": { 
    "instrumentalness": "0.00479", 
    "liveness": "0.18", 
    "speechiness": "0.0294", 
    "danceability": "0.634", 
    "valence": "0.342", 
    "loudness": "-8.345", 
    "tempo": "125.044", 
    "acousticness": "0.00035", 
    "energy": "0.697", 
    "mode": "1", 
    "key": "6" 
}, 
"artist_id": "b2980c722a1ace7a30303718ce5491d8", 
"place": null, 
"geo": null, 
"tweet_lang": "en", 
"source": "Share.Radionomy.com", 
"track_title": "8eeZ", 
"track_id": "cd52b3e5b51da29e5893dba82a418a4b", 
"artist_name": "Dominion", 
"entities": { 
    "hashtags": [{ 
     "text": "nowplaying", 
     "indices": [0, 11] 
    }, { 
     "text": "goth", 
     "indices": [51, 56] 
    }, { 
     "text": "deathrock", 
     "indices": [57, 67] 
    }, { 
     "text": "postpunk", 
     "indices": [68, 77] 
    }], 
    "symbols": [], 
    "user_mentions": [], 
    "urls": [{ 
     "indices": [28, 50], 
     "expanded_url": "cathedral13.com/blog13", 
     "display_url": "cathedral13.com/blog13", 
     "url": "t.co/Tatf4hEVkv" 
    }] 
}, 
"created_at": "2014-01-01 05:54:21", 
"text": "#nowplaying Dominion - 8eeZ Tatf4hEVkv #goth #deathrock #postpunk", 
"user": { 
    "location": "middle of nowhere", 
    "lang": "en", 
    "time_zone": "Central Time (US & Canada)", 
    "name": "Cathedral 13", 
    "entities": null, 
    "id": 81496937, 
    "description": "I\u2019m a music junkie who is currently responsible for Cathedral 13 internet radio (goth, deathrock, post-punk)which has been online since 06/20/02." 
}, 
"id": 418243774842929150 
}

I want an output file in the following format:

user_id1 - track_id - hashtag1 
user_id1 - track_id - hashtag2 
user_id1 - track_id - hashtag3 
user_id2 - track_id - hashtag1 
user_id2 - track_id - hashtag2 
....

That is, the output for this example should be:

81496937 cd52b3e5b51da29e5893dba82a418a4b nowplaying 
81496937 cd52b3e5b51da29e5893dba82a418a4b goth 
81496937 cd52b3e5b51da29e5893dba82a418a4b deathrock 
81496937 cd52b3e5b51da29e5893dba82a418a4b postpunk

I wrote the following code to do this:

import json 
import csv 
with open('final_dataset_json.json') as data_file: 
     data = json.load(data_file) 

uth = open('uth.csv','wb') 

cvwriter = csv.writer(uth) 

for entry in data: 
    text_list = [hashtag['text'] for hashtag in entry['entities']['hashtags']] 
    for line in text_list: 
     csvwriter.writerow([entry['user']['id'],entry['track_id'],line.strip()+'\n') 

uth.close()

How can I achieve the given output?

2017-07-17 Asmita Poddar

You haven't said what problem(s) you are having with your code.

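With a csv writer, each call to writerow writes one new row, so all the column values for that row must be passed together in a single list. The writer adds the line ending itself, so there is no need to append '\n'.

I expect it would be enough to replace the writerow line with this one: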

csvwriter.writerow([entry['user']['id'],entry['track_id'],line.strip()])
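
For reference, here is a minimal sketch of the whole loop with that line in place. It assumes Python 3 and that the data is a list of objects shaped like the one in the question. The function name `write_hashtag_rows`, the cut-down sample, and the in-memory buffer are illustrative choices, not part of the original code. Note that the writer is created and used under one and the same name, `csvwriter`.

```python
import csv
import io
import json


def write_hashtag_rows(entries, out_file):
    """Write one (user id, track id, hashtag) row per hashtag."""
    csvwriter = csv.writer(out_file)
    for entry in entries:
        for hashtag in entry['entities']['hashtags']:
            # Each row is a single list holding every column value.
            csvwriter.writerow(
                [entry['user']['id'], entry['track_id'], hashtag['text'].strip()]
            )


# Cut-down stand-in for the data set (only the fields used here).
sample = json.loads('''[{
    "track_id": "cd52b3e5b51da29e5893dba82a418a4b",
    "entities": {"hashtags": [{"text": "nowplaying"}, {"text": "goth"}]},
    "user": {"id": 81496937}
}]''')

buffer = io.StringIO()
write_hashtag_rows(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file in Python 3, opening it with `open('uth.csv', 'w', newline='')` instead of `'wb'` is the usual approach, since the csv module expects a text-mode file there.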

2017-07-17 10:38:37

I get the following error and I don't understand why. The indentation is fine: csvwriter.writerow([entry['user']['id'], entry['track_id'], line.strip()]) NameError: name 'csvwriter' is not defined

Did you import it in your code? – BoboDarph

@AsmitaPoddar In your code the writer is named cvwriter, so the call should be cvwriter.writerow([entry['user']['id'], entry['track_id'], line.strip()]), where cvwriter determines which file the data is written to.

Simple dictionary lookups (Python has a module for JSON):

import json 

# json_str is assumed to hold the JSON text of one object from the question
d = json.loads(json_str) 
for ht in d['entities']['hashtags']: 
    print('{} - {} - {}'.format(d['user']['id'], d['artist_id'], ht['text']))

Yields:

81496937 - b2980c722a1ace7a30303718ce5491d8 - nowplaying 
81496937 - b2980c722a1ace7a30303718ce5491d8 - goth 
81496937 - b2980c722a1ace7a30303718ce5491d8 - deathrock 
81496937 - b2980c722a1ace7a30303718ce5491d8 - postpunk

2017-07-17 10:45:42 Vinny

I want to store it in a CSV file. I have multiple JSON objects, and I want to do this for all of them.
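
One possible way to extend the dictionary lookups to many objects and a CSV file is sketched below, assuming Python 3 and input that parses to a list of objects. The helper name `iter_rows`, the tiny sample data, and the choice to skip entries whose "entities" is null are assumptions made for illustration. It uses track_id, as requested in the question, and a space delimiter to match the desired output format.

```python
import csv
import io
import json

# Hypothetical two-object sample; the second has no entities at all.
json_str = '''[
  {"user": {"id": 1}, "track_id": "t1",
   "entities": {"hashtags": [{"text": "goth"}, {"text": "postpunk"}]}},
  {"user": {"id": 2}, "track_id": "t2", "entities": null}
]'''


def iter_rows(entries):
    """Yield [user id, track id, hashtag text] for every hashtag found."""
    for d in entries:
        # Assumption: a missing or null "entities" means no hashtags.
        hashtags = (d.get('entities') or {}).get('hashtags') or []
        for ht in hashtags:
            yield [d['user']['id'], d['track_id'], ht['text']]


out = io.StringIO()
# delimiter=' ' separates the columns with spaces instead of commas.
csv.writer(out, delimiter=' ').writerows(iter_rows(json.loads(json_str)))
print(out.getvalue())
```

Replacing the in-memory buffer with a file opened via `open('uth.csv', 'w', newline='')` would write the same rows to disk.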
