How to convert JSON (Twitter data) to CSV using Python

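I'm trying to query the Twitter search engine (search.twitter.com), convert the results to JSON, and then prepare the results as CSV for a research project. I'm a Python novice, but I've managed to write about two thirds of the program on my own. However, I'm having a hard time converting my JSON file to CSV format. I've tried various suggested techniques without success. What am I doing wrong here?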

Here is what I have so far:

import twitter, os, json, csv 

qname = raw_input("Please enter the term(s) you wish to search for: ") 
date = int(raw_input("Please enter today's date (no dashes or spaces): ")) 
nname = raw_input("Please enter a nickname for this query (no spaces): ") 
q1 = raw_input("Would you like to set a custom directory? Enter Yes or No: ") 

if q1 == 'No' or 'no' or 'n' or 'N': 
    dirname = 'C:\Users\isaac\Desktop\TPOP' 

elif q1 == 'Yes' or 'yes' or 'y' or 'Y': 
    dirname = raw_input("Please enter the directory path:") 

ready = raw_input("Are you ready to begin? Enter Yes or No: ") 
while ready == 'Yes' or 'yes' or 'y' or 'Y': 
    twitter_search = twitter.Twitter(domain = "search.Twitter.com") 
search_results = [] 
for page in range (1,10): 
    search_results.append(twitter_search.search(q=qname, rpp=1, page=page)) 
    ready1 = raw_input("Done! Are you ready to continue? Enter Yes or No: ") 
    if ready1 == 'Yes' or 'yes' or 'y' or 'Y': 
     break 

ready3 = raw_input("Do you want to save output as a file? Enter Yes or No: ") 
while ready3 == 'Yes' or 'yes' or 'y' or 'Y': 
    os.chdir(dirname) 
    filename = 'results.%s.%06d.json' %(nname,date) 
    t = open (filename, 'wb+') 
    s = json.dumps(search_results, sort_keys=True, indent=2) 
    print >> t,s 
    t.close() 
    ready4 = raw_input("Done! Are you ready to continue? Enter Yes or No: ") 
    if ready4 == 'Yes' or 'yes' or 'y' or 'Y': 
     break 

ready5 = raw_input("Do you want to save output as a csv/excel file? Enter Yes or No: ") 
while ready5 == 'Yes' or 'yes' or 'y' or 'Y': 
    filename2 = 'results.%s.%06d.csv' %(nname,date) 
    z = json.dumps(search_results, sort_keys=True, indent=2) 
    x=json.loads(z) 

    json_string = z 
    json_array = x 

    columns = set() 
    for entity in json_array: 
     if entity == "created_at" or "from_user" or "from_user_id" or "from_user_name" or "geo" or "id" or "id_str" or "iso_language_code" or "text": 
      columns.update(set(entity)) 

    writer = csv.writer(open(filename2, 'wb+')) 
    writer.writerow(list(columns)) 
    for entity in json_array: 
     row = [] 
     for c in columns: 
      if c in entity: row.append(str(entity[c])) 
      else: row.append('')

Source

2012-02-22 wsisaac

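What problem are you seeing? – 2012-02-22 03:38:37

How is "convert the results to JSON, then prepare the results as CSV" supposed to work? – 2012-02-22 03:39:01

What do you want the output to look like? "key1: value1, key2: value2, ..." or "key1, key2, key3 ... \n value1, value2, value3, ..." (i.e. column headers, then values, separated by a newline) – platinummonkey 2012-02-22 03:41:31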

After some searching around, I found the answer here: http://michelleminkoff.com/2011/02/01/making-the-structured-usable-transform-json-into-a-csv/

The code should look something like this (if you are searching with the Twitter Python API):

filename2 = '/path/to/my/file.csv'
writer = csv.writer(open(filename2, 'w'))
z = json.dumps(search_results, sort_keys=True, indent=2)
parsed_json = json.loads(z)
# parsed_json holds one entry per page pulled; visit each page in turn.
n = 0
while n < len(parsed_json):
    for tweet in parsed_json[n]['results']:
        # Encode each field as UTF-8 so the Python 2 csv module can write it.
        row = []
        row.append(str(tweet['from_user'].encode('utf-8')))
        row.append(str(tweet['created_at'].encode('utf-8')))
        row.append(str(tweet['text'].encode('utf-8')))
        writer.writerow(row)
    n = n + 1

Thanks, everyone, for the help!

Source

2012-08-19 12:27:51 wsisaac

You should accept your own answer to mark the question as answered. – j0k 2012-08-20 09:48:06

Sorry! I didn't know how to do that. Making the change now. – wsisaac 2012-08-21 13:50:24

You have a few different problems going on.

First, the

x == 'a' or 'b' or 'c'

syntax probably doesn't do what you think it does. You should use

x in ('a', 'b', 'c')

instead.
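To see why the first form misbehaves: Python parses `x == 'a' or 'b' or 'c'` as `(x == 'a') or 'b' or 'c'`, and a non-empty string always counts as true. A minimal sketch follows; the function names are made up for illustration.

```python
def broken_check(x):
    # Parsed as (x == 'a') or 'b' or 'c'. When x != 'a' the expression
    # evaluates to 'b', a non-empty string, which is truthy.
    return x == 'a' or 'b' or 'c'


def fixed_check(x):
    # Membership test: True only when x is one of the listed values.
    return x in ('a', 'b', 'c')


print(bool(broken_check('zzz')))  # True, even though 'zzz' is not listed
print(fixed_check('zzz'))         # False
print(fixed_check('b'))           # True
```

This is why every `if` and `while` test in the question's code always passes, regardless of what the user types.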

Second, your ready5 variable never changes inside the loop, so the loop won't work properly. Try

while True: 
    ready5 = raw_input("Do you want to save output as a csv/excel file? Enter Yes or No: ") 
    if ready5 not in (...): 
     break

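Finally, there is something wrong with your dump/load code. What you get from Twitter should be a JSON string. You left some code out of the question, so I can't be sure, but I don't think you want to use json.dumps at all. You are reading from JSON (using json.loads) and writing to CSV (using csv.writer.writerow).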
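As a rough sketch of that read-JSON, write-CSV flow in Python 3: the sample payload below is made up, and the `results` key and the field names are taken from the code in this thread and assumed to match what the search API returned. The function name `tweets_to_csv` is hypothetical.

```python
import csv
import io
import json

FIELDS = ['from_user', 'created_at', 'text']


def tweets_to_csv(json_string, out_file, fields=FIELDS):
    """Parse one page of search results and write one CSV row per tweet."""
    page = json.loads(json_string)
    writer = csv.writer(out_file)
    writer.writerow(fields)  # header row
    for tweet in page['results']:
        # Missing fields become empty cells instead of raising KeyError.
        writer.writerow([tweet.get(name, '') for name in fields])


# Made-up sample data standing in for a real API response.
sample = json.dumps({'results': [
    {'from_user': 'alice', 'created_at': '2012-02-22', 'text': 'hello, world'},
    {'from_user': 'bob', 'text': 'no date here'},
]})

buffer = io.StringIO()
tweets_to_csv(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file in Python 3 rather than an in-memory buffer, open it with `newline=''`, as the csv module documentation recommends.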

Source

2012-02-22 03:52:58

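Thanks, everyone, for the input! I'll try making these changes to the code. I actually put the rest of the code up top. Most of the examples I've seen online suggest some variation of the read-JSON/write-CSV combination. I'd like a CSV document containing all the basic information from the tweet search (i.e. user ID, geo, ISO code, text, etc.). If I just do a generic dump, the formatting seems to get all messed up. – wsisaac 2012-02-22 12:25:32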

A different approach would be to have tablib do the actual conversion for you:

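import json
import tablib
data = tablib.Dataset()
# tablib's json property expects a JSON string, so serialize the results first.
data.json = json.dumps(search_results)
filename = 'results.%s.%06d.csv' %(nname,date)
csv_file = open(filename, 'wb')
csv_file.write(data.csv)
csv_file.close()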

Source

2012-02-22 15:10:23

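Does this handle nested data? – 2013-04-28 16:46:06

It looks like it doesn't; it silently writes garbage (I filed a bug: https://github.com/kennethreitz/tablib/issues/100). But you can adapt it to handle three dimensions by iterating over the first dimension and writing multiple "Databooks". – 2013-04-28 23:01:25

There's a better solution (I don't remember the reference) that makes use of some recursion. Here is the link to my updated post: http://theoryno3.blogspot.com/2013/04/how-to-convert-json-to-csv-in-python.html – 2013-04-29 20:24:08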
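The linked post is not reproduced here, but the general idea of a recursive approach can be sketched as follows: walk nested dicts and lists and join the keys into a single column name. This is an illustration under assumptions, not the code from that post; `flatten` and the dotted-key naming are hypothetical choices, and the sample tweet is made up. It is written for Python 3.

```python
import csv
import io


def flatten(value, prefix=''):
    """Recursively flatten nested dicts/lists into a single-level dict."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, prefix + str(key) + '.'))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, prefix + str(index) + '.'))
    else:
        # Leaf value: drop the trailing dot from the accumulated key.
        flat[prefix[:-1]] = value
    return flat


# Made-up sample with a nested "geo" field.
tweet = {'from_user': 'alice',
         'geo': {'type': 'Point', 'coordinates': [1.5, 2.5]}}
row = flatten(tweet)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=sorted(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

Each nested value ends up in its own column (for example `geo.coordinates.0`), so the result is a flat table that the csv module can write directly.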
