JSON string decoding error

I am calling the URL below using urllib2 and decoding the response with the json module:

import json
import urllib2

url = "http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json"
request = urllib2.Request(url)
response = urllib2.urlopen(request)
issue_report = json.loads(response.read())

I run into the following error:

ValueError: Invalid control character at: line 1 column 1120 (char 1120)

I tried checking the response headers, and I got the following:

Content-Type: application/json; charset=UTF-8 
Access-Control-Allow-Origin: * 
Expires: Sun, 03 Jul 2011 17:38:38 GMT 
Date: Sun, 03 Jul 2011 17:38:38 GMT 
Cache-Control: private, max-age=0, must-revalidate, no-transform 
Vary: Accept, X-GData-Authorization, GData-Version 
GData-Version: 1.0 
ETag: W/"CUEGQX47eCl7ImA9WxJaFEw." 
Last-Modified: Tue, 04 Aug 2009 19:20:20 GMT 
X-Content-Type-Options: nosniff 
X-Frame-Options: SAMEORIGIN 
X-XSS-Protection: 1; mode=block 
Server: GSE 
Connection: close

I also tried adding the encoding parameter, as follows:

issue_report = json.loads(response.read(), encoding='UTF-8')

I still run into the same error.

2011-07-03 Dexter

It looks like what you got back is not a valid JSON-encoded string. – hakre

Great, thanks! I will do that. – Dexter

The feed contains raw data from a JPEG at that position; the JSON is malformed, so this is not your fault. Report a bug to Google.
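If you need something in the meantime, here is a minimal sketch of how to see what the decoder is choking on and how to parse leniently. It assumes the only problem is raw control characters inside string values; the `raw` string is a made-up stand-in, not the real feed body, and `find_control_chars` is a hypothetical helper. The `strict` argument is part of the standard json decoder.

```python
import json

# Made-up stand-in for the feed body: the "\n" below becomes a raw newline
# (a control character) inside a JSON string value, which strict parsing rejects.
raw = '{"title": "Issue 291", "content": "line one\nline two"}'


def find_control_chars(text):
    """Return (index, repr) pairs for every character below U+0020 in text.

    Such characters are legal as whitespace between tokens but not
    inside string values.
    """
    return [(i, repr(ch)) for i, ch in enumerate(text) if ord(ch) < 0x20]


try:
    json.loads(raw)
    strict_ok = True
except ValueError as exc:  # the "Invalid control character" error is a ValueError
    strict_ok = False
    print("strict parse failed: %s" % exc)

print(find_control_chars(raw))

# strict=False tells the decoder to accept control characters inside strings.
lenient = json.loads(raw, strict=False)
print(lenient["content"])
```

This will not help if the embedded bytes also break UTF-8 decoding or the quoting of the document, so treat it as a workaround to try rather than a fix.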

2011-07-03 17:49:45

Oh! I suspected people had run into this problem earlier as well. http://code.google.com/p/gdata-issues/issues/detail?id=942 – Dexter

You might consider using lxml instead, since the JSON is malformed. Its XPath support makes working with the XML very straightforward:

import lxml.etree 
url = 'http://code.google.com/feeds/issues/p/chromium/issues/full/291' 
doc = lxml.etree.parse(url) 
ns = {'issues': 'http://schemas.google.com/projecthosting/issues/2009'} 
issues = doc.xpath('//issues:*', namespaces=ns)

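It is fairly easy to manipulate the elements, for example stripping the namespace from the tags and converting the result to a dict: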

>>> dict((x.tag[len(ns['issues'])+2:], x.text) for x in issues) 
<<<  
{'closedDate': '2009-08-04T19:20:20.000Z', 
'id': '291', 
'label': 'Area-BrowserUI', 
'stars': '13', 
'state': 'closed', 
'status': 'Verified'}
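One caveat with a plain dict is that repeated tags overwrite each other, so only one value per tag survives. Below is a rough sketch of the same namespace-stripping idea that keeps every value in a list. It uses the standard library's xml.etree.ElementTree so it runs without lxml; `SAMPLE` is a made-up document shaped like a feed entry (not the real response), and `issues_fields` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ISSUES_NS = "http://schemas.google.com/projecthosting/issues/2009"

# Made-up sample for illustration only; the real feed has more elements.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:issues="http://schemas.google.com/projecthosting/issues/2009">
  <title>Sample issue</title>
  <issues:id>291</issues:id>
  <issues:label>Area-BrowserUI</issues:label>
  <issues:label>Type-Bug</issues:label>
  <issues:state>closed</issues:state>
</entry>"""


def issues_fields(root, namespace=ISSUES_NS):
    """Map local tag names in `namespace` to a list of their text values."""
    prefix = "{%s}" % namespace  # ElementTree spells qualified tags as {uri}name
    fields = {}
    for element in root.iter():
        if element.tag.startswith(prefix):
            local_name = element.tag[len(prefix):]
            fields.setdefault(local_name, []).append(element.text)
    return fields


fields = issues_fields(ET.fromstring(SAMPLE))
print(fields)
```

Slicing off the `{uri}` prefix is the same trick as the `len(ns['issues'])+2` slice above; the `+2` there accounts for the two braces.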

2011-07-03 18:19:21 zeekay

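Thanks, but I have always preferred JSON objects, since they are easily converted to dictionaries. – Dexter

I prefer JSON too, but sometimes you don't have a choice. – zeekay

For now, I am going with this. :-) – Dexter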
