2016-08-20

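TweepError: Failed to parse JSON payload

I want to scrape Twitter data, but I get this error. How can I fix it? I have looked at several related questions online, but I could not relate them to my program.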

Code:

import tweepy 
from tweepy import Stream 
from tweepy import OAuthHandler 
from tweepy.streaming import StreamListener 
import pandas as pd 
import json 
import csv 
import sys 
import time 

reload(sys) 
sys.setdefaultencoding('utf8') 

ckey = 'abc' 
csecret = 'abc' 
atoken = 'abc' 
asecret = 'abc' 

OAUTH_KEYS = {'consumer_key':ckey, 'consumer_secret':csecret, 'access_token_key':atoken, 'access_token_secret':asecret} 
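auth = tweepy.OAuthHandler(OAUTH_KEYS['consumer_key'], OAUTH_KEYS['consumer_secret']) 
auth.set_access_token(OAUTH_KEYS['access_token_key'], OAUTH_KEYS['access_token_secret'])  # the access token was defined but never passed to the handler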
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True) 
if (not api): 
    print ("Can't Authenticate") 
    sys.exit(-1) 
else: 
    print " Scraping data now" # Optionally restrict by location with latitude, longitude and radius in km, e.g. q='hello', geocode="19.9974533,73.7898023,1000km" 
    cursor = tweepy.Cursor(api.search,q='olympics',since='2016-08-18',until='2016-08-19',lang='en',count=1000) 
    results=[] 
    for item in cursor.items(10000): # Remove the limit to 1000 
        results.append(item) 

def toDataFrame(tweets): 
    # Convert the collected tweets to a DataFrame, one column per field 
    DataSet = pd.DataFrame() 

    DataSet['tweetID'] = [tweet.id for tweet in tweets] 
    DataSet['tweetText'] = [tweet.text.encode('utf-8') for tweet in tweets] 
    DataSet['tweetRetweetCt'] = [tweet.retweet_count for tweet in tweets] 
    DataSet['tweetFavoriteCt'] = [tweet.favorite_count for tweet in tweets] 
    DataSet['tweetSource'] = [tweet.source for tweet in tweets] 
    DataSet['tweetCreated'] = [tweet.created_at for tweet in tweets] 
    DataSet['userID'] = [tweet.user.id for tweet in tweets] 
    DataSet['userScreen'] = [tweet.user.screen_name for tweet in tweets] 
    DataSet['userName'] = [tweet.user.name for tweet in tweets] 
    DataSet['userCreateDt'] = [tweet.user.created_at for tweet in tweets] 
    DataSet['userDesc'] = [tweet.user.description for tweet in tweets] 
    DataSet['userFollowerCt'] = [tweet.user.followers_count for tweet in tweets] 
    DataSet['userFriendsCt'] = [tweet.user.friends_count for tweet in tweets] 
    DataSet['userLocation'] = [tweet.user.location for tweet in tweets] 
    DataSet['userTimezone'] = [tweet.user.time_zone for tweet in tweets] 
    DataSet['Coordinates'] = [tweet.coordinates for tweet in tweets] 
    DataSet['GeoEnabled'] = [tweet.user.geo_enabled for tweet in tweets] 
    DataSet['Language'] = [tweet.user.lang for tweet in tweets] 
    tweets_place = [] 
    #users_retweeted = [] 
    for tweet in tweets: 
        if tweet.place: 
            tweets_place.append(tweet.place.full_name) 
        else: 
            tweets_place.append('null') 
    DataSet['TweetPlace'] = tweets_place 
    #DataSet['UserWhoRetweeted'] = [i for i in users_retweeted] 

    return DataSet 

print "started writing the output"  
DataSet = toDataFrame(results) 
DataSet.to_csv('olympics_18_8.csv',index=False) 
print "Download Completed" 

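I am looking for help changing the code so that it runs without errors. If tweepy is the problem here, could I use Twython instead? If so, how should I change the code to avoid the error and download the dump?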

Error:

Traceback (most recent call last): 
  File "Scrape_lat_lon.py", line 30, in <module> 
    for item in cursor.items(10000): # Remove the limit to 1000 
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 197, in next 
    self.current_page = self.page_iterator.next() 
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 117, in next 
    model = ModelParser().parse(self.method(create=True), data) 
  File "/usr/local/lib/python2.7/dist-packages/tweepy/parsers.py", line 95, in parse 
    json = JSONParser.parse(self, method, payload) 
  File "/usr/local/lib/python2.7/dist-packages/tweepy/parsers.py", line 54, in parse 
    raise TweepError('Failed to parse JSON payload: %s' % e) 
tweepy.error.TweepError: Failed to parse JSON payload: Unterminated string starting at: line 1 column 467050 (char 467049) 

Thanks in advance for any help.

Can you add the error log to your post?

@KostasPelelis Updated.

Answers

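I had the same error: "Failed to parse JSON payload: Unterminated string starting at: ...". The problem was in Python's built-in json module; after I installed simplejson the error went away.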
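
For context, the "Unterminated string" part of the message comes from the JSON parser and means that the text it was given stops in the middle of a string value. The sketch below reproduces the same kind of message with the standard-library json module. The payload and the `try_parse` helper are made up for illustration and are not tweepy code.

```python
import json


def try_parse(payload):
    """Return (data, None) on success or (None, error message) on failure."""
    try:
        return json.loads(payload), None
    except ValueError as exc:  # JSON decode errors are ValueError subclasses
        return None, str(exc)


# A complete payload parses without problems.
complete = '{"statuses": [{"text": "olympics"}]}'
data, error = try_parse(complete)

# Cutting the payload inside the "olympics" string imitates a truncated response.
truncated = complete[:25]
bad_data, bad_error = try_parse(truncated)

print("complete:  %r" % (error,))
print("truncated: %r" % (bad_error,))
```

The position in the message is where the unfinished string begins, which in the traceback above is roughly 467,000 characters into the response.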

I made no changes to my code; I only installed simplejson.
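
A possible reason no code change is needed: tweepy releases from that period (3.x) appear to try importing simplejson first and fall back to the standard json module only when it is missing, so installing the package would switch the parser automatically. This is an assumption about tweepy's import helper, not something confirmed in this thread. The pattern looks roughly like the sketch below, where `json_impl` and `parser_name` are illustrative names.

```python
# Prefer simplejson when it is installed, otherwise use the standard library.
try:
    import simplejson as json_impl
except ImportError:
    import json as json_impl


def parser_name():
    """Report which JSON module was actually picked up."""
    return json_impl.__name__


# Both modules expose the same loads() call, so calling code stays unchanged.
tweet = json_impl.loads('{"id": 1, "text": "olympics"}')
print("parser: %s, text: %s" % (parser_name(), tweet["text"]))
```

Running this before and after `pip install simplejson` shows which parser the same code ends up using.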

This is how I installed it: pip install simplejson
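
Whichever parser is in use, a single failed page in the question's script discards everything collected so far, because the exception escapes the loop before the CSV is written. Below is a minimal sketch of keeping the partial results. `collect_items` is a hypothetical helper, and `flaky_source` is a stand-in generator used here instead of `cursor.items(10000)` so the example runs without network access.

```python
from itertools import islice


def collect_items(items, limit):
    """Collect up to `limit` items and keep whatever arrived before an error."""
    results = []
    error = None
    try:
        for item in islice(items, limit):
            results.append(item)
    except Exception as exc:  # with tweepy 3.x one would catch tweepy.TweepError
        error = exc
    return results, error


def flaky_source():
    """Stand-in for cursor.items(): yields three items, then fails."""
    for i in range(3):
        yield {"id": i}
    raise ValueError("Failed to parse JSON payload")


results, error = collect_items(flaky_source(), limit=10000)
print("%d items kept, error: %s" % (len(results), error))
```

With the real cursor, the returned `results` list could still be passed to `toDataFrame` and written to CSV even if a later page fails.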