Tweepy API limit workaround

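I am trying to download some Twitter data for the Chicago area, focusing specifically on crime-related tweets. I also need the tweets to be geotagged with coordinates. I would like to collect enough for analysis, but the REST API is rate-limited, which keeps the volume fairly low. I have been trying to put together a workaround based on a similar question, "Avoid twitter api limitation with Tweepy", but so far I have not had much luck. Can anyone help? I am new to all of this, so any help would be greatly appreciated. Ideally I also need the result as a pandas DataFrame. I have been using the following tutorial as the basis for my code: http://www.karambelkar.info/2015/01/how-to-use-twitters-search-rest-api-most-effectively./ The code I adapted from it is below: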

import sys

import jsonpickle
import tweepy

# Application-only auth; fill in your own consumer key and secret.
auth = tweepy.AppAuthHandler('', '')
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
if not api:
    print("Can't Authenticate")
    sys.exit(-1)

searchQuery = ('shooting OR stabbing OR violence OR assault OR attack OR '
               'homicide OR punched OR mugging OR murder')
geocode = "41.8781,-87.6298,15km"

maxTweets = 1000000   # upper bound on the number of tweets to download
tweetsPerQry = 100    # tweets requested per query
fName = 'tweets.txt'
sinceId = None        # lower bound on tweet IDs; None means no lower bound
max_id = -1           # upper bound on tweet IDs; -1 means no upper bound yet
tweetCount = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, 'w') as f:
    while tweetCount < maxTweets:
        try:
            if max_id <= 0:
                if not sinceId:
                    new_tweets = api.search(q=searchQuery, geocode=geocode, count=tweetsPerQry)
                else:
                    new_tweets = api.search(q=searchQuery, geocode=geocode, count=tweetsPerQry, since_id=sinceId)
            else:
                if not sinceId:
                    new_tweets = api.search(q=searchQuery, geocode=geocode, count=tweetsPerQry, max_id=str(max_id - 1))
                else:
                    new_tweets = api.search(q=searchQuery, geocode=geocode, count=tweetsPerQry, max_id=str(max_id - 1), since_id=sinceId)
            if not new_tweets:
                print("No more tweets found")
                break
            # Write one JSON document per line.
            for tweet in new_tweets:
                f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')
            tweetCount += len(new_tweets)
            print("Downloaded {0} tweets".format(tweetCount))
            # The next query continues below the oldest tweet seen so far.
            max_id = new_tweets[-1].id
        except tweepy.TweepError as e:
            print("some error : " + str(e))
            break
print("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))
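For the pandas DataFrame mentioned in the question, one possible approach is to read the saved file back line by line and keep only the tweets that carry exact coordinates. The following is a minimal sketch, not part of the original code: the helper names `tweets_to_rows` and `load_dataframe` are hypothetical, and it assumes each line of `tweets.txt` is one plain JSON tweet object whose `coordinates` field is either null or a GeoJSON point.

```python
import json


def tweets_to_rows(lines):
    """Turn JSON-lines tweet records into flat dicts, keeping only geotagged tweets."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        point = tweet.get("coordinates")
        if not point:
            continue  # skip tweets without exact coordinates
        lon, lat = point["coordinates"]  # GeoJSON order: longitude first
        rows.append({
            "id": tweet["id"],
            "created_at": tweet.get("created_at"),
            "text": tweet.get("text"),
            "lon": lon,
            "lat": lat,
        })
    return rows


def load_dataframe(path="tweets.txt"):
    """Build a DataFrame from the saved file (hypothetical helper; needs pandas)."""
    import pandas as pd  # imported lazily so tweets_to_rows works without pandas

    with open(path) as f:
        return pd.DataFrame(tweets_to_rows(f))
```

Calling `load_dataframe()` after the download finishes would give one row per geotagged tweet, with `lon` and `lat` columns ready for mapping.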

Source

2016-05-08 Michael Montgomery

*"Not much luck"* means what in this case? Errors? Unexpected behavior? Please give a [mcve] (and in future try not to share your API tokens). – jonrsharpe

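Thanks for replying so quickly! Sorry, that was an oversight on my part. Thank you for removing it. As for not having much luck: the script seems to hang as if it were processing, but when I check my text file there is nothing in it, although I would expect at least some data after it has been running for a while. –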

Just to clarify, I am not getting any error messages. –

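After running into the same problem, I wrote a method that identifies upcoming API rate limits. This Python code uses tweepy and prints the number of API requests made and the number of allowed requests remaining. You can add your own code to delay/sleep/wait before or after the limit is reached, or use tweepy's wait_on_rate_limit (more details HERE).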

Example output:

Twitter API: 3 requests used, 177 remaining, for API queries to /search/tweets

Twitter API: 3 requests used, 177 remaining, for API queries to /application/rate_limit_status

import tweepy

# Assumes `auth` is an already configured handler, e.g. the AppAuthHandler from the question.
api = tweepy.API(auth)

# Twitter's words on API limits: https://support.twitter.com/articles/15364

#### Define twitter rate determining loop
def twitter_rates():
    """Print used and remaining request counts for every endpoint that has been used."""
    stats = api.rate_limit_status()
    # stats['resources'] maps a resource family to its endpoints,
    # and each endpoint to a dict with 'limit' and 'remaining'.
    for family, endpoints in stats['resources'].items():
        if not isinstance(endpoints, dict):
            continue
        for endpoint, info in endpoints.items():
            if not isinstance(info, dict):
                continue
            used = info['limit'] - info['remaining']
            if used != 0:
                print("Twitter API:", used, "requests used,", info['remaining'], "remaining, for API queries to", endpoint)


twitter_rates()
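To act on these numbers rather than only print them, the delay can be computed from a single endpoint entry. This is a rough sketch under stated assumptions: `seconds_until_reset` is a hypothetical helper, not part of tweepy, and it assumes each entry also carries a `reset` field holding the epoch time in seconds at which the window refills.

```python
import time


def seconds_until_reset(info, now=None, min_remaining=1):
    """Return how long to sleep before the next request to one endpoint.

    `info` is one endpoint entry from rate_limit_status(), for example
    {'limit': 180, 'remaining': 0, 'reset': 1462700000}.
    """
    if now is None:
        now = time.time()
    if info["remaining"] >= min_remaining:
        return 0.0  # requests are still available in this window
    # 'reset' is assumed to be the epoch time (seconds) when the window refills.
    return max(0.0, info["reset"] - now)
```

For example, `time.sleep(seconds_until_reset(stats['resources']['search']['/search/tweets']))` before each search call would pause only when the search quota is exhausted.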

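Also note that wait_on_rate_limit "will stop the exceptions. Tweepy will sleep for however long is needed for the rate limit to replenish." (Aaron Hill, July 2014). HERE is a Stackoverflow page with more comments.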

Source

2017-10-17 00:10:12 Antoine
