2016-03-01 33 views

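RuntimeError: maximum recursion depth exceeded in cmp while collecting tweets with Tweepy

I am trying to collect a large number of tweets from the Twitter API with tweepy. I have a text file containing about ten thousand tweet IDs. My program reads through the file and fetches each tweet, along with the tweet it is replying to. It then saves the text of each tweet, and the username of each author, to the corresponding text file. Code below: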

auth = tweepy.OAuthHandler(ckey, csecret) 
auth.set_access_token(atoken, asecret) 

api = tweepy.API(auth) 

tweetsFile = open("srcstic.txt", "r") 
tweets_seen = set() # holds tweets already seen 

def getNextLine():
    while True:

        tweetID = tweetsFile.readline()
        getTweetObj(tweetID)

        if not tweetID:
            break

def getTweetObj(tweetID):
    try:
        tweetObj = api.get_status(tweetID)
        #sleep(11)
    except tweepy.error.TweepError:
        getNextLine()
    else:
        pass
    tweet = tweetObj.text.encode("utf8")
    idnum = tweetObj.in_reply_to_status_id_str

    try:
        former = api.get_status(idnum)
    except tweepy.error.TweepError:
        getNextLine()
    else:
        printFiles(former, tweetObj, tweet)

def printFiles(former, tweetObj, tweet):

    callUserName = former.user.screen_name
    responseUserName = tweetObj.user.screen_name

    if tweet not in tweets_seen:

        with open("callauthors.txt", "a") as callauthors:
            cauthors = callUserName + "\n"
            callauthors.write(cauthors)

        with open("responseauthors.txt", "a") as responseauthors:
            rauthors = responseUserName + "\n"
            responseauthors.write(rauthors)

        with open("response_tweets.txt", "a") as responsetweets:
            output = (tweetObj.text.encode('utf-8')+"\n")
            responsetweets.write(output)

        with open("call_tweets.txt", "a") as calltweets:
            output = (former.text.encode('utf-8')+"\n")
            calltweets.write(output)
            tweets_seen.add(tweet)

    getNextLine()

However, everything works fine for a while, and then I get the following error:

  File "gettweets2.py", line 68, in <module>
    getNextLine()
  File "gettweets2.py", line 21, in getNextLine
    getTweetObj(tweetID)
  File "gettweets2.py", line 40, in getTweetObj
    getNextLine()
  File "gettweets2.py", line 21, in getNextLine
    getTweetObj(tweetID)
  File "gettweets2.py", line 31, in getTweetObj
    getNextLine()
  File "gettweets2.py", line 21, in getNextLine
    getTweetObj(tweetID)
  File "gettweets2.py", line 31, in getTweetObj
    getNextLine()
  File "gettweets2.py", line 21, in getNextLine
    getTweetObj(tweetID)
  File "gettweets2.py", line 31, in getTweetObj
    getNextLine()
........
........
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/_abcoll.py", line 540, in update
    if isinstance(other, Mapping):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/abc.py", line 141, in __instancecheck__
    subtype in cls._abc_negative_cache):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/_weakrefset.py", line 73, in __contains__
    return wr in self.data
RuntimeError: maximum recursion depth exceeded in cmp

Any idea what is going wrong here? Thanks.

Answers

0

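You can only call a function recursively about 999 times before you get that error, because CPython's default recursion limit is 1000. Instead of having the functions call each other, drive the loop from outside the function with a conditional statement, or create a generator.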
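A minimal sketch of the generator idea, written for Python 3. The helper name `iter_tweet_ids` is hypothetical, the IDs are made up, and an in-memory `io.StringIO` stands in for `srcstic.txt`. The point is only that the loop lives at the top level, so a failed lookup can simply move on to the next ID without growing the call stack.

```python
import io
import sys


def iter_tweet_ids(lines):
    """Yield each non-empty tweet ID from an iterable of lines (hypothetical helper)."""
    for line in lines:
        tweet_id = line.strip()
        if tweet_id:  # skip blank lines instead of passing "" to the API
            yield tweet_id


# Stand-in for open("srcstic.txt"); these IDs are invented for illustration.
fake_file = io.StringIO("111\n222\n\n333\n")

collected = []
for tweet_id in iter_tweet_ids(fake_file):
    # The real script would call api.get_status(tweet_id) here inside try/except
    # and use `continue` on failure rather than calling the reader again.
    collected.append(tweet_id)

print(collected)
print("recursion limit:", sys.getrecursionlimit())
```

Because the generator ends when the file is exhausted, no separate end-of-file check is needed.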

0

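As best I can read this, a read error can send you into infinite recursion, because each routine calls the other: getTweetObj calls getNextLine from its error handlers, and getNextLine calls getTweetObj again. If fetching the next line does not get you past the error condition, you will exceed the stack limit in less than a second.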
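The mutual recursion can be reproduced without Twitter at all. This is a stripped-down sketch for Python 3, with hypothetical snake_case stand-ins for the two routines, assuming the worst case where every lookup fails:

```python
import sys


def get_next_line():
    # Stands in for getNextLine(): always hands off to the other routine.
    return get_tweet_obj()


def get_tweet_obj():
    # Stands in for getTweetObj() when the lookup fails:
    # the "error handler" calls straight back into get_next_line().
    return get_next_line()


try:
    get_next_line()
    outcome = "finished"
except RuntimeError as exc:
    # Python 2 raises RuntimeError; Python 3 raises RecursionError,
    # which is a subclass of RuntimeError.
    outcome = type(exc).__name__

print(outcome, "with limit", sys.getrecursionlimit())
```

Neither function ever returns, so each failed lookup adds two frames to the stack until the interpreter gives up.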

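If nothing else, do a quick check with a couple of print statements: print out the tweet IDs as you encounter them, with a marker identifying where each print was made. A direct fix might include writing a second routine that fetches the original tweet but cannot recur. This assumes you only need the current tweet's direct parent, not the entire chain.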
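A rough sketch of that fix for Python 3, with the tagged print statements included. Everything here is a stand-in: `FAKE_TWEETS` and `fetch` replace `api.get_status`, and `LookupFailed` replaces tweepy's error class, so the names are hypothetical and the real script would need to adapt them.

```python
class LookupFailed(Exception):
    """Hypothetical stand-in for the error tweepy raises on a failed lookup."""


# Fake data replacing the Twitter API: id -> (text, parent id or None).
FAKE_TWEETS = {
    "2": ("a reply", "1"),
    "1": ("the original", None),
    "4": ("reply to a deleted tweet", "3"),
}


def fetch(tweet_id):
    """Stand-in for api.get_status(); raises when the tweet is unavailable."""
    try:
        return FAKE_TWEETS[tweet_id]
    except KeyError:
        raise LookupFailed(tweet_id)


def fetch_pair(tweet_id):
    """Return (parent_text, reply_text), or None. Never calls the line reader."""
    try:
        reply_text, parent_id = fetch(tweet_id)
        if parent_id is None:
            return None  # not a reply, so there is nothing to pair it with
        parent_text, _ = fetch(parent_id)  # direct parent only, not the whole chain
    except LookupFailed as exc:
        print("[fetch_pair] lookup failed for", exc)
        return None
    return parent_text, reply_text


pairs = []
for tweet_id in ["2", "1", "4", "9"]:
    print("[main loop] tweetID:", tweet_id)
    pair = fetch_pair(tweet_id)
    if pair is not None:
        pairs.append(pair)

print(pairs)
```

On failure `fetch_pair` just returns `None` and the loop moves on, so the stack depth stays constant no matter how many IDs fail.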
