2012-10-05 63 views
2

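I'm using TweetStream (https://github.com/joshmarshall/TweetStream), a Tornado-based Twitter streaming module, to monitor the Streaming API. How can I restart the ioloop in Tornado while fetching from the Twitter Streaming API?

I'd like to know how I can restart the fetching process when the tracked words change.

My current (incomplete) solution gives me some errors.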

stream = tweetstream.TweetStream(configuration, ioloop=main_io_loop)

stream.fetch("/1.1/statuses/filter.json?track=" + tornado.escape.url_escape(words), callback=callback)


def check_words():
    global words
    with open('words.txt') as file:
        newwords = file.read()
        if words != newwords:
            words = newwords
        try:
            print newwords
            stream.fetch("/1.1/statuses/filter.json?track=" + tornado.escape.url_escape(words), callback=callback)
        except:
            pass
        file.close()

interval_ms = 1000 * 10
scheduler = tornado.ioloop.PeriodicCallback(check_words, interval_ms, io_loop=main_io_loop)
scheduler.start()
main_io_loop.start()

Here is what I get:

ERROR:root:Uncaught exception, closing connection. 
Traceback (most recent call last): 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 305, in wrapper 
    callback(*args) 
    File "/home/user/PycharmProjects/observrenv/src/tweetstream/tweetstream.py", line 155, in on_connect 
    self._twitter_stream.read_until("\r\n\r\n", self.on_headers) 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 151, in read_until 
    self._set_read_callback(callback) 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 369, in _set_read_callback 
    assert not self._read_callback, "Already reading" 
AssertionError: Already reading 
ERROR:root:Exception in callback <tornado.stack_context._StackContextWrapper object at 0x2415cb0> 
Traceback (most recent call last): 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/ioloop.py", line 421, in _run_callback 
    callback() 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 305, in wrapper 
    callback(*args) 
    File "/home/user/PycharmProjects/observrenv/src/tweetstream/tweetstream.py", line 155, in on_connect 
    self._twitter_stream.read_until("\r\n\r\n", self.on_headers) 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 151, in read_until 
    self._set_read_callback(callback) 
    File "/home/user/PycharmProjects/observrenv/local/lib/python2.7/site-packages/tornado/iostream.py", line 369, in _set_read_callback 
    assert not self._read_callback, "Already reading" 
AssertionError: Already reading 

I got a better result (though not the best) by starting the ioloop again when check_words is called.

stream = tweetstream.TweetStream(configuration, ioloop=main_io_loop)

stream.fetch("/1.1/statuses/filter.json?track=" + tornado.escape.url_escape(words), callback=callback)


def check_words():
    global words, stream
    with open('words.txt') as file:
        newwords = file.read()
    if words != newwords:
        words = newwords
        print newwords
        try:
            stream = tweetstream.TweetStream(configuration, ioloop=main_io_loop)
            stream.fetch("/1.1/statuses/filter.json?track=" + tornado.escape.url_escape(words), callback=callback)
            interval_ms = 1000 * 10
            scheduler = tornado.ioloop.PeriodicCallback(check_words, interval_ms, io_loop=main_io_loop)
            scheduler.start()
            main_io_loop.start()
        except:
            pass
        file.close()


interval_ms = 1000 * 10
scheduler = tornado.ioloop.PeriodicCallback(check_words, interval_ms, io_loop=main_io_loop)
scheduler.start()
main_io_loop.start()

Answers

1

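As a Twitter employee said here, the recommended approach is what I was already doing (only at a slower pace). If your query terms change, just reconnect once. Otherwise, keep the connection open. It is also important to watch for any errors Twitter sends you, or you may get banned.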
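The "reconnect once, only when the terms really change" idea could be sketched roughly like this. It is a minimal illustration, not part of TweetStream: `normalize`, `TrackWatcher`, and the `reconnect` callable are hypothetical names, and `reconnect` stands in for whatever closes the old stream and opens a new one in your setup. Lowercasing and sorting the terms is an assumption made so that cosmetic edits to words.txt do not count as a change.

```python
def normalize(raw):
    """Turn the raw contents of words.txt into a canonical, comparable string."""
    # Treat newlines and commas alike, drop blanks, ignore case and order.
    terms = {t.strip().lower() for t in raw.replace("\n", ",").split(",")}
    terms.discard("")
    return ",".join(sorted(terms))


class TrackWatcher(object):
    """Calls `reconnect` once per real change in the tracked terms."""

    def __init__(self, initial_raw, reconnect):
        self.current = normalize(initial_raw)
        # Hypothetical callable: closes the old stream and opens a new one.
        self.reconnect = reconnect

    def check(self, raw):
        new = normalize(raw)
        if new == self.current:
            return False  # nothing changed: keep the existing connection open
        self.current = new
        self.reconnect(new)
        return True
```

Something like `watcher.check(open('words.txt').read())` could then be called from the periodic callback, so a new connection is only opened when `check` returns True.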

0

It looks like you are missing the main idea of the Streaming API: the connection to it stays open permanently.

stream = tweetstream.TweetStream(configuration, ioloop=main_io_loop)

# What are you doing in the callback?
stream.fetch("/1.1/statuses/filter.json?track=" + tornado.escape.url_escape(words), callback=callback)


def check_words():
    # I guess you should not do this at all.
    # global words
    # with open('words.txt') as file:
    #     newwords = file.read()
    #     if words != newwords:
    #         words = newwords
    #     try:
    #         # Don't open a new stream here
    #         print newwords
    #     except:
    #         pass
    #     file.close()
    pass

interval_ms = 1000 * 10
scheduler = tornado.ioloop.PeriodicCallback(check_words, interval_ms, io_loop=main_io_loop)
scheduler.start()
main_io_loop.start()

From analyzing your code, I think you just need to handle the new words inside the callback.

+0

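But what if I need to change it (that is, I need to monitor other words)? How would I do that? If I am monitoring "love" and then the user wants to monitor "hate", I believe I should change the track to "love,hate". I was thinking of stopping the io_loop and then starting it again. – gawry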

+0

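@gawry, the best solution is to stream "love,hate" and decide in the callback which tracked word you received. The pattern implemented above will lead to a ban, because the stream will reconnect aggressively. If for some reason you must monitor two words at once, each word separately, use 2 threads. And yes, if you change the words, restart the thread. –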
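A rough sketch of "stream everything and decide in the callback" might look like the following. The helper names `matched_terms` and `make_callback` are hypothetical, and the code assumes the callback receives each tweet as a dict with a "text" key. Plain substring matching is also a simplification; Twitter's own track matching may treat word boundaries differently.

```python
def matched_terms(text, terms):
    """Return the tracked terms that appear in `text`, ignoring case."""
    lowered = text.lower()
    return [t for t in terms if t.lower() in lowered]


def make_callback(terms, handlers):
    """Build one stream callback that dispatches each tweet by matched term."""
    def callback(message):
        # Assumption: `message` is a dict; messages without "text" are skipped.
        text = message.get("text", "")
        for term in matched_terms(text, terms):
            handlers[term](message)
    return callback
```

With this shape, a single connection tracking "love,hate" can feed two separate handlers, for example `make_callback(["love", "hate"], {"love": on_love, "hate": on_hate})`, where `on_love` and `on_hate` are your own functions.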

+0

So I believe I can set the periodic call to an infrequent interval to avoid being banned. – gawry
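One way to keep reconnects infrequent regardless of how often the periodic check runs is a small throttle in front of the reconnect. This is only a sketch under assumptions: `ReconnectThrottle` is a hypothetical helper, and the 300-second interval in the usage note is an arbitrary illustrative value, not a limit published by Twitter.

```python
import time


class ReconnectThrottle(object):
    """Allows at most one reconnect per `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.time):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last = None    # time of the last allowed reconnect, if any

    def allow(self):
        now = self.clock()
        if self.last is not None and now - self.last < self.min_interval:
            return False  # too soon: keep the current connection for now
        self.last = now
        return True
```

In check_words, the reconnect would then be guarded by something like `if words_changed and throttle.allow():`, with `throttle = ReconnectThrottle(300)` created once at startup. A change that arrives too early is simply picked up by a later periodic check.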