2012-01-06 16 views

I'm having some trouble understanding how threads work or how they are structured. How can I fetch websites using threads, or some other fast method?

I need to download several web pages, changing one value in the link (the easy part) and extracting some information, but I'm using a `while` loop and it takes about a second or more to download a ~60 KB page... and my internet connection is 5 Mbps...

Can someone show me the simplest possible example of how to do something like this?

Answers


Here is a threaded example, using a pool of worker threads fed from a queue:

    #!/usr/bin/env python
    import Queue
    import threading
    import urllib2
    import time

    hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
             "http://ibm.com", "http://apple.com"]

    queue = Queue.Queue()

    class ThreadUrl(threading.Thread):
        """Threaded Url Grab"""
        def __init__(self, queue):
            threading.Thread.__init__(self)
            self.queue = queue

        def run(self):
            while True:
                # grab a host from the queue
                host = self.queue.get()

                # fetch the url and print the first 1024 bytes of the page
                url = urllib2.urlopen(host)
                print url.read(1024)

                # signal to the queue that the job is done
                self.queue.task_done()

    start = time.time()

    def main():
        # spawn a pool of threads, and pass them the queue instance
        for i in range(5):
            t = ThreadUrl(queue)
            t.setDaemon(True)
            t.start()

        # populate the queue with data
        for host in hosts:
            queue.put(host)

        # wait on the queue until everything has been processed
        queue.join()

    main()
    print "Elapsed Time: %s" % (time.time() - start)

Is `threading.Thread.__init__(self)` the same thing as `from threading import Thread`? – Droogans 2012-01-06 22:39:56
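They are two different things: `from threading import Thread` is just an import that lets you write `Thread` instead of `threading.Thread`, while `threading.Thread.__init__(self)` calls the base-class constructor so the subclass is set up properly. A small sketch (the `Worker` class and its attribute are illustrative, not from the answer above):

```python
from threading import Thread

class Worker(Thread):
    def __init__(self, name):
        # same base-class constructor call as in the answer,
        # just spelled with the shorter name after the import
        Thread.__init__(self)
        self.job_name = name
        self.done = False

    def run(self):
        # runs in the worker thread once start() is called
        self.done = True

w = Worker("demo")
w.start()
w.join()
print(w.job_name, w.done)
```

Skipping the base-class `__init__` call in a `Thread` subclass leaves the thread object uninitialized, so `start()` raises an error.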


That's not simple for me =/ – Shady 2012-01-06 22:49:11
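For readers who find the queue-and-worker boilerplate heavy: on Python 3, `concurrent.futures.ThreadPoolExecutor` manages the threads and the queue internally, so the same download can be sketched in a few lines (this uses the Python 3 `urllib.request.urlopen`, the successor of `urllib2.urlopen`):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen  # Python 3 name for urllib2.urlopen

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
         "http://ibm.com", "http://apple.com"]

def fetch(host):
    # download the page and return its first 1024 bytes
    return urlopen(host).read(1024)

with ThreadPoolExecutor(max_workers=5) as pool:
    # map() hands each host to a worker thread; results come back in order
    for host, body in zip(hosts, pool.map(fetch, hosts)):
        print(host, len(body), "bytes")
```

`pool.map` blocks until all downloads finish and preserves the input order, so no explicit `queue.join()` is needed.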


Taken from Advanced Usage: Asynchronous Requests in the requests documentation:

    from requests import async

    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org',
        'http://kennethreitz.com'
    ]

    rs = [async.get(u) for u in urls]
    async.map(rs)

This does not use threads, but the principle is the same: the requests are made concurrently. (Note: the `requests.async` module was later removed from requests; the same interface now lives in the separate `grequests` package.)
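The "made concurrently" claim is easy to check with a stdlib-only timing sketch, where a 0.3-second sleep stands in for each network request (the hosts and the `fake_download` helper are illustrative, not part of the answer):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(url):
    # stand-in for a network request: block for 0.3 s, then "return the page"
    time.sleep(0.3)
    return "body of %s" % url

urls = ["http://a", "http://b", "http://c", "http://d"]

start = time.time()
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    bodies = list(pool.map(fake_download, urls))
elapsed = time.time() - start

# the four 0.3 s "downloads" overlap, so the total is ~0.3 s, not ~1.2 s
print("fetched %d pages in %.2f s" % (len(bodies), elapsed))
```

Run sequentially, the same four calls would take the sum of their delays; run concurrently, they take roughly the longest single delay.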
