2012-01-06 16 views

I'm having some trouble understanding how threads work or how they are structured. How can I fetch websites using threads, or some other fast method?

I need to download several web pages, changing one value in the link (the easy part) and extracting some information, but I'm using a `while` loop and it takes about a second or more to download a ~60 KB page... and my internet connection is 5 Mbps...

Can someone show me the simplest possible example of how to do something like this?

Answers


Here is a threaded example, using a pool of worker threads fed from a queue:

    #!/usr/bin/env python
    import Queue
    import threading
    import urllib2
    import time

    hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
             "http://ibm.com", "http://apple.com"]

    queue = Queue.Queue()

    class ThreadUrl(threading.Thread):
        """Threaded Url Grab"""
        def __init__(self, queue):
            threading.Thread.__init__(self)
            self.queue = queue

        def run(self):
            while True:
                # grab a host from the queue
                host = self.queue.get()

                # fetch the url and print the first 1024 bytes of the page
                url = urllib2.urlopen(host)
                print url.read(1024)

                # signal to the queue that the job is done
                self.queue.task_done()

    start = time.time()

    def main():
        # spawn a pool of threads, and pass them the queue instance
        for i in range(5):
            t = ThreadUrl(queue)
            t.setDaemon(True)
            t.start()

        # populate the queue with data
        for host in hosts:
            queue.put(host)

        # wait on the queue until everything has been processed
        queue.join()

    main()
    print "Elapsed Time: %s" % (time.time() - start)

Is `threading.Thread.__init__(self)` the same thing as `from threading import Thread`? – Droogans 2012-01-06 22:39:56
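They are two different things: `from threading import Thread` is just an import that lets you write `Thread` instead of `threading.Thread`, while `threading.Thread.__init__(self)` calls the base-class constructor so the subclass is set up properly. A small sketch (the `Worker` class and its attribute are illustrative, not from the answer above):

```python
from threading import Thread

class Worker(Thread):
    def __init__(self, name):
        # same base-class constructor call as in the answer,
        # just spelled with the shorter name after the import
        Thread.__init__(self)
        self.job_name = name
        self.done = False

    def run(self):
        # runs in the worker thread once start() is called
        self.done = True

w = Worker("demo")
w.start()
w.join()
print(w.job_name, w.done)
```

Skipping the base-class `__init__` call in a `Thread` subclass leaves the thread object uninitialized, so `start()` raises an error.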


That's not simple for me =/ – Shady 2012-01-06 22:49:11
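For readers who find the queue-and-worker boilerplate heavy: on Python 3, `concurrent.futures.ThreadPoolExecutor` manages the threads and the queue internally, so the same download can be sketched in a few lines (this uses the Python 3 `urllib.request.urlopen`, the successor of `urllib2.urlopen`):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen  # Python 3 name for urllib2.urlopen

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
         "http://ibm.com", "http://apple.com"]

def fetch(host):
    # download the page and return its first 1024 bytes
    return urlopen(host).read(1024)

with ThreadPoolExecutor(max_workers=5) as pool:
    # map() hands each host to a worker thread; results come back in order
    for host, body in zip(hosts, pool.map(fetch, hosts)):
        print(host, len(body), "bytes")
```

`pool.map` blocks until all downloads finish and preserves the input order, so no explicit `queue.join()` is needed.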


Taken from Advanced Usage: Asynchronous Requests in the requests documentation:

    from requests import async

    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org',
        'http://kennethreitz.com'
    ]

    rs = [async.get(u) for u in urls]
    async.map(rs)

This does not use threads, but the principle is the same: the requests are made concurrently. (Note: the `requests.async` module was later removed from requests; the same interface now lives in the separate `grequests` package.)
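The "made concurrently" claim is easy to check with a stdlib-only timing sketch, where a 0.3-second sleep stands in for each network request (the hosts and the `fake_download` helper are illustrative, not part of the answer):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(url):
    # stand-in for a network request: block for 0.3 s, then "return the page"
    time.sleep(0.3)
    return "body of %s" % url

urls = ["http://a", "http://b", "http://c", "http://d"]

start = time.time()
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    bodies = list(pool.map(fake_download, urls))
elapsed = time.time() - start

# the four 0.3 s "downloads" overlap, so the total is ~0.3 s, not ~1.2 s
print("fetched %d pages in %.2f s" % (len(bodies), elapsed))
```

Run sequentially, the same four calls would take the sum of their delays; run concurrently, they take roughly the longest single delay.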
