Improving the performance of multithreaded web page downloads in Python

I am trying to write Python code that downloads web pages using separate threads. Here is a sample of my code:

import urllib2 
from threading import Thread 
import time 

URLs = ['http://www.yahoo.com/', 
     'http://www.time.com/', 
     'http://www.cnn.com/', 
     'http://www.slashdot.org/' 
     ] 


def thread_func(arg): 
    t = time.time() 
    page = urllib2.urlopen(arg) 
    page = page.read() 
    print time.time() - t 




for url in URLs: 
    t = Thread(target = thread_func, args = (url,)) 
    t.start() 
    t.join()

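When I run the code, the threads appear to execute serially. If I am not mistaken, the measured download times look right, but each one is printed to the console only after a noticeable delay following the previous one. Am I coding this correctly?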


2014-09-18 Lehel

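The call to t.join() blocks the current thread until the target thread finishes. You call it immediately after starting each thread, so you never have more than one downloader thread running at a time.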

Change your code to this:

threads = [] 
for url in URLs: 
    t = Thread(target = thread_func, args = (url,)) 
    t.start() 
    threads.append(t) 

# All threads started, now wait for them to finish 
for t in threads: 
    t.join()
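
The effect of moving the join() calls can be seen without touching the network. The following is a minimal sketch, not the code from the question: fake_download, run_join_inside_loop and run_join_after_loop are hypothetical names, and time.sleep stands in for urllib2.urlopen(url).read() on the assumption that a download mostly waits on I/O.

```python
import threading
import time


def fake_download(delay):
    # Hypothetical stand-in for urlopen(url).read(): waits `delay` seconds.
    time.sleep(delay)


def run_join_inside_loop(delays):
    # Mirrors the question's loop: each thread is joined right after start(),
    # so the next thread is not created until the previous one has finished.
    start = time.time()
    for delay in delays:
        t = threading.Thread(target=fake_download, args=(delay,))
        t.start()
        t.join()
    return time.time() - start


def run_join_after_loop(delays):
    # Mirrors the answer's loop: start every thread first, then wait for all.
    start = time.time()
    threads = []
    for delay in delays:
        t = threading.Thread(target=fake_download, args=(delay,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == "__main__":
    delays = [0.2, 0.2, 0.2, 0.2]
    print("join inside loop: %.2f s" % run_join_inside_loop(delays))
    print("join after loop:  %.2f s" % run_join_after_loop(delays))
```

With four simulated downloads of equal length, the first variant should take roughly the sum of the delays and the second roughly one delay, although exact timings vary with the machine.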


2014-09-18 22:41:28

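Thanks. Just out of curiosity, the output I get looks like this: 3.60282206535 4.05780601501 5.74620199203 9.5616710186 ... It looks like each one takes longer than the last, rather than all of them finishing at about the same time. Is this the expected behavior? ... – Lehel 2014-09-18 22:50:39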

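The threads will start at roughly the same time, but they compete for your network bandwidth. A few threads will begin making requests right away, while the others block until the network is available. – 2014-09-18 23:41:29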
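
One way to check that increasing per-thread times still mean the downloads overlap is to compare the total wall-clock time with the sum of the individual times. The sketch below is only an illustration under stated assumptions: timed_worker and measure are hypothetical helpers, and time.sleep with different delays stands in for downloads that finish at different times.

```python
import threading
import time


def timed_worker(delay, results, lock):
    # Hypothetical stand-in for a download: sleeps instead of using the network.
    start = time.time()
    time.sleep(delay)
    elapsed = time.time() - start
    # Guard the shared list so concurrent appends stay well defined.
    with lock:
        results.append(elapsed)


def measure(delays):
    # Returns (sorted per-thread times, total wall-clock time).
    results = []
    lock = threading.Lock()
    wall_start = time.time()
    threads = [
        threading.Thread(target=timed_worker, args=(delay, results, lock))
        for delay in delays
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.time() - wall_start
    return sorted(results), wall


if __name__ == "__main__":
    times, wall = measure([0.1, 0.2, 0.3, 0.4])
    print("per-thread times: %s" % ", ".join("%.2f" % x for x in times))
    print("sum of times: %.2f s, wall-clock: %.2f s" % (sum(times), wall))
```

If the threads really overlap, the wall-clock time should be close to the longest single time rather than to the sum, even when the individual times differ.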
