I'm currently testing something with threading and a workerpool: I create 400 threads and download 5000 URLs in total. The problem is that some of the 400 threads "freeze" — when I look at my processes, I see about 15 threads in every run that hang, and after a while they close one by one. The Python threads never finish.
My question is whether there is some "timer"/"counter" way to kill a thread if it has not finished after x seconds.
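For context on what "freezing" likely is: a read on a socket that never returns. Socket-level timeouts (which is what urllib2.urlopen's optional timeout argument, available since Python 2.6, uses under the hood) turn that hang into an exception instead. A minimal, self-contained sketch of the mechanism — it uses a local dummy server that never replies, no real download, and Python 3 syntax so it runs standalone:

```python
import socket

# A server socket that accepts connections but never sends any data,
# simulating a download that hangs forever.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

client = socket.create_connection(("127.0.0.1", port), timeout=1)
client.settimeout(1)  # any recv() now raises socket.timeout after 1 second
try:
    client.recv(1024)  # the server never sends anything...
    timed_out = False
except socket.timeout:
    timed_out = True   # ...so we end up here instead of freezing

print(timed_out)
```

With a timeout set this way, a stuck download raises an exception the worker can catch and move past, rather than blocking the thread indefinitely.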
# download2.py - Download many URLs using multiple threads.
import os
import urllib2
import workerpool
import datetime
from threading import Timer

class DownloadJob(workerpool.Job):
    "Job for downloading a given URL."
    def __init__(self, url):
        self.url = url  # The URL we'll need to download when the job runs
    def run(self):
        try:
            data = urllib2.urlopen(self.url).read()
        except Exception:
            pass  # Don't let a failed download kill the worker thread

# Initialize a pool, 400 threads in this case
pool = workerpool.WorkerPool(size=400)

# Loop over urls.txt and create a job to download the URL on each line
print datetime.datetime.now()
for url in open("urls.txt"):
    job = DownloadJob(url.strip())
    pool.put(job)

# Send shutdown jobs to all threads, and wait until all the jobs have been completed
pool.shutdown()
pool.wait()
print datetime.datetime.now()
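As far as I can tell, a thread in CPython cannot be forcibly killed; the closest pattern is to join the thread with a timeout and simply abandon it if it is still alive (marking it as a daemon so it does not block interpreter exit). A minimal sketch of that idea, using plain threading and a simulated slow job rather than workerpool (Python 3 syntax here; the code above is Python 2):

```python
import threading
import time

def slow_job():
    # Simulates a download that hangs much longer than we want to wait.
    time.sleep(5)

t = threading.Thread(target=slow_job)
t.daemon = True    # daemon threads don't keep the interpreter alive on exit
t.start()

t.join(timeout=1)  # wait at most 1 second for the thread to finish
if t.is_alive():
    # The thread cannot actually be killed; we can only stop waiting for it.
    print("job timed out; abandoning thread")
```

The thread itself keeps running in the background until its blocking call returns, which is why combining this with a timeout on the blocking call itself matters.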
Have you done any profiling to see whether 400 threads actually improve your performance? Threads aren't free; each one carries some overhead, and the ideal number of threads may be far fewer than that. –