2015-07-13

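Python: releasing threads

This is my first time using threads through the Thread class, and they do not seem to be released after the target function finishes. I am trying to run at most 5 threads at a time. Because each thread creates the next one, I expected some overlap, but I am seeing 2000+ threads alive at the same time, and then I get the exception "can't start new thread".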

from threading import Thread
import string

URLS = ['LONG LIST OF URLS HERE']

currentThread = 0
LASTTHREAD = len(URLS) - 1
MAXTHREADS = 5
threads = [None] * (LASTTHREAD + 1)

def getURL(threadName, currentThread):
    print('Thread Name = ' + threadName)
    print('URL = ' + str(URLS[currentThread]))
    if currentThread < LASTTHREAD:
        currentThread = currentThread + 1
        thisThread = currentThread
        try:
            # Start a thread for the next URL, then wait for it to finish.
            threads[thisThread] = Thread(target=getURL, args=('thread' + str(thisThread), currentThread))
            threads[thisThread].start()
            threads[thisThread].join()
        except Exception as e:
            print("Error: unable to start thread")
            print(str(e))

# Start the initial threads from the main thread.
for i in range(0, MAXTHREADS):
    currentThread = currentThread + 1
    try:
        threads[i] = Thread(target=getURL, args=('thread' + str(i), currentThread))
        threads[i].start()
        threads[i].join()
    except Exception as e:
        print("Error: unable to start thread")
        print(str(e))

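I am open to any other cleanup I could do here as well, since I am fairly new to Python and brand new to threading. For now I am only trying to get the threads set up correctly. Eventually this will scrape the URLs.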


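It is rather unusual to have your spawned threads spawn further threads themselves. I would suggest, at a minimum, refactoring so that your main thread does all the spawning. – eddiewould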
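One way to read the suggestion above, as a rough sketch rather than a definitive fix: let the main thread hand every URL to a fixed-size pool, so that no task ever starts a thread of its own. This assumes Python 3 and its `concurrent.futures` module. The names `get_url`, `run_all` and `MAX_THREADS`, and the example.com URLs, are made up for illustration, and `get_url` only echoes its input instead of downloading anything.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 5

# Placeholder URLs, invented for this sketch.
URLS = ['http://example.com/page%d' % i for i in range(20)]


def get_url(url):
    # Hypothetical stand-in for the real download: it just echoes the URL.
    return 'fetched ' + url


def run_all(urls, max_threads=MAX_THREADS):
    # The pool creates at most max_threads worker threads and reuses them,
    # so tasks never need to start (or join) threads themselves.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(get_url, urls))


results = run_all(URLS)
print(len(results))
```

Leaving the `with` block waits for all submitted tasks and shuts the worker threads down, so nothing is left running once `run_all` returns.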

Answers


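I would suggest looking at a thread pool, where the threads pull tasks from a suitable shared data structure (for example, a queue), rather than starting new threads all the time.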

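Depending on what you are actually trying to do, if you are using CPython (if you do not know what I mean by CPython, you are), you may not get any real performance improvement from using threads, because of the Global Interpreter Lock. So you may be better off looking at the multiprocessing module.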

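try:
    from queue import Queue  # Python 3
except ImportError:
    from Queue import Queue  # Python 2
from threading import Thread

def worker():
    # Each worker loops forever, taking one task at a time from the shared queue.
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

def do_work(url):
    print("Processing URL: " + url)

q = Queue()
for i in range(5):
    t = Thread(target=worker)
    t.daemon = True  # daemon threads do not keep the process alive at exit
    t.start()

for item in ['url_' + str(i) for i in range(2000)]:
    q.put(item)

q.join()  # block until all tasks are done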
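For comparison, here is a hedged variation on the same queue pattern that shuts the workers down explicitly instead of relying on daemon threads, and collects what each task returns. It assumes Python 3. The `STOP` sentinel, the `process` helper and the body of `do_work` are illustrative choices and not part of the answer above.

```python
import threading
from queue import Queue

NUM_WORKERS = 5
STOP = object()  # sentinel: tells a worker to leave its loop


def do_work(url):
    # Hypothetical stand-in for real scraping: returns the length of the URL.
    return len(url)


def worker(tasks, results, lock):
    while True:
        item = tasks.get()
        if item is STOP:
            tasks.task_done()
            break
        value = do_work(item)
        with lock:  # guard the shared results list
            results.append((item, value))
        tasks.task_done()


def process(urls, num_workers=NUM_WORKERS):
    tasks = Queue()
    results = []
    lock = threading.Lock()
    workers = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(num_workers)]
    for t in workers:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker, queued after the real work
    for t in workers:
        t.join()  # every worker has exited once this loop finishes
    return results


results = process(['url_' + str(i) for i in range(2000)])
print(len(results))
```

Only `NUM_WORKERS` threads are ever created here, however long the URL list is, and all of them have been joined by the time `process` returns.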

See the example in the documentation: https://docs.python.org/2/library/queue.html (at the bottom) – eddiewould