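Python multithreading synchronization

I'm running into a synchronization problem when using threads in CPython. I have two files; I parse them and return the desired results. However, the code below behaves strangely: it returns three results instead of two, and it does not return them in the order in which I put them into the queue. Here is the code: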
import Queue
import threading
from HtmlDoc import Document

OUT_LIST = []


class Threader(threading.Thread):
    """
    Start threading
    """
    def __init__(self, queue, out_queue):
        threading.Thread.__init__(self)
        self.queue = queue
        self.out_queue = out_queue

    def run(self):
        while True:
            if self.queue.qsize() == 0: break
            path, host = self.queue.get()
            f = open(path, "r")
            source = f.read()
            f.close()
            self.out_queue.put((source, host))
            self.queue.task_done()


class Processor(threading.Thread):
    """
    Process threading
    """
    def __init__(self, out_queue):
        self.out_queue = out_queue
        self.l_first = []
        self.f_append = self.l_first.append
        self.l_second = []
        self.s_append = self.l_second.append
        threading.Thread.__init__(self)

    def first(self, doc):
        # some code to retrieve the text desired, this works 100% I tested it manually
        pass

    def second(self, doc):
        # some code to retrieve the text desired, this works 100% I tested it manually
        pass

    def run(self):
        while True:
            if self.out_queue.qsize() == 0: break
            doc, host = self.out_queue.get()
            if host == "first":
                self.first(doc)
            elif host == "second":
                self.second(doc)
            OUT_LIST.extend(self.l_first + self.l_second)
            self.out_queue.task_done()


def main():
    queue = Queue.Queue()
    out_queue = Queue.Queue()
    queue.put(("...first.html", "first"))
    queue.put(("...second.html", "second"))
    qsize = queue.qsize()
    for i in range(qsize):
        t = Threader(queue, out_queue)
        t.setDaemon(True)
        t.start()
    for i in range(qsize):
        dt = Processor(out_queue)
        dt.setDaemon(True)
        dt.start()
    queue.join()
    out_queue.join()
    print '<br />'.join(OUT_LIST)


main()
Now, when I print, I want the content of "first" to be printed first, followed by the content of "second". Can anyone help me?
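Note: I am using threads because in practice I will have to connect to more than 10 places at once and retrieve their results. I believe threading is the most appropriate way to accomplish a task like this.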
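One way the ordering requirement could be met, as a rough sketch rather than a drop-in fix: tag each job with its position and have every worker write its result into that slot of a pre-sized list, so the output order no longer depends on which thread finishes first. The names `worker`, `run_in_order`, and `handler` are hypothetical and do not come from the code above. Workers stop on a `None` sentinel instead of polling `qsize()`, which can race when several threads check it at the same time.

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2, as used in the question


def worker(jobs, results, handler):
    """Hypothetical worker: process (index, item) jobs until a None sentinel."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: no more work for this thread
            jobs.task_done()
            break
        index, item = job
        # Each index is written by exactly one thread, so no lock is needed here.
        results[index] = handler(item)
        jobs.task_done()


def run_in_order(items, handler, num_threads=2):
    """Hypothetical helper: run handler over items in threads, keep input order."""
    jobs = queue.Queue()
    results = [None] * len(items)
    threads = [threading.Thread(target=worker, args=(jobs, results, handler))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for index, item in enumerate(items):
        jobs.put((index, item))
    for _ in threads:            # one sentinel per worker
        jobs.put(None)
    for t in threads:
        t.join()
    return results
```

With this shape, something like `run_in_order(["first", "second"], parse)` would return the parsed "first" result before the "second" one regardless of thread timing, where `parse` stands in for whatever per-document function is needed.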
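Because heavy use of threads tends to bring synchronization problems, and because a higher degree of parallelism may be possible when the script needs to run on multiple servers, I would prefer multiprocessing over multithreading. – varela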
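I/O-bound threads are rarely affected by the GIL and context switching. Threads consume fewer resources than processes, so in this case threading looks fine. – Martin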
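For the I/O-bound case discussed in these comments, a thread pool is one possible shape. The sketch below assumes Python 3's `concurrent.futures` (the question's code is Python 2, where this module is only available as a backport), and `fetch` and `fetch_all` are hypothetical stand-ins with a sleep simulating the wait on I/O. `ThreadPoolExecutor.map` yields results in input order, not completion order.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(job):
    """Hypothetical stand-in for an I/O-bound call such as reading a file or socket."""
    name, delay = job
    time.sleep(delay)   # simulated I/O wait; the GIL is released while sleeping
    return name


def fetch_all(jobs, max_workers=10):
    """Hypothetical helper: run fetch over jobs concurrently, keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the order of the inputs,
        # even if a later job finishes earlier.
        return list(pool.map(fetch, jobs))
```

Here `fetch_all([("first", 0.2), ("second", 0.0)])` gives back the "first" result ahead of "second" even though the second job completes sooner.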