How can I parallelize parsing in Python?

I have the following code, which converts a graph from an edge list to an adjacency matrix:

for line in open('graph.txt'): 
    converted = [sparse_to_dense.get(int(ID)) for ID in line.split()] 
    i = converted[0] 
    j = converted[1] 
    I.append(i) 
    J.append(j) 
n = max([max(I), max(J)]) + 1 
data = [1]*len(I) 
return coo_matrix((data, (I,J)), shape=(n,n), dtype='i1')

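This code is terribly slow: on my machine, converting 500K edges takes hours. On the other hand, I/O is clearly not the bottleneck (I can read the complete file into memory almost instantly), so I think there is room for parallelism. But I don't know how to proceed: should I read the file in parallel?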
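One way to check the claim that I/O is not the bottleneck is to time the read and the parse separately. The following is only a minimal sketch: the function name `time_read_and_parse` and its `mapping` parameter (standing in for `sparse_to_dense`) are hypothetical, and it assumes every line holds two integer IDs.

```python
import time


def time_read_and_parse(path, mapping):
    """Return (read_seconds, parse_seconds, edge_count) for an edge-list file.

    `mapping` plays the role of `sparse_to_dense` from the question.
    Assumes every line contains two integer IDs.
    """
    start = time.perf_counter()
    with open(path) as f:
        lines = f.readlines()  # I/O only: pull the whole file into memory
    read_seconds = time.perf_counter() - start

    start = time.perf_counter()
    I, J = [], []
    for line in lines:  # CPU only: split, convert to int, look up
        converted = [mapping.get(int(ID)) for ID in line.split()]
        I.append(converted[0])
        J.append(converted[1])
    parse_seconds = time.perf_counter() - start

    return read_seconds, parse_seconds, len(I)
```

If `parse_seconds` is far larger than `read_seconds`, reading the file in parallel is unlikely to help, and the per-line conversion is the part worth speeding up.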

Source

2014-09-12 Moonwalker

[Why not consider threads?](http://www.tutorialspoint.com/python/python_multithreading.htm) – heinst 2014-09-12 13:31:33

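@heinst I may be wrong, but threading solves problems where I/O is the bottleneck and the program spends most of its time waiting on I/O. In my case the program uses 100% of one CPU, and I/O is negligible here. – Moonwalker 2014-09-12 13:39:23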
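The point about threads versus CPU-bound work can be tried out with a small sketch. The helper names `parse_block` and `run_with` are hypothetical; the idea is to run the same CPU-bound parsing through either a thread pool or a process pool and time both.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def parse_block(lines):
    """CPU-bound work: turn 'a b' text lines into tuples of ints."""
    return [tuple(int(x) for x in line.split()) for line in lines]


def run_with(executor_cls, blocks, workers=4):
    """Parse each block of lines using the given executor class."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(parse_block, blocks))
```

Timing `run_with(ThreadPoolExecutor, blocks)` against `run_with(ProcessPoolExecutor, blocks)` on large blocks should show the difference: on standard CPython, the global interpreter lock lets only one thread execute Python bytecode at a time, so threads mostly help when the program is waiting on I/O. On platforms that spawn rather than fork worker processes, the `ProcessPoolExecutor` call needs to sit under an `if __name__ == "__main__":` guard.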

I assume you care about keeping I and J in the correct order? – gosom 2014-09-12 13:42:20

One way to do this is with multiprocessing, as shown below. I have not tested it, and it can be improved further.

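import multiprocessing

from scipy.sparse import coo_matrix

# Assumed to be defined at module level, as in the question:
#   sparse_to_dense: dict mapping original node IDs to dense indices
#   fname: path to the edge-list file


class Worker(multiprocessing.Process):

    def __init__(self, queue, results):
        multiprocessing.Process.__init__(self)
        self.q = queue
        self.results = results

    def run(self):
        while True:
            item = self.q.get()
            if item is None:
                # Sentinel: no more lines for this worker.
                break
            lineno, linecontents = item
            converted = [sparse_to_dense.get(int(ID)) for ID in linecontents.split()]
            i = converted[0]
            j = converted[1]
            self.results.put((i, j))
        # Tell main() that this worker has finished.
        self.results.put(None)


def main():
    q = multiprocessing.Queue()
    results = multiprocessing.Queue()

    workers = [Worker(q, results) for _ in range(4)]
    for w in workers:
        w.start()

    with open(fname) as f:
        for i, l in enumerate(f):
            q.put((i, l))
    # One sentinel per worker marks the end of the input.
    for _ in workers:
        q.put(None)

    # Collect results until every worker has reported that it is done.
    I, J = [], []
    finished = 0
    while finished < len(workers):
        item = results.get()
        if item is None:
            finished += 1
            continue
        i, j = item
        I.append(i)
        J.append(j)

    for w in workers:
        w.join()

    n = max([max(I), max(J)]) + 1
    data = [1]*len(I)
    coo = coo_matrix((data, (I,J)), shape=(n,n), dtype='i1')
    return coo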
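Sending one line per queue item means paying inter-process overhead for every edge. A possible variation, offered only as a sketch, hands each worker a large chunk of lines through `multiprocessing.Pool.map`. The names `parse_chunk`, `chunked` and `parse_edges`, the `mapping` parameter (standing in for `sparse_to_dense`) and the default chunk size are all assumptions. Unlike the question's `.get()`, this sketch indexes the mapping directly, so an unknown ID raises `KeyError` instead of silently producing `None`.

```python
import multiprocessing
from functools import partial
from itertools import islice


def parse_chunk(lines, mapping):
    """Convert a list of 'src dst' text lines into (i, j) index pairs."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: blank or malformed lines are skipped
        pairs.append((mapping[int(fields[0])], mapping[int(fields[1])]))
    return pairs


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def parse_edges(lines, mapping, processes=1, chunk_size=50000):
    """Return the row and column index lists for all edges in `lines`."""
    work = partial(parse_chunk, mapping=mapping)
    chunks = chunked(lines, chunk_size)
    if processes > 1:
        # Each task carries a whole chunk, so per-item overhead is amortized.
        with multiprocessing.Pool(processes) as pool:
            parsed = pool.map(work, chunks)
    else:
        parsed = [work(chunk) for chunk in chunks]
    I = [i for pairs in parsed for i, _ in pairs]
    J = [j for pairs in parsed for _, j in pairs]
    return I, J
```

A call such as `I, J = parse_edges(open('graph.txt'), sparse_to_dense, processes=4)` would then feed `coo_matrix` as before. `Pool.map` returns results in input order, so the edges keep their file order. The mapping is pickled along with every task, which is why the chunks are kept large. On platforms that spawn worker processes, the call should sit under an `if __name__ == "__main__":` guard.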

Source

2014-09-12 13:54:05 gosom

`results.get()` will never raise a `Queue.Empty` exception. It blocks when `results` is empty. You need to use `get_nowait()` instead. The indentation in `main` also seems to be off. – dano 2014-09-12 15:43:58
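To illustrate the difference between the blocking and non-blocking calls, here is a minimal sketch using the standard `queue.Queue` (the module is named `queue` in Python 3 and `Queue` in Python 2). `multiprocessing.Queue` offers the same `get`/`get_nowait` methods and raises the same `Empty` exception. The helper name `drain` is hypothetical.

```python
import queue


def drain(q):
    """Collect everything currently in q without blocking."""
    items = []
    while True:
        try:
            # get_nowait() is equivalent to get(block=False); a plain get()
            # would wait here until another item arrives.
            items.append(q.get_nowait())
        except queue.Empty:
            return items
```

Note that `Empty` only means nothing is available at that moment. With worker processes still running, more results may arrive later, so a non-blocking drain on its own cannot tell whether the workers have finished.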
