Popen in the main thread vs. in a Process subclass

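The following code (run in the main thread) works fine. I grep some files, the search runs until the 100th result is found (results are written to a file), and then it exits: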

command = 'grep -F "%s" %s*.txt' % (search_string, DATA_PATH)

p = Popen(['/bin/bash', '-c', command], stdout = PIPE)
f = open(output_file, 'w+')
num_lines = MAX_RESULTS
while True:
    line = p.stdout.readline()
    print num_lines
    if line != '':
        f.write(line)
        num_lines = num_lines - 1
        if num_lines == 0:
            break
    else:
        break

The very same code, moved into a subclass of Process, always prints grep: writing output: Broken pipe to the console:

import re
from multiprocessing import Process
from subprocess import Popen, PIPE

class Search(Process):
    def __init__(self, search_id, search_string):
        self.search_id = search_id
        self.search_string = search_string
        self.grepped = ''
        Process.__init__(self)

    def run(self):
        output_file = TMP_PATH + self.search_id

        # use -F (fixed string) if there are no regex chars, otherwise -P
        flag = '-F' if re.match(r"^[a-zA-Z0\ ]*$", self.search_string) else '-P'

        command = 'grep %s "%s" %s*.txt' % (flag, self.search_string, DATA_PATH)

        p = Popen(['/bin/bash', '-c', command], stdout = PIPE)
        f = open(output_file, 'w+')
        num_lines = MAX_RESULTS
        while True:
            line = p.stdout.readline()
            print num_lines
            if line != '':
                f.write(line)
                num_lines = num_lines - 1
                if num_lines == 0:
                    break
            else:
                break

How come? How can I fix this?

Source

2012-01-28 pistacchio

Why are you using grep when Python itself has a very solid [regular expression](http://docs.python.org/library/re.html) solution? – orlp 2012-01-28 10:34:24

Because I have to search through 1.5+ GB of data, and Python cannot match grep's speed. – pistacchio 2012-01-28 10:46:17

This looks like the same problem as here: http://stackoverflow.com/questions/2595602/pythons-popen-cleanup. When I had trouble capturing a command's output, I increased the buffer size: p = Popen(['/bin/bash', '-c', command], stdout=PIPE, bufsize=256*1024*1024) – hughdbrown 2012-01-28 14:34:59

I can reproduce the error message like this:

import multiprocessing as mp
import subprocess
import shlex

def worker():
    proc = subprocess.Popen(shlex.split('''
        /bin/bash -c "grep -P 'foo' /tmp/test.txt"
        '''), stdout = subprocess.PIPE)
    line = proc.stdout.readline()
    print(line)
    # proc.terminate() # This fixes the problem

if __name__=='__main__':
    N = 6000
    with open('/tmp/test.txt', 'w') as f:
        f.write('bar foo\n'*N) # <--- Increasing this number causes grep: writing output: Broken pipe
    p = mp.Process(target = worker)
    p.start()
    p.join()

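If the code above does not produce the error for you, make /tmp/test.txt larger by increasing N. (Conversely, you can hide the fact that the code has a bug by decreasing N.)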

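If the worker process ends before the grep subprocess does, grep gets a SIGPIPE telling it that its stdout has been closed. grep responds by printing

grep: writing output: Broken pipe

to stderr, once for each line it is still processing.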

The fix is to terminate the grep process with proc.terminate() before the worker ends.
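As a rough sketch of that fix applied to a "read the first N lines, then stop" loop like the one in the question: the helper name read_first_lines is made up for illustration, it is written for Python 3, and the usage part swaps grep for a small Python child that prints many lines so the example does not depend on any data files. It is one possible way to do it, not the only one.

```python
import subprocess
import sys


def read_first_lines(cmd, max_lines):
    """Return at most max_lines lines of cmd's stdout, then stop the child.

    read_first_lines is a hypothetical helper, not part of any library.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    lines = []
    try:
        # Iterating over the pipe yields one line at a time and ends at EOF.
        for line in proc.stdout:
            lines.append(line)
            if len(lines) >= max_lines:
                break
    finally:
        # Stop the child *before* dropping our end of the pipe, so it does
        # not keep writing into a pipe nobody is reading.
        if proc.poll() is None:
            proc.terminate()
        proc.stdout.close()
        proc.wait()
    return lines


if __name__ == '__main__':
    # Stand-in for grep (an assumption for this sketch): a child process
    # that prints far more lines than we want to keep.
    noisy = [sys.executable, '-c',
             'for i in range(10**6): print("bar foo", i)']
    first = read_first_lines(noisy, 100)
    print(len(first))
```

In the Search class from the question, the same idea would mean calling p.terminate() (and then p.wait()) once the loop is done, before run() returns.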

Source

2012-01-28 11:00:43 unutbu

Why would my process end before grep if I have an infinite loop that reads from grep itself? – pistacchio 2012-01-28 11:03:30

Your while loop is broken out of after one iteration. – unutbu 2012-01-28 11:04:43
