Filtering out certain lines from a text file in Python


I have a very large text file from which I want to extract some lines. Each block starts with an identifier line, followed by a varying number of lines of values.

For example:

fixedStep ch=GL000219.1 start=52818 step=1 
1.000000 
1.000000 
1.000000 
1.000000 
1.000000 
1.000000 
1.000000 
fixedStep ch=GL000219.1 start=52959 step=1 
1.000000 
1.000000 
1.000000 
fixedStep ch=M start=52959 step=1 
1.000000 
1.000000

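This line is an identifier: fixedStep ch=GL000219.1 start=52818 step=1. I want to filter out every identifier line containing "ch=GL000219.1" together with the (numeric) lines below it, and keep the other identifiers and their corresponding (numeric) lines. The output should look like this: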

fixedStep ch=M start=52959 step=1 
1.000000 
1.000000

Do you know how to do this in Python?

Source

2017-05-31 ARM

What have you tried so far? – TheDarkKnight

You can read the file into a list and loop over it, checking whether each line meets the conditions. For example:

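# Read the whole file into memory, then copy every block whose header
# does not contain ch=GL000219.1 (range replaces the Python 2 xrange).
with open('test.txt', 'r') as f, open('test2.txt', 'w') as w:
    data = f.read().splitlines()
    for i in range(len(data)):
        # A block starts at a 'fixedStep' header line.
        if data[i].startswith('fixedStep') and 'ch=GL000219.1' not in data[i]:
            w.write(data[i] + '\n')
            # Copy the value lines that follow, up to the next header.
            for t in range(i + 1, len(data)):
                if data[t].startswith('fixedStep'):
                    break
                w.write(data[t] + '\n')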

Output:

fixedStep ch=M start=52959 step=1 
1.000000 
1.000000
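
One caveat with the substring test 'ch=GL000219.1' not in data[i]: it would also match a header such as ch=GL000219.10. Below is a minimal sketch of a stricter check, assuming header fields are whitespace-separated key=value pairs as in the sample. The helper names parse_header and is_excluded are hypothetical and not part of the answer above.

```python
def parse_header(line):
    """Split a 'fixedStep key=value ...' header into a dict of its fields."""
    fields = {}
    for token in line.split()[1:]:  # skip the leading 'fixedStep'
        key, sep, value = token.partition('=')
        if sep:  # ignore tokens that have no '='
            fields[key] = value
    return fields


def is_excluded(line, ch_name='GL000219.1'):
    """Return True only if the header's ch field equals ch_name exactly."""
    return parse_header(line).get('ch') == ch_name
```

In the loop above, the condition could then become data[i].startswith('fixedStep') and not is_excluded(data[i]).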

Source

2017-05-31 09:14:59

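Thanks, but it only prints the result; it doesn't create a file. – ARM

@ARM Just use write instead of print. I've edited the answer, so you can check it now. –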

If it's a large file, it may be more efficient to process it line by line without reading it all into memory:

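with open('file.txt', 'r') as f, open('outfile.txt', 'w') as outfile:
    # Must be set before the loop so it exists for lines before the first header.
    good_data = True
    for line in f:
        # Each header decides whether the block that follows is kept.
        if line.startswith('fixedStep'):
            good_data = 'ch=GL000219.1' not in line
        if good_data:
            outfile.write(line)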
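
The same streaming idea can be sketched as a generator, which makes it easy to try on in-memory data before running it on the real file. The name filter_blocks and its unwanted parameter are assumptions for illustration. The flag is initialised before the loop because otherwise a file whose first line is not a header would raise a NameError.

```python
import io


def filter_blocks(lines, unwanted='ch=GL000219.1'):
    """Yield lines, skipping each block whose header contains `unwanted`."""
    keep = True  # defined up front so it exists before the first header is seen
    for line in lines:
        if line.startswith('fixedStep'):
            # A header switches keeping on or off for the block that follows.
            keep = unwanted not in line
        if keep:
            yield line


# Small in-memory sample standing in for the real file.
sample = io.StringIO(
    "fixedStep ch=GL000219.1 start=52818 step=1\n"
    "1.000000\n"
    "fixedStep ch=M start=52959 step=1\n"
    "1.000000\n"
    "1.000000\n"
)
result = list(filter_blocks(sample))
```

With real files, the output can be written with outfile.writelines(filter_blocks(f)), which still reads only one line at a time.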

Source

2017-05-31 09:27:11

I'd like to write it to a new file as well. – ARM

Corrected as you wished :-) –

It gives this error: NameError: name 'good_data' is not defined – ARM
