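Extracting specific lines from a text file in Python
I am writing code that reads a large text file line by line and finds the line starting with UNIQUE-ID (there are many of these in the file) that comes just before certain lines (in this case, lines starting with 'REACTION-LAYOUT - ' where the 5th element of the string is OLEANDOMYCIN). The code is as follows: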
data2 = open('pathways.dat', 'r', errors = 'ignore')
pathways = data2.readlines()
PWY_ID = []
line_cont = []
L_PRMR = [] #Left primary
car = []
#i is the line number (first element of enumerate),
#while line is the line content (2nd elem of enumerate)
for i,line in enumerate(pathways):
    if 'UNIQUE-ID' in line:
        line_cont = line
        PWY_ID_line = line_cont.rstrip()
        PWY_ID_line = PWY_ID_line.split(' ')
        PWY_ID.append(PWY_ID_line[2])
    elif 'REACTION-LAYOUT -' in line:
        L_PWY = line.rstrip()
        L_PWY = L_PWY.split(' ')
        L_PRMR.append(L_PWY[4])
    elif 'OLEANDOMYCIN' in line:
        car.append(PWY_ID)
print(car)
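However, the output is just everything contained in PWY_ID (the output of the first if statement), as if all the remaining lines of the code were ignored. Can anyone help?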
Edit
Below is a sample of my data (my text file has 1000-ish similar "pages" like this):
//
UNIQUE-ID - PWY-741
.
.
.
.
PREDECESSORS - (RXN-663 RXN-662)
REACTION-LAYOUT - (RXN-663 (:LEFT-PRIMARIES CPD-1003) (:DIRECTION :L2R) (:RIGHT-PRIMARIES CPD-1004))
REACTION-LAYOUT - (RXN-662 (:LEFT-PRIMARIES CPD-1002) (:DIRECTION :L2R) (:RIGHT-PRIMARIES CPD-1003))
REACTION-LAYOUT - (RXN-661 (:LEFT-PRIMARIES CPD-1001) (:DIRECTION :L2R) (:RIGHT-PRIMARIES CPD-1002))
REACTION-LIST - RXN-663
REACTION-LIST - RXN-662
REACTION-LIST - RXN-661
SPECIES - TAX-351746
SPECIES - TAX-644631
SPECIES - ORG-6335
SUPER-PATHWAYS - PWY-5266
TAXONOMIC-RANGE - TAX-1224
//
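Can you post a few lines from your text file? – anupsabraham
Can you give some example data? – Matt
Not sure I understand the question; are you looking for a specific line where all three conditions are true? So a line that has 'UNIQUE-ID', 'REACTION-LAYOUT - ' and 'OLEANDOMYCIN'? –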
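Two things in the posted code look like they could explain the output. A REACTION-LAYOUT line that mentions OLEANDOMYCIN already matches the second branch, so the third `elif` never runs for it. And `car.append(PWY_ID)` appends the whole list of IDs rather than the ID of the current record. Below is a minimal sketch of one way around both, assuming the goal is to collect the UNIQUE-ID of every record that has a REACTION-LAYOUT line whose 5th space-separated element is the compound. The function name `pathways_with_compound` and the second sample record (`PWY-HYPOTHETICAL`, `RXN-X`, `CPD-X`) are made up for illustration and are not taken from the real file.

```python
def pathways_with_compound(lines, compound="OLEANDOMYCIN"):
    """Return the UNIQUE-IDs of records that have a REACTION-LAYOUT line
    whose 5th space-separated element is `compound`."""
    matches = []
    current_id = None  # UNIQUE-ID of the record currently being read
    for line in lines:
        line = line.rstrip()
        if line.startswith("UNIQUE-ID - "):
            # Remember the ID so later lines of the same record can use it.
            current_id = line.split(" - ", 1)[1]
        elif line.startswith("REACTION-LAYOUT - "):
            tokens = line.split(" ")
            # tokens[4] looks like 'CPD-1003)'; strip the parentheses first.
            if len(tokens) > 4 and tokens[4].strip("()") == compound:
                if current_id is not None and current_id not in matches:
                    matches.append(current_id)
    return matches


# Small in-memory sample; the second record is hypothetical.
sample = [
    "//",
    "UNIQUE-ID - PWY-741",
    "REACTION-LAYOUT - (RXN-663 (:LEFT-PRIMARIES CPD-1003) (:DIRECTION :L2R) (:RIGHT-PRIMARIES CPD-1004))",
    "//",
    "UNIQUE-ID - PWY-HYPOTHETICAL",
    "REACTION-LAYOUT - (RXN-X (:LEFT-PRIMARIES OLEANDOMYCIN) (:DIRECTION :L2R) (:RIGHT-PRIMARIES CPD-X))",
    "//",
]
print(pathways_with_compound(sample))
```

To run it against the real file, the open file object can be passed in directly, since the function only iterates over lines, e.g. inside `with open('pathways.dat', 'r', errors='ignore') as fh:` call `pathways_with_compound(fh)`. This also avoids loading the whole file with `readlines()`.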