2011-12-02 22 views
1

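How do I extract a specific set of values from a file in Python?

I'm stuck on the logic here. I have to extract some values from a text file that looks like this: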

AAA 
+-------------+------------------+ 
|   ID |   count | 
+-------------+------------------+ 
|   3 |    1445 | 
|   4 |    105 | 
|   9 |    160 | 
|   10 |    30 | 
+-------------+------------------+ 
BBB 
+-------------+------------------+ 
|   ID |   count | 
+-------------+------------------+ 
|   3 |    1445 | 
|   4 |    105 | 
|   9 |    160 | 
|   10 |    30 | 
+-------------+------------------+ 
CCC 
+-------------+------------------+ 
|   ID |   count | 
+-------------+------------------+ 
|   3 |    1445 | 
|   4 |    105 | 
|   9 |    160 | 
|   10 |    30 | 
+-------------+------------------+ 

I can't work out how to extract only the values under BBB and append them to a list, like this:

import sys

f = open(sys.argv[1], "r")
text = f.readlines()
B_Values = []
for i in text:
    if i.startswith("BBB"):  # (example)
        B_Values.append("only values of BBB")  # placeholder for the part I can't work out
    if i.startswith("CCC"):
        break

print(B_Values)

This should result in:

['|   3 |    1445 |','|   4 |    105 |','|   9 |    160 |','|   10 |    30 |'] 

Is this homework?

Answers

3
import sys

d = {}
with open(sys.argv[1]) as f:
    for line in f:
        if line[0].isalpha():  # is the first character in the line a letter?
            curr = d.setdefault(line.strip(), [])
        elif any(ch.isdigit() for ch in line):  # is there any digit in the line?
            curr.append(line.strip())

For this file, d is now:

{'AAA': ['|   3 |    1445 |', 
     '|   4 |    105 |', 
     '|   9 |    160 |', 
     '|   10 |    30 |'], 
'BBB': ['|   3 |    1445 |', 
     '|   4 |    105 |', 
     '|   9 |    160 |', 
     '|   10 |    30 |'], 
'CCC': ['|   3 |    1445 |', 
     '|   4 |    105 |', 
     '|   9 |    160 |', 
     '|   10 |    30 |']} 

Your B_Values is then d['BBB'].
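The entries in d['BBB'] are still raw table rows. If the numbers themselves are needed, each row can be split on the `|` separators. The following is a minimal sketch, assuming every data row has exactly two integer columns as in the sample file; `parse_row` is a hypothetical helper name, not part of the answer above.

```python
def parse_row(row):
    """Split a table row such as '|   3 |    1445 |' into (ID, count) integers."""
    # Splitting on '|' leaves empty strings at both ends; drop them.
    cells = [cell.strip() for cell in row.split("|") if cell.strip()]
    return int(cells[0]), int(cells[1])


# Sample rows copied from the 'BBB' entry of the dictionary.
rows = ['|   3 |    1445 |', '|   4 |    105 |', '|   9 |    160 |', '|   10 |    30 |']
pairs = [parse_row(row) for row in rows]
print(pairs)  # [(3, 1445), (4, 105), (9, 160), (10, 30)]
```

A row with non-numeric cells (the `ID | count` header, for instance) would raise ValueError here, so this is meant to run only on rows that have already been filtered.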

0

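You can use a state flag, bstarted, to track when the B group has begun. After scanning the B group, delete the three header lines and the one footer line.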

# `text` is the list of lines from the question: text = f.readlines()
B_Values = []
bstarted = False
for i in text:
    if i.startswith("BBB"):
        bstarted = True
    elif i.startswith("CCC"):
        bstarted = False
        break
    elif bstarted:
        B_Values.append(i.strip())  # strip the trailing newline and spaces

del B_Values[:3]  # get rid of the header
del B_Values[-1]  # get rid of the footer
print(B_Values)
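The same flag idea can be wrapped in a function that takes the section name as a parameter and stops at whatever header comes next, so "BBB" and "CCC" are not hard-coded. This is only a sketch: `section_rows` and the shortened sample table are made up for illustration, and data rows are recognised by containing a digit rather than by their position, which is an assumption about the file format.

```python
def section_rows(lines, name):
    """Return the stripped data rows of the section headed by `name`."""
    rows = []
    started = False
    for line in lines:
        stripped = line.strip()
        if stripped == name:
            started = True  # the wanted section begins here
        elif started and stripped[:1].isalpha():
            break  # next section header: stop scanning
        elif started and any(ch.isdigit() for ch in stripped):
            rows.append(stripped)  # borders and the header row contain no digits
    return rows


# Shortened, made-up sample in the same layout as the file in the question.
sample = """AAA
+----+-------+
| ID | count |
+----+-------+
|  1 |    11 |
+----+-------+
BBB
+----+-------+
| ID | count |
+----+-------+
|  3 |  1445 |
|  4 |   105 |
+----+-------+
CCC
+----+-------+
| ID | count |
+----+-------+
|  9 |   160 |
+----+-------+
""".splitlines()

print(section_rows(sample, "BBB"))  # ['|  3 |  1445 |', '|  4 |   105 |']
```

Filtering on digits avoids the two `del` statements, at the cost of breaking if a border or header line ever contains a number.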
0

You should avoid iterating over lines that have already been read. Call readline whenever you want to read the next line and check what it is:

import sys

f = open(sys.argv[1], "r")
B_Values = []
i = f.readline()
while i != "":  # readline() returns "" at the end of the file
    if i.startswith("BBB"):  # (example)
        for temp in range(3):
            f.readline()  # skip the 3 lines of table headers
        i = f.readline()
        # until we reach the table footer (or the end of the file)
        while i != "" and not i.startswith("+-"):
            B_Values.append(i.strip())
            i = f.readline()
        break
    i = f.readline()

# Although not necessary, you'd better close the file, too.
f.close()

print(B_Values)

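Edit: @eumiro's approach is more flexible than mine, since it reads all the values from every section. Although you could add an isalpha test to my example to read all the values as well, his approach is still easier to read.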
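For comparison, here is roughly what the readline loop might look like with an isalpha test added so that it collects every section. It is a sketch under assumptions: `read_sections` is a hypothetical name, section headers are taken to be any line starting with a letter, and an in-memory `io.StringIO` object stands in for the real file.

```python
import io


def read_sections(f):
    """Read every section from an open file object, one readline() call at a time."""
    sections = {}
    current = None
    line = f.readline()
    while line != "":  # readline() returns "" only at end of file
        stripped = line.strip()
        if stripped[:1].isalpha():  # a section name such as "BBB"
            current = sections.setdefault(stripped, [])
        elif current is not None and any(ch.isdigit() for ch in stripped):
            current.append(stripped)  # a data row
        line = f.readline()
    return sections


# Made-up, shortened sample standing in for the real file.
sample = io.StringIO(
    "AAA\n+----+-------+\n| ID | count |\n+----+-------+\n|  3 |  1445 |\n+----+-------+\n"
    "BBB\n+----+-------+\n| ID | count |\n+----+-------+\n|  4 |   105 |\n|  9 |   160 |\n+----+-------+\n"
)
result = read_sections(sample)
print(result["BBB"])  # ['|  4 |   105 |', '|  9 |   160 |']
```

The loop ends up doing the same work as iterating over the file directly, which supports the point that the for-loop version is the easier one to read.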
