2012-11-11 34 views
1

Find a line in a file, then read the next few lines in Python

I have a plain text file with the following data:

id=1 
name=Scott 
occupation=Truck driver 
age=23 

id=2 
name=Dave 
occupation=Waiter 
age=16 

id=3 
name=Susan 
occupation=Computer programmer 
age=29 

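I'm trying to figure out the best way to get to any point in this file given an id string, and then grab the lines below it to extract the data for use in my program. I can do something like this: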

def get_person_by_id(id):
    file = open('rooms', 'r')
    for line in file:
        if ("id=" + id) in line:
            print(id + " found")

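But I'm not sure how I can then go through the next few lines and do line.split("=") or something similar to extract the information (into a list, a dict, or whatever) so that I can use it in my program. Any pointers?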

+0

Is all of the data available for every id, or do some records have less information than others?

+0

A lot depends on what you know about the format. Is every entry always 4 lines? Are there any other keys? Basically, you can call `file.readline()` multiple times.
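
As a rough sketch of the `readline()` idea, assuming every record really is an `id=` line followed by exactly three more `key=value` lines. The function name `get_person_fixed` and the `fields_after_id` parameter are made up for illustration, and the sample text stands in for the real file:

```python
import io

def get_person_fixed(f, person_id, fields_after_id=3):
    """Scan f for the line 'id=<person_id>' and read the lines that follow.

    Assumes each record has exactly fields_after_id lines after its id line.
    """
    while True:
        line = f.readline()
        if not line:              # end of file: the id was not found
            return None
        if line.strip() == "id=" + person_id:
            person = {}
            for _ in range(fields_after_id):
                key, _sep, value = f.readline().strip().partition("=")
                person[key] = value
            return person

# Stand-in for the file from the question.
sample = io.StringIO(
    "id=1\nname=Scott\noccupation=Truck driver\nage=23\n\n"
    "id=2\nname=Dave\noccupation=Waiter\nage=16\n"
)
print(get_person_fixed(sample, "2"))
```

If records can have a varying number of lines, this fixed count no longer holds and reading until the next blank line is probably the safer choice.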

+1

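Can you (and are you allowed to) change the file format? If so, you could use the csv module. See: http://docs.python.org/2/library/csv.html. Maybe you can make the csv module work for this case as well.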
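
If the format can be changed, a sketch with `csv.DictReader` might look like the following. The CSV layout (a header row of `id,name,occupation,age`) and the helper name `load_people` are assumptions for illustration, not the asker's actual file:

```python
import csv
import io

def load_people(f):
    """Return {id: row} from a CSV file object that starts with a header row."""
    return {row["id"]: row for row in csv.DictReader(f)}

# Hypothetical CSV version of the data from the question.
sample = io.StringIO(
    "id,name,occupation,age\n"
    "1,Scott,Truck driver,23\n"
    "2,Dave,Waiter,16\n"
)
people = load_people(sample)
print(people["2"]["occupation"])
```

With a real file on disk, the csv documentation recommends opening it with `newline=''` before passing it to the reader.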

Answers

2

One option is to load the whole thing into memory, which saves you from reading the file on every lookup:

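with open('rooms') as f:
    # Records are separated by blank lines; strip() drops the trailing newline
    chunks = f.read().strip().split('\n\n')

people_by_id = {}

for chunk in chunks:
    # Each row is "key=value"; split only on the first '='
    data = dict(row.strip().split('=', 1) for row in chunk.split('\n'))
    people_by_id[data.pop('id')] = data

def get_person_by_id(id):
    return people_by_id.get(id)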
+0

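If the file is very large, it is better not to read the whole file into memory and instead to stop processing the file at a specific line. Some of the other answers provide such a solution. – btel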
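
A minimal sketch of that streaming approach, assuming records are separated by blank lines as in the question. The name `find_person` is illustrative, and the sample text stands in for the real file. It reads line by line and stops as soon as the matching record ends:

```python
import io

def find_person(f, person_id):
    """Return the fields of the matching record, reading no further than needed."""
    person = None
    for line in f:
        line = line.strip()
        if person is None:
            if line == "id=" + person_id:
                person = {}       # start collecting after the id line
        elif not line:
            break                 # a blank line ends the record
        else:
            key, _sep, value = line.partition("=")
            person[key] = value
    return person                 # None if the id never appeared

# Stand-in for the file from the question.
sample = io.StringIO(
    "id=1\nname=Scott\n\n"
    "id=2\nname=Dave\nage=16\n\n"
    "id=3\nname=Susan\n"
)
print(find_person(sample, "2"))
```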

0

How about breaking out of the for loop once you have found the right line:

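def get_person_by_id(id):
    with open('rooms', 'r') as file:
        for line in file:
            # compare the whole line so that id=1 does not also match id=10
            if line.strip() == "id=" + id:
                print(id + " found")
                break
        # now you can continue processing your file;
        # next() picks up where the loop stopped
        next_line = next(file, '')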
0

Maybe:

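d = dict()
current = None  # the record currently being filled in

with open(filename) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # blank line between records
        k, v = line.split('=', 1)
        if k == 'id':
            # an id line starts a new record
            current = d[v] = {}
        current[k] = v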
0

Collect all of the person's attributes and values (id, name, occupation, age, etc.) until you find an empty line.

def get_person_by_id(id):
    person = {}
    found = False
    with open('rooms', 'r') as file:
        for line in file:
            line = line.strip()
            if found:
                if line:
                    attr, value = line.split("=", 1)
                    person[attr] = value
                else:
                    # a blank line ends the record
                    return person
            elif line == "id=" + id:
                print(id + " found")
                found = True
                attr, value = line.split("=", 1)
                person[attr] = value
    return person
0

Here is an iterative solution.

objects = []
current_object = None
with open("info.txt", "r") as f:
    for line in f:
        line = line.strip()
        if not line:
            # a blank line ends the current record
            current_object = None
            continue
        if current_object is None:
            current_object = {}
            objects.append(current_object)
        key, _, value = line.partition('=')
        current_object[key] = value

print(objects)
0

Another example of an iterative parser:

from itertools import takewhile

def entries(f):
    e = {}
    def read_one():
        # consume lines up to and including the next line without '='
        one = {}
        for line in takewhile(lambda x: '=' in x, f):
            key, val = line.strip().split('=', 1)
            one[key] = val
        return one
    while True:
        one = read_one()
        if not one:
            break
        else:
            e[one.pop('id')] = one
    return e

Example:

>>> with open('data.txt') as f:
...     print(entries(f)['2'])
{'name': 'Dave', 'occupation': 'Waiter', 'age': '16'}
0

This solution is a bit more tolerant of blank lines inside a record.

def read_persons(it):
    person = dict()
    for l in it:
        try:
            k, v = l.strip('\n').split('=', 1)
        except ValueError:
            pass
        else:
            if k == 'id': # New record
                if person:
                    yield person
                    person = dict()
            person[k] = v
    if person:
        yield person
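
One possible way to use a generator like this for a single lookup, assuming the `key=value` format from the question. The generator is repeated here in a lightly condensed form so that the snippet runs on its own, and `first_with_id` is a made-up helper name:

```python
import io

def read_persons(lines):
    """Yield one dict per record; a new record starts at each 'id=' line."""
    person = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue              # skip lines without '=' (for example blanks)
        if key == "id" and person:
            yield person
            person = {}
        person[key] = value
    if person:
        yield person

def first_with_id(lines, person_id):
    """Return the first record with the given id, or None if there is none."""
    matches = (p for p in read_persons(lines) if p.get("id") == person_id)
    return next(matches, None)

# Stand-in data with a blank line in the middle of the first record.
sample = io.StringIO("id=1\nname=Scott\n\nage=23\n\nid=2\nname=Dave\n")
print(first_with_id(sample, "1"))
```

Because the generator is lazy, the lookup stops reading shortly after the matching record instead of parsing the whole file.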