Parsing parts of a string

What technique or module would you use to parse specific parts of a string? Given lines of this type:

field 1: dog        field 2: first        comment: outstanding
field 1: cat        field 2:              comment: some comment about the cat

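Field names always end with a colon, field values can be empty, and fields are separated only by spaces. I only want to access the field values. I know how I would do this with a regular expression, but I'm sure there are more elegant ways to do it in Python.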

Source

2013-05-26 Chris Seymour

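Is this tab-separated? – jamylak

@jamylak No, just spaces.

It looks like a regular expression might be the way to go here. How do you know when another field starts? Is there always more than one space to indicate it? – jamylak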
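
The regular-expression route raised in the comments can be sketched roughly as follows. This is only an illustration, not the approach of any answer here: it assumes fields are always separated by at least two whitespace characters and that values never contain such a run. The sample lines are modeled on the question, and `parse_line`, `LINES`, `records`, and `values` are hypothetical names.

```python
import re

# Sample lines modeled on the question; the exact spacing is an assumption.
LINES = [
    "field 1: dog        field 2: first        comment: outstanding",
    "field 1: cat        field 2:              comment: some comment about the cat",
]


def parse_line(line):
    """Return {field name: value} for one line (hypothetical helper)."""
    record = {}
    # Assumption: two or more whitespace characters separate the fields.
    for chunk in re.split(r"\s{2,}", line.strip()):
        # Split at the first colon only, so a value may itself contain colons.
        name, _, value = chunk.partition(":")
        record[name.strip()] = value.strip()
    return record


records = [parse_line(line) for line in LINES]
# Only the values, in the order the fields appear on each line.
values = [list(record.values()) for record in records]
print(values)
```

Empty values come back as empty strings, so every line yields the same number of entries.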

This looks like a fixed-width format to me. If so, you could do this:

data = {}
ss = ((0, 19), (20, 41), (42, 80))  # (start, end) offsets of each column
with open('/tmp/p.txt', 'r') as f:
    for n, line in enumerate(f):
        fields = {}
        for i, j in ss:
            field = line[i:j]
            t = field.split(':', 1)  # split at the first colon only
            fields[t[0].strip()] = t[1].strip()
        data[n] = fields

print(data)

Prints:

{0: {'field 1': 'dog', 'field 2': 'first', 'comment': 'outstanding'}, 1: {'field 1': 'cat', 'field 2': '', 'comment': 'some comment about the cat'}}

If you want to have a list instead:

data = []
ss = ((0, 19), (20, 41), (42, 80))  # (start, end) offsets of each column
with open('/tmp/p.txt', 'r') as f:
    for line in f:
        fields = {}
        for i, j in ss:
            field = line[i:j]
            t = field.split(':', 1)  # split at the first colon only
            fields[t[0].strip()] = t[1].strip()
        data.append(fields)

In either case, you can access a value like this:

>>> data[0]['comment'] 
'outstanding'
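
Since the question only asks for the field values, the same slicing idea can be reduced to a small helper. This is a minimal sketch, assuming every line follows the column offsets used in this answer; `field_values`, `SPANS`, and `sample` are hypothetical names introduced for illustration.

```python
# Column spans copied from the answer; they only fit lines laid out
# at exactly these offsets.
SPANS = ((0, 19), (20, 41), (42, 80))


def field_values(line, spans=SPANS):
    """Return just the values from one fixed-width line (hypothetical helper)."""
    values = []
    for start, end in spans:
        # Keep what follows the first colon in the column;
        # a column without a colon yields an empty string.
        _, _, value = line[start:end].partition(":")
        values.append(value.strip())
    return values


# Build a sample line padded to the assumed column starts (0, 20, 42).
sample = ("field 1: cat".ljust(20)
          + "field 2:".ljust(22)
          + "comment: some comment about the cat")
print(field_values(sample))  # ['cat', '', 'some comment about the cat']
```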

Source

2013-05-26 20:40:38 dawg

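I think a 'list' is more appropriate here, since using '0', '1' as keys doesn't improve anything. Accessing items by index is also an O(1) operation on lists, and lists preserve order as well.

That is easy to swap if it matters. I don't think it is the main point of the question. – dawg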

Something like this:

>>> with open("abc") as f:
...     lis = []
...     for line in f:
...         # fields are separated by 8 spaces; split each at its first colon
...         lis.append(dict(map(str.strip, x.split(":", 1)) for x in line.split(" " * 8)))
...
>>> lis
[{'field 1': 'dog', 'field 2': 'first', 'comment': 'outstanding'},
 {'field 1': 'cat', 'field 2': '', 'comment': 'some comment about the cat'}]

>>> lis[0]['comment']  # access the 'comment' field on line 1
'outstanding'
>>> lis[1]['field 2']  # access 'field 2' on line 2
''

Source

2013-05-26 19:09:50

Another option is to use the csv module.

Assuming there are tabs between the fields:

import csv
import io

# \t marks the tab that separates the fields
input_data = io.StringIO("""field 1: dog\tfield 2: first\tcomment: outstanding
field 1: cat\tfield 2:\tcomment: some comment about the cat""")

data = []
for row in csv.reader(input_data, delimiter="\t"):
    line = {}
    for item in row:
        value = item.split(":", 1)  # split at the first colon only
        line[value[0]] = value[1].strip()

    data.append(line)

print(data)

Prints:

[{'field 1': 'dog', 'field 2': 'first', 'comment': 'outstanding'}, {'field 1': 'cat', 'field 2': '', 'comment': 'some comment about the cat'}]

Source

2013-05-26 19:19:27 alecxe
