Parsing a text file with a delimiter using Python

I need to parse a text file that looks like this:

"id"$"date"$"text" 

    10001$2016-01-11$"[start] 
    this is some text 
    [stop] 
    " 
    10002$2014-03-12$"[start] 
    this is some more text 
    [stop] 
    "

into a Python list, with these three elements (id, date and text) as keys.

I don't know how to split these elements on the delimiter, or how to use the first line as the keys for every element in the list.

Could something like this work, even if it only prints:

infile = open('filename.txt', 'r')
for line in infile:
    if "????" in line:
        print(line, next(infile))

If I try:

infile = open('filename.txt', 'r')
for line in infile:
    if '"text"' in line:
        print(next(infile))

it only prints the first line.

Ideally, the result would look like this:

[{'id':'10001', 'date':'2016-01-11', 'text':'this is some text'},{'id':'10002', 'date':'2014-03-12', 'text':'this is some more text'}]

Source

2016-04-14 nquestion

What kind of trouble are you having? Show us the code you are using. –

I think it would be best if you posted the file itself. –

A Python 'list' does not have named "keys". Expressed as Python types, what is the expected format of the resulting data structure? – Kupiakos
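For reference, the shape shown in the question is a list of dictionaries. Here is a minimal sketch of how that structure is indexed, using the two records from the question (the variable name `records` is only illustrative):

```python
# Records in the shape the question asks for (values copied from the question).
records = [
    {'id': '10001', 'date': '2016-01-11', 'text': 'this is some text'},
    {'id': '10002', 'date': '2014-03-12', 'text': 'this is some more text'},
]

# The list is indexed by position; each dict inside it is indexed by key.
first = records[0]
print(first['date'])                # 2016-01-11
print([r['id'] for r in records])   # ['10001', '10002']
```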

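import csv

path = 'filename.txt'  # the file from the question
# Python 3: open in text mode with newline='' (Python 2 used 'rb')
with open(path, newline='') as f:
    reader = csv.reader(f, delimiter='$')
    res = [{'id': line[0], 'date': line[1], 'text': line[2]} for line in reader]
    # Drop the first element, which is the header row
    res = res[1:]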

Source

2016-04-14 20:44:08 galaxyan

'res[0]' comes back as '{'date': 'date', 'text': 'text', 'id': 'id'}', but everything after that – nquestion

@nquestion you can remove the first element – galaxyan

@nquestion edited – galaxyan
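As an alternative to hardcoding the keys and slicing off the header afterwards, the header row can be read first and paired with each data row. This is only a sketch: `rows_as_dicts` and `SAMPLE` are hypothetical names, and `io.StringIO` stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the file.
SAMPLE = (
    '"id"$"date"$"text"\n'
    '10001$2016-01-11$"[start]\nthis is some text\n[stop]\n"\n'
    '10002$2014-03-12$"[start]\nthis is some more text\n[stop]\n"\n'
)


def rows_as_dicts(fileobj, delimiter='$'):
    """Pair every data row with the header row (hypothetical helper)."""
    reader = csv.reader(fileobj, delimiter=delimiter)
    header = next(reader)  # consumes the header, so it never lands in the result
    return [dict(zip(header, row)) for row in reader]


res = rows_as_dicts(io.StringIO(SAMPLE, newline=''))
print(res[0]['id'], res[0]['date'])   # 10001 2016-01-11
print(repr(res[0]['text']))           # the markers and line breaks are still in the text
```

Note that the `text` values still contain the `[start]`/`[stop]` markers at this point; removing them is a separate step.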

You can use Python's built-in csv library to parse the file.

import csv


class Parser(object):
    START_TEXT = "[start]"
    END_TEXT = "[stop]"

    def __init__(self, filename):
        self.filename = filename

    def parse_file(self):
        elements = []

        # newline='' lets the csv module handle line breaks inside quoted fields
        with open(self.filename, 'r', newline='') as f:
            reader = csv.reader(f, delimiter='$')
            # The first row holds the keys: id, date, text
            first_row = next(reader)

            key0 = first_row[0]
            key1 = first_row[1]
            key2 = first_row[2]

            for row in reader:
                elements.append({
                    key0: row[0],
                    key1: row[1],
                    key2: self.parse_text(row[2]),
                })

        return elements

    @classmethod
    def parse_text(cls, text):
        # Keep only what lies between the [start] and [stop] markers
        start_idx = text.index(cls.START_TEXT) + len(cls.START_TEXT)
        end_idx = text.index(cls.END_TEXT)

        return text[start_idx:end_idx].strip()


p = Parser("infile.txt")
elements = p.parse_file()

print(elements)

Output:

[{'id': '10001', 'date': '2016-01-11', 'text': 'this is some text'}, {'id': '10002', 'date': '2014-03-12', 'text': 'this is some more text'}]

Source

2016-04-14 20:54:09 Eli

import csv

with open('f.txt') as fp:
    reader = csv.DictReader(fp, delimiter="$")
    data = list(reader)

# Strip the [start]/[stop] markers and line breaks from every field
for row in data:
    row.update({
        k: v.replace('[start]', '').replace('[stop]', '').replace('\n', '')
        for k, v in row.items()})

print(data)
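A variation on the same idea, in case the markers should only be removed from the text column and any line breaks inside the text should be kept. This is a sketch, not the answer's own method: it swaps the chained `replace` calls for a regular expression, and `MARKED`, `clean` and `SAMPLE` are hypothetical names, with `io.StringIO` standing in for the file.

```python
import csv
import io
import re

# Sample data copied from the question; io.StringIO stands in for the file.
SAMPLE = (
    '"id"$"date"$"text"\n'
    '10001$2016-01-11$"[start]\nthis is some text\n[stop]\n"\n'
    '10002$2014-03-12$"[start]\nthis is some more text\n[stop]\n"\n'
)

# Capture whatever sits between the two markers, across line breaks.
MARKED = re.compile(r'\[start\](.*?)\[stop\]', re.DOTALL)


def clean(text):
    """Return the text between the markers, or the stripped input if none are found."""
    match = MARKED.search(text)
    return match.group(1).strip() if match else text.strip()


data = []
for row in csv.DictReader(io.StringIO(SAMPLE, newline=''), delimiter='$'):
    row = dict(row)                    # plain dict on every Python 3 version
    row['text'] = clean(row['text'])   # only the text column is touched
    data.append(row)

print(data)
```

With the sample above this gives the list of dictionaries the question asks for, with `id` and `date` left untouched.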

Source

2016-04-14 21:59:09
