Correcting errors in a file and writing it to a new file

OK, so I have a transaction file:

IN CU 
    Customer_ID= 
    Last_Name=Johnston 
    First_Name=Karen 
    Street_Address=291 Stone Cr 
    City=Toronto 
// 
IN VE 
    License_Plate#=LSR976 
    Make=Cadillac 
    Model=Seville 
    Year=1996 
    Owner_ID=779 
// 
IN SE 
    Vehicle_ID=LSR976 
    Service_Code=461 
    Date_Scheduled=00/12/19

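IN means insert, and CU (meaning customer) tells us which file we are writing to, which in this case is customer.diff. The problem I'm having is that I need to go through each line and check the value of each field (Customer_ID, for example). See how Customer_ID is left blank? I need to replace any blank numeric field with the value 0, so in this case it would be Customer_ID=0. Here is what I have so far, but nothing is changing: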

def insertion(): 
    field_names = {'Customer_ID=': 'Customer_ID=0', 
'Home_Phone=':'Home_Phone=0','Business_Phone=': 'Business_Phone=0'} 

    with open('xactions.two.txt', 'r') as from_file: 
     search_lines = from_file.readlines() 


    if search_lines[3:5] == 'CU': 
     for i in search_lines: 
      if field_names[i] == True: 
       with open('customer.diff', 'w') as to_file: 
        to_file.write(field_names[i])

Thanks

2014-03-19 Amon

Why not just `if field_names[i]`? `field_names[i]` will not evaluate to `True`. – benjamin

Sorry, just `'Home_Phone=': 'Home_Phone=0', 'Business_Phone=': 'Business_Phone=0'` is also able to change `Customer_ID`. –

@benjamin I've tried both, but neither worked :( – Amon

Why not try something a bit simpler? I haven't tested this code.

def insertion():
    field_names = {'Customer_ID=': 'Customer_ID=0',
                   'Home_Phone=': 'Home_Phone=0',
                   'Business_Phone=': 'Business_Phone=0'}

    with open('xactions.two.txt', 'r') as from_file:
        with open('customer.diff', 'w') as to_file:
            for line in from_file:
                # drop the newline and any trailing spaces
                line = line.rstrip()
                found = False
                for field in field_names.keys():
                    # only a field with nothing after "=" counts as blank
                    if line.strip() == field:
                        to_file.write(line + "0")
                        found = True
                if not found:
                    to_file.write(line)
                to_file.write("\n")
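For what it's worth, here is a small sketch of why the code in the question appears to change nothing. The three sample lines are copied from the transaction file above; the variable names are only illustrative.

```python
# Three lines as readlines() would return them (note the trailing space and "\n")
lines = ["IN CU \n", "    Customer_ID= \n", "    Last_Name=Johnston \n"]

# Slicing a list returns a list, so comparing it to a string is never true.
slice_matches = (lines[3:5] == 'CU')

# The section code has to be sliced out of the line itself.
section = lines[0][3:5]

# Raw lines carry indentation, a trailing space and a newline, so they are
# not keys of the dictionary; strip them before looking them up.
field_names = {'Customer_ID=': 'Customer_ID=0'}
raw_hit = lines[1] in field_names
stripped_hit = lines[1].strip() in field_names
```

Separately, opening customer.diff in 'w' mode inside the loop would truncate the file on every match, so only the last write would survive.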

2014-03-19 01:07:46 benjamin

I get an error telling me the dict object has no attribute 'iter_keys' – Amon

Indeed, it should be iterkeys, not iter_keys. Thanks @Matthew – benjamin

It's still giving me the same attribute error – Amon

This is a fairly comprehensive approach; it's a bit long, but not as complicated as it looks!

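I'm assuming Python 3.x, though it should work in Python 2.x with few changes. I make extensive use of generators to stream the data rather than holding it all in memory.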
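As a rough illustration of the streaming idea, here is a minimal sketch of two chained generators. The names `read_lines` and `non_blank` are made up for this example, and `io.StringIO` stands in for a real file so the snippet runs on its own.

```python
import io

def read_lines(stream):
    """Yield lines one at a time, without their trailing newline."""
    for line in stream:
        yield line.rstrip("\n")

def non_blank(lines):
    """Pass through only the lines that contain something."""
    return (line for line in lines if line.strip())

# In-memory stand-in for an open transaction file
sample = io.StringIO("IN CU\n\n    City=Toronto\n//\n")

# Nothing is read until the pipeline is consumed
pipeline = non_blank(read_lines(sample))
first = next(pipeline)
rest = list(pipeline)
```

Each stage pulls one line at a time from the stage before it, so the whole file never has to sit in a list.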

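First, we define the expected data type for each field. Some fields don't map onto Python's built-in data types, so I start by defining a few custom data types for those fields: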

import time

class Date:
    def __init__(self, s):
        """
        Parse a date provided as "yy/mm/dd"
        """
        if s.strip():
            self.date = time.strptime(s, "%y/%m/%d")
        else:
            self.date = time.gmtime(0.)

    def __str__(self):
        """
        Return a date as "yy/mm/dd"
        """
        return time.strftime("%y/%m/%d", self.date)

def Int(s):
    """
    Parse a string to integer ("" => 0)
    """
    if s.strip():
        return int(s)
    else:
        return 0

class Year:
    def __init__(self, s):
        """
        Parse a year provided as "yyyy"
        """
        if s.strip():
            self.date = time.strptime(s, "%Y")
        else:
            self.date = time.gmtime(0.)

    def __str__(self):
        """
        Return a year as "yyyy"
        """
        return time.strftime("%Y", self.date)
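A quick check of what these types produce may help. The sketch below calls the `time` module directly with the same format strings, so it runs on its own; the variable names are only illustrative. Note that a blank date or year falls back to the Unix epoch rather than to 0.

```python
import time

# A date from the sample file survives the parse/format round trip
round_trip = time.strftime("%y/%m/%d", time.strptime("00/12/19", "%y/%m/%d"))

# A year from the sample file does too
year_text = time.strftime("%Y", time.strptime("1996", "%Y"))

# Blank input uses time.gmtime(0.), i.e. 1 January 1970 (UTC)
blank_date = time.strftime("%y/%m/%d", time.gmtime(0.))
blank_year = time.strftime("%Y", time.gmtime(0.))
```

If the epoch is not an acceptable placeholder for a blank date, the `else` branches are the place to substitute something else.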

Now we build a table defining what type each field should be:

# Expected data-type of each field:
# data_types[section][field] = type
data_types = {
    "CU": {
        "Customer_ID":    Int,
        "Last_Name":      str,
        "First_Name":     str,
        "Street_Address": str,
        "City":           str
    },
    "VE": {
        "License_Plate#": str,
        "Make":           str,
        "Model":          str,
        "Year":           Year,
        "Owner_ID":       Int
    },
    "SE": {
        "Vehicle_ID":     str,
        "Service_Code":   Int,
        "Date_Scheduled": Date
    }
}

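Next we parse the input file; this is by far the most complicated part! It is a finite state machine implemented as a generator function, yielding one section at a time: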

# Customized error-handling
# (derive from Exception so that "except Exception" catches these)
class TransactionError         (Exception): pass
class EntryNotInSectionError   (TransactionError): pass
class MalformedLineError       (TransactionError): pass
class SectionNotTerminatedError(TransactionError): pass
class UnknownFieldError        (TransactionError): pass
class UnknownSectionError      (TransactionError): pass

def read_transactions(fname):
    """
    Read a transaction file
    Return a series of ("section", {"key": "value"})
    """
    section, accum = None, {}
    with open(fname) as inf:
        for line_no, line in enumerate(inf, 1):
            line = line.strip()

            if not line:
                # blank line - skip it
                pass
            elif line == "//":
                # end of section - return any accumulated data
                if accum:
                    yield (section, accum)
                section, accum = None, {}
            elif line[:3] == "IN ":
                # start of section
                if accum:
                    raise SectionNotTerminatedError(
                        "Line {}: Preceding {} section was not terminated"
                        .format(line_no, section)
                    )
                else:
                    section = line[3:].strip()
                    if section not in data_types:
                        raise UnknownSectionError(
                            "Line {}: Unknown section type {}"
                            .format(line_no, section)
                        )
            else:
                # data entry: "key=value"
                if section is None:
                    raise EntryNotInSectionError(
                        "Line {}: '{}' should be in a section"
                        .format(line_no, line)
                    )
                pair = line.split("=")
                if len(pair) != 2:
                    raise MalformedLineError(
                        "Line {}: '{}' could not be parsed as a key/value pair"
                        .format(line_no, line)
                    )
                key, val = pair
                if key not in data_types[section]:
                    raise UnknownFieldError(
                        "Line {}: unrecognized field name {} in section {}"
                        .format(line_no, key, section)
                    )
                accum[key] = val.strip()

        # end of file - nothing should be left over
        if accum:
            raise SectionNotTerminatedError(
                "End of file: Preceding {} section was not terminated"
                .format(section)
            )
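To make the state machine easier to follow, here is a cut-down sketch of the same idea working on a list of lines instead of a file, with the error handling left out. The function name `iter_sections` is made up for this example, and, unlike the full version, it yields a final section that has no closing `//` (which is how the excerpt in the question ends) instead of raising an error.

```python
def iter_sections(lines):
    """Yield (section, {key: value}) pairs from transaction lines."""
    section, accum = None, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # blank line: stay in current state
        if line == "//":
            if accum:
                yield section, accum      # end of section: emit it
            section, accum = None, {}
        elif line.startswith("IN "):
            section = line[3:].strip()    # start of a new section
        else:
            key, _, val = line.partition("=")
            accum[key] = val.strip()      # data entry inside a section
    if accum:
        yield section, accum              # tolerate a missing final "//"

sample = [
    "IN CU ",
    "    Customer_ID= ",
    "    Last_Name=Johnston ",
    "// ",
    "IN SE ",
    "    Service_Code=461 ",
]
sections = list(iter_sections(sample))
```

Whether an unterminated last section should be accepted or rejected is a design choice; the full version above chooses to reject it.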

Now that the file has been read, the rest is easier. We do type conversion on each field, using the lookup table we defined above:

def format_field(section, key, value):
    """
    Cast a field value to the appropriate data type
    """
    return data_types[section][key](value)

def format_section(section, accum):
    """
    Cast all values in a section to the appropriate data types
    """
    return (section, {key: format_field(section, key, value) for key, value in accum.items()})
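The lookup works because the table stores callables, so converting a value is just a dictionary lookup followed by a call. A minimal stand-alone sketch, assuming a cut-down table (`types`) and a helper (`to_int`) that are stand-ins invented for this example:

```python
def to_int(s):
    """Parse a string to integer ("" => 0)."""
    return int(s) if s.strip() else 0

# Hypothetical two-field subset of the data_types table
types = {"CU": {"Customer_ID": to_int, "Last_Name": str}}

# Raw values as the parser would hand them over
raw = {"Customer_ID": "", "Last_Name": "Johnston"}

# Look up each field's converter and apply it
cleaned = {key: types["CU"][key](value) for key, value in raw.items()}
```

Here the blank Customer_ID comes out as the integer 0, which is the substitution the question asks for.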

And write the results back to a file:

def write_transactions(fname, transactions):
    with open(fname, "w") as outf:
        for section, accum in transactions:
            # start section
            outf.write("IN {}\n".format(section))
            # write key/value pairs in order by key,
            # indented four spaces as in the input file
            keys = sorted(accum.keys())
            for key in keys:
                outf.write("    {}={}\n".format(key, accum[key]))
            # end section
            outf.write("//\n")
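To see what one section looks like on the way out, here is a small sketch that builds the text in memory instead of writing a file. The helper name `render_section` is made up for this example, and the four-space indent is an assumption taken from the input sample.

```python
def render_section(section, accum):
    """Return one section as text, with keys in sorted order."""
    parts = ["IN {}\n".format(section)]
    for key in sorted(accum):
        parts.append("    {}={}\n".format(key, accum[key]))
    parts.append("//\n")
    return "".join(parts)

text = render_section(
    "CU", {"Last_Name": "Johnston", "Customer_ID": 0, "City": "Toronto"}
)
```

Because the keys are sorted, the fields come out alphabetically (City, Customer_ID, Last_Name) rather than in the order they appeared in the input.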

All the machinery is in place; we just need to call it:

def main():
    INPUT = "transaction.txt"
    OUTPUT = "customer.diff"
    transactions = read_transactions(INPUT)
    cleaned_transactions = (format_section(section, accum) for section, accum in transactions)
    write_transactions(OUTPUT, cleaned_transactions)

if __name__ == "__main__":
    main()

Hope that helps!

2014-03-19 05:49:03
