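Output to CSV: changing existing code to add a "flag" column instead of

The code (reproduced below) reads in a file, performs an operation, and outputs a subset of the original file to a new file. How can I tweak it slightly so that it instead outputs everything from the initial file to the output file, but adds a "flag" column with a value of "1" where the row is one that would currently be output (the subset of rows we are most interested in)? The other rows (the ones currently only in the input file) would have a blank or "0" in the new "flag" column.

This problem comes up often enough for me that it would save me a lot of time just to have a general way of doing this.

Any help is much appreciated!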

import csv

inname = "aliases.csv"
outname = "output.csv"

def first_word(value):
    return value.split(" ", 1)[0]

with open(inname, "r", encoding="utf-8") as infile:
    with open(outname, "w", encoding="utf-8") as outfile:
        in_csv = csv.reader(infile)
        out_csv = csv.writer(outfile)

        column_names = next(in_csv)
        out_csv.writerow(column_names)

        id_index = column_names.index("id")
        name_index = column_names.index("name")

        try:
            row_1 = next(in_csv)
            written_row = False

            for row_2 in in_csv:
                if first_word(row_1[name_index]) == first_word(row_2[name_index]) and row_1[id_index] != row_2[id_index]:
                    if not written_row:
                        out_csv.writerow(row_1)

                    out_csv.writerow(row_2)
                    written_row = True
                else:
                    written_row = False

                row_1 = row_2
        except StopIteration:
            # No data rows!
            pass


2012-08-10 user1590499

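I always use DictReader/DictWriter when working with CSVs, mainly because it is a bit more explicit (which makes it easier for me :) ). Below is a highly stylized version of what you could do. The changes I made include:

Using csv.DictReader() and csv.DictWriter() instead of csv.reader and csv.writer. The difference is that rows are represented as dictionaries rather than lists, so a row looks like {'column_name': 'value', 'column_name_2': 'value2'}. Every row therefore carries its column header data and can be treated like any other dictionary.
Using sample column names to show how the reading and writing work. I made a sample CSV with two columns, name and number; when writing, each row is compared with the row after it (same first word in name, different number) to decide whether to set flag.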
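To make the dictionary representation concrete, here is a minimal sketch. The sample text and the read_rows helper are made up for illustration and are not part of the answer; unlike the answer's code, this sketch lets DictReader take the column names from the header line instead of passing fieldnames explicitly.

```python
import csv
import io

# Hypothetical two-column sample standing in for aliases.csv
sample = "name,number\nJohn Smith,1\nJohn Doe,2\n"

def read_rows(text):
    """Return the data rows of some CSV text as plain dictionaries."""
    # With no fieldnames argument, DictReader uses the first line as the keys
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]

rows = read_rows(sample)

# Adding a key to the row dictionary is how a new column gets its value
rows[0]["flag"] = "Y"
```

Rows that never receive a 'flag' key are still writable with csv.DictWriter, which fills missing keys with its restval default (an empty string).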

With that in mind, here is the example:

import csv

input_csv = 'aliases.csv'
output_csv = 'output.csv'

def first_word(value):
    return value.split(' ', 1)[0]

# newline='' is what the csv module recommends for files it reads or writes (Python 3)
with open(input_csv, 'r', newline='') as infile:
    # Specify the fieldnames in your aliases CSV
    input_fields = ('name', 'number')

    # Set up the DictReader, which will read the file into an iterable
    # where each row is a {column_name: value} dictionary
    reader = csv.DictReader(infile, fieldnames=input_fields)

    # Now open the output file
    with open(output_csv, 'w', newline='') as outfile:
        # Define the new 'flag' field
        output_fields = ('name', 'number', 'flag')
        writer = csv.DictWriter(outfile, fieldnames=output_fields)

        # Write the column names (this is a handy convention seen elsewhere on SO)
        writer.writerow(dict((h, h) for h in output_fields))

        # Skip the first row (which is the column headers) and then store the
        # first row dictionary; the None default copes with an empty file
        next(reader, None)
        first_row = next(reader, None)

        # Now begin your iteration through the input, writing all fields as they
        # appear, but using some logic to write the 'flag' field
        # This is where the dictionary comes into play - 'row' is actually a
        # dictionary, so you can use dictionary syntax to assign to it
        for next_row in reader:
            # Set up the variables for your comparison
            first_name = first_word(first_row['name'])
            next_name = first_word(next_row['name'])
            first_id = first_row['number']
            next_id = next_row['number']

            # Compare the current row to the previous row
            if first_name == next_name and first_id != next_id:
                # Here we are adding an element to our row dictionaries - 'flag'.
                # Both rows of the pair are flagged, matching the rows the
                # original code would have written out
                first_row['flag'] = 'Y'
                next_row['flag'] = 'Y'
            # Now we write the entire first_row dictionary to the row
            # (a missing 'flag' key is written as an empty string)
            writer.writerow(first_row)

            # Change the reference, just like you did
            first_row = next_row

        # The loop only writes the previous row, so write the last one here
        if first_row is not None:
            writer.writerow(first_row)


2012-08-11 00:07:08 RocketDonkey

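Thanks for the post. However, this doesn't work for me. First, I get a syntax error on the line row['flag'] = 'Y'. As I suspected, that is not a valid operation. I understand the intent is to add a 'Y' for the "flag" column, but it looks like it is using a list as if it were a dictionary or something similar. I'm not sure, but the syntax doesn't work; it only runs if I change the assignment operator to an equality operator, and that makes no sense. Also, the writer.writerow(row) statement doesn't work. – user1590499 2012-08-11 00:48:47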

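Another issue here is that I'm not sure this does what I'm asking for. The input file has many other fields that I want in the output file, but my logical comparison is based on the values in specific columns. From what I see here, it looks like we would only get 3 columns: "name", "number", and "flag". Is that right? – user1590499 2012-08-11 00:52:58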
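One possible way to carry every input column through, sketched under assumptions rather than taken from the answer: let DictReader read the column names from the header line and append the flag column to whatever it finds. The flag_csv function and its parameter names are hypothetical; the sketch assumes the input has "name" and "id" columns (as in the question's code) and that every row has the same number of fields as the header.

```python
import csv
import io

def first_word(value):
    return value.split(" ", 1)[0]

def flag_csv(infile, outfile, name_col="name", id_col="id", flag_col="flag"):
    """Copy every column of infile to outfile, adding a '1'/'0' flag column."""
    reader = csv.DictReader(infile)
    # reader.fieldnames comes from the header line (None for an empty file)
    fieldnames = list(reader.fieldnames or []) + [flag_col]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()

    previous = None
    for row in reader:
        row[flag_col] = "0"
        if (previous is not None
                and first_word(previous[name_col]) == first_word(row[name_col])
                and previous[id_col] != row[id_col]):
            # Flag both rows of a matching consecutive pair
            previous[flag_col] = "1"
            row[flag_col] = "1"
        # A row is written only once the row after it has been compared
        if previous is not None:
            writer.writerow(previous)
        previous = row

    # The final row has no successor, so write it after the loop
    if previous is not None:
        writer.writerow(previous)

# Usage with in-memory text; real files would be opened with newline=""
source = "id,name,city\n1,John Smith,Oslo\n2,John Doe,Rome\n3,Mary Major,Lima\n"
result = io.StringIO()
flag_csv(io.StringIO(source), result)
```

Because the output field list is built from the input header, no column names other than the two used in the comparison need to be spelled out.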

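As an addendum, I think the logic for making this adjustment is: add a new column to the row (which is a Python list) before writing it out. Since we can't modify the input file in place, we can create a new file and then replace the old file with the new one. It seems like it should be simple, but I don't know how to go about it, since I didn't create any lists in the first place... – user1590499 2012-08-11 00:55:45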
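The logic described in this comment could be sketched roughly as follows, staying with csv.reader so that each row remains a list. The function names flag_rows and add_flag_column are hypothetical, and the sketch assumes the file fits in memory and has "name" and "id" columns as in the question's code. The new file is written next to the original and then swapped in with os.replace.

```python
import csv
import os
import tempfile

def first_word(value):
    return value.split(" ", 1)[0]

def flag_rows(rows, name_index, id_index):
    """Return copies of rows with a trailing '1' or '0' flag value appended."""
    flags = [False] * len(rows)
    for i in range(1, len(rows)):
        prev, cur = rows[i - 1], rows[i]
        if (first_word(prev[name_index]) == first_word(cur[name_index])
                and prev[id_index] != cur[id_index]):
            # Same rows the original code would have written to the subset
            flags[i - 1] = flags[i] = True
    return [row + ["1" if flag else "0"] for row, flag in zip(rows, flags)]

def add_flag_column(path, flag_name="flag"):
    """Rewrite the CSV at path with an extra flag column."""
    with open(path, "r", encoding="utf-8", newline="") as infile:
        in_csv = csv.reader(infile)
        column_names = next(in_csv, None)
        if column_names is None:
            return  # empty file: nothing to flag
        rows = list(in_csv)

    flagged = flag_rows(rows, column_names.index("name"), column_names.index("id"))

    # Write to a temporary file in the same directory, then swap it in
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".csv")
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as outfile:
        out_csv = csv.writer(outfile)
        out_csv.writerow(column_names + [flag_name])
        out_csv.writerows(flagged)
    os.replace(tmp_path, path)
```

Appending to the list (row + [...]) is the list-based counterpart of assigning row['flag'] on a dictionary row; indexing a list with a string key is what fails.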
