2016-05-24 105 views

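Python: reordering the columns of a csv file

I am collecting data and saving it to csv files, but for presentation purposes I would like to reorder the columns in each csv file according to an associated "order".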

I used this question (write CSV columns out in a different order in Python) as a guide, but I don't know why I get the error

writeindices = [name2index[name] for name in writenames] 
KeyError: % Processor Time 

when I run it. Note that this error does not seem to be limited to the string '% Processor Time'.

Where am I going wrong?

Here is my code:

import csv
import operator

CPU_order=["%"+" Processor Time", "%"+" User Time", "Other"]
Memory_order=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"]

def reorder_csv(path,title,input_file):
    if title == 'CPU':
        order=CPU_order
    elif title == 'Memory':
        order=Memory_order

    output_file=path+'/'+title+'_reorder'+'.csv'

    writenames = order

    reader = csv.reader(input_file)
    writer = csv.writer(open(output_file, 'wb'))

    readnames = reader.next()
    name2index = dict((name, index) for index, name in enumerate(readnames))
    writeindices = [name2index[name] for name in writenames]
    reorderfunc = operator.itemgetter(*writeindices)
    writer.writerow(writenames)

    for row in reader:
        writer.writerow(reorderfunc(row))

Here is a sample of what the input CSV file looks like:

,CPU\% User Time,CPU\% Processor Time,CPU\Other 
05/23/2016 06:01:51.552,0,0,0 
05/23/2016 06:02:01.567,0.038940741537158409,0.62259056657940626,0.077882481554869071 
05/23/2016 06:02:11.566,0.03900149141703179,0.77956981074955856,0 
05/23/2016 06:02:21.566,0,0,0 
05/23/2016 06:02:31.566,0,1.1695867249963632,0 

Please post the contents of your 'input_file'! **Update:** in particular the header row. – schwobaseggl

Answer


Your code works. The problem is that your data has no column named '% Processor Time'. Here is the simple data file I used:

Other,% User Time,% Processor Time 
o1,u1,p1 
o2,u2,p2 

Here is how I called the code:

reorder_csv('.', 'CPU', open('data.csv')) 

With this setup, everything works fine. Please check your data.
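One way to check the data is to compare the header row against the names you intend to write before building the index list. The following is a minimal sketch written for Python 3 (the code in this thread is Python 2). `missing_columns` is a hypothetical helper name, not part of the csv module, and the sample header is copied from the question.

```python
import csv
import io

def missing_columns(csv_file, wanted):
    """Return the names in `wanted` that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader)  # the first row holds the column names
    present = set(header)
    return [name for name in wanted if name not in present]

# Header row copied from the question's sample input
sample = io.StringIO(
    ",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n"
    "05/23/2016 06:01:51.552,0,0,0\n"
)
wanted = ["% Processor Time", "% User Time", "Other"]
print(missing_columns(sample, wanted))
```

Any name this reports would raise a KeyError in the name2index lookup, because the lookup needs an exact match on the full header string.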

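Update

Now that I have seen your data, it looks like you have column names such as 'CPU\% Processor Time' and want to translate them to '% Processor Time' before writing them out. All you need to do is create name2index like this: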

name2index = dict((name.replace('CPU\\', ''), index) for index, name in enumerate(readnames)) 

The difference is that instead of name you use name.replace('CPU\\', ''), which gets rid of the 'CPU\' part.
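As a variation on the same idea, the prefix can be stripped without hard-coding 'CPU\' by keeping only the text after the last backslash. This is a sketch for Python 3; `strip_prefix` is a hypothetical helper, and the header and data row are taken from the sample in the question.

```python
import operator

def strip_prefix(name):
    """Drop everything up to and including the last backslash, if any."""
    return name.rsplit("\\", 1)[-1]

# Header row and one data row from the question's sample file
readnames = ["", "CPU\\% User Time", "CPU\\% Processor Time", "CPU\\Other"]
writenames = ["% Processor Time", "% User Time", "Other"]

name2index = {strip_prefix(name): index for index, name in enumerate(readnames)}
writeindices = [name2index[name] for name in writenames]
reorderfunc = operator.itemgetter(*writeindices)

row = ["05/23/2016 06:02:31.566", "0", "1.1695867249963632", "0"]
print(writeindices)      # [2, 1, 3]
print(reorderfunc(row))  # ('1.1695867249963632', '0', '0')
```

Note that, as in the original code, the unnamed timestamp column is not in writenames and is therefore dropped from the output.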

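Update 2

I reworked your code to use csv.DictReader and csv.DictWriter. I also assumed that 'CPU\% Privileged Time' should be converted to 'Other'. If that is not the case, you can fix it in the transformer dictionary.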

import csv
import os

def rename_columns(row):
    """ Take a row (dictionary) of data and return a new row with columns renamed """
    transformer = {
        'CPU\\% User Time': '% User Time',
        'CPU\\% Processor Time': '% Processor Time',
        'CPU\\% Privileged Time': 'Other',
        }
    new_row = {transformer.get(k, k): v for k, v in row.items()}
    return new_row

def reorder_csv(path, title, input_file):
    header = dict(
        CPU=["% Processor Time", "% User Time", "Other"],
        Memory=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"],
        )

    reader = csv.DictReader(input_file)
    output_filename = os.path.join(path, '{}_reorder2.csv'.format(title))

    with open(output_filename, 'wb') as outfile:
        # Create a new writer where each row is a dictionary.
        # If the row contains extra keys, ignore them
        writer = csv.DictWriter(outfile, header[title], extrasaction='ignore')
        writer.writeheader()
        for row in reader:
            # Each row is a dictionary, not list
            print row
            row = rename_columns(row)
            print row
            print
            writer.writerow(row)
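For readers on Python 3, the same DictReader/DictWriter approach can be sketched as follows. This is an illustration under stated assumptions, not the answerer's code: the `reorder` function and the `RENAMES` and `ORDER` names are hypothetical, the mapping of 'CPU\Other' to 'Other' is assumed from the sample header in the question, and in-memory streams stand in for real files.

```python
import csv
import io

# Assumed mapping, based on the sample header shown in the question
RENAMES = {
    "CPU\\% User Time": "% User Time",
    "CPU\\% Processor Time": "% Processor Time",
    "CPU\\Other": "Other",
}
ORDER = ["% Processor Time", "% User Time", "Other"]

def reorder(infile, outfile, renames=RENAMES, order=ORDER):
    """Copy rows from infile to outfile, renaming and reordering columns."""
    reader = csv.DictReader(infile)
    # extrasaction="ignore" drops keys not listed in `order`,
    # such as the unnamed timestamp column
    writer = csv.DictWriter(outfile, fieldnames=order, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow({renames.get(k, k): v for k, v in row.items()})

source = io.StringIO(
    ",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n"
    "05/23/2016 06:02:31.566,0,1.1695867249963632,0\n"
)
result = io.StringIO()
reorder(source, result)
print(result.getvalue())
```

When writing to a real file in Python 3, open it in text mode with `open(output_filename, 'w', newline='')` rather than `'wb'`, since the csv module there expects text streams.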

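Thanks. My data has text before the backslash (I have updated the question above), but I would have thought that since I am looking for the given string with 'in', it should still work? – Catherine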


Using the new name2index with the replace, I still get 'KeyError: '% Processor Time'' – Catherine


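I noticed that your csv is missing a header for the timestamp (first column). Is that the problem? It would help if you posted the csv sample in its original form. –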