2014-04-10

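Reading/writing/appending to a CSV with Python

I am trying to use Python's csv module to modify a CSV file. The file represents a stock and lists, as columns, the date, open price, high price, low price, close price, and volume for each day. What I want to do is create several new columns by doing arithmetic on the existing data. For example, I want a column with the percent change from the open to the high, and a column with the percent change from yesterday's close to today's close (nothing is finalized yet; right now I am thinking of adding about 10 columns).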

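Is there a compact way to do this? As of now, I open the original file and read the values of interest into a list. I then use that list to write the modified values to a temporary file. Next, I use a for loop to write to some new file, adding the rows from each spreadsheet. Finally, I write the entire contents of that new file back to the original csv, because I want to keep the csv's name (ticker.csv).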

I hope I have made my question clear. If you need any clarification or further details, please don't hesitate to ask.

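Edit: I have included a code snippet of one function below. The function attempts to create a new column containing the percent change from yesterday's close to today's close.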

import csv

def add_col_pchange(ticker):
    """
    Add column with percent change in closing price.
    """
    original = open('file1', newline='')
    reader = csv.reader(original)
    next(reader)  # skip the header row
    close = list()
    for row in reader:
        # build list of close values; rows are in reverse chronological order
        # index 4 corresponds to "Close" column
        close.append(float(row[4]))
    original.close()

    new = open('file2', 'w', newline='')
    writer = csv.writer(new)
    writer.writerow(["Percent Change"])
    pchange = list()
    # range() visits every index that has a following (older) row
    for i in range(len(close) - 1):
        x = (close[i] - close[i+1]) / close[i+1]
        pchange.append(x)
    new.close()

    # open original and new csv's as read, write out to some new file.
    # later, copy that entire file to original csv in order to maintain
    # original csv's name and include new data

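Sorry, this is not clear (at least to me). However, I would guess you only need one pass to read the data into memory and another pass to write it out; I doubt the intermediate files are really necessary. Please provide a clearer explanation, or at least an outline of your code.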
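To make the two-pass idea in this comment concrete, here is a minimal sketch using only the csv module (Python 3). The function name `add_open_to_high` and the new column title are hypothetical, and the column indexes 1 (Open) and 2 (High) are an assumption based on the column order described in the question.

```python
import csv

def add_open_to_high(path):
    """Append an 'Open to High %' column to the CSV at `path`, in place."""
    # Pass 1: read the whole file into memory.
    with open(path, newline='') as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]

    # Assumed column order: Date, Open, High, Low, Close, Volume.
    header.append('Open to High %')
    for row in body:
        open_, high = float(row[1]), float(row[2])
        row.append(100 * (high - open_) / open_)

    # Pass 2: overwrite the same file, so its name is preserved.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows([header] + body)
```

Because everything is held in memory between the two passes, no temporary or intermediate file is involved; that is fine for daily price files but may not suit very large inputs.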


@LarryLustig - Edited. Thanks for the reply! – sgchako


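You might want to consider using the pandas library. You can read the csv file in as a DataFrame, create a new Series, and add that Series to the DataFrame before writing it out. – lightalchemist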
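A rough sketch of what this suggestion could look like is below. It assumes the header row uses the names `Open`, `High` and `Close` and that rows are stored newest-first, as in the question; the function name `add_pct_columns` and the new column titles are made up for illustration.

```python
import pandas as pd

def add_pct_columns(path):
    """Add percent-change columns to the CSV at `path` and save it in place."""
    df = pd.read_csv(path)

    # Percent move from the open to the high on the same day.
    df['Open to High %'] = 100 * (df['High'] - df['Open']) / df['Open']

    # Rows are assumed newest-first, so the previous day's close is the
    # next row down; shift(-1) lines it up with the current row.
    prev_close = df['Close'].shift(-1)
    df['Percent Change'] = 100 * (df['Close'] - prev_close) / prev_close

    # index=False keeps pandas from writing its row index as an extra column.
    df.to_csv(path, index=False)
    return df
```

Each further column is one more assignment of this kind. The oldest row has no previous close, so its `Percent Change` is NaN and is written out as an empty field.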

Answers


Hope this helps:

import csv

def add_col_pchange(ticker):
    """
    Add column with percent change in closing price.
    """
    # always use with to transparently manage opening/closing files
    with open('ticker.csv', newline='') as original:
        spam = csv.reader(original)
        headers = next(spam)  # get header row
        # get all of the data at one time, then transpose it using zip
        data = list(zip(*spam))
    # rows are in reverse chronological order (newest first)
    # index 4 corresponds to "Close" column
    close = data[4]  # the 5th column has close values

    # use map to process whole column at one time
    f_pchange = lambda close0, close1: 100 * (float(close0) - float(close1)) / float(close1)
    pchange = list(map(f_pchange, close[:-1], close[1:]))  # list of percent changes
    pchange.append(None)  # the oldest (last) row has no previous close
    headers.append("Percent Change")  # add column name to headers
    data.append(pchange)
    data = zip(*data)  # transpose back to rows

    # no intermediate file is needed: the data is held in memory and
    # ticker.csv is overwritten, so the original name is kept
    with open('ticker.csv', 'w', newline='') as new:
        spam = csv.writer(new)
        spam.writerow(headers)  # write headers
        spam.writerows(data)  # write all data rows

You should also check out numpy; you can use loadtxt() and vector math, but @lightalchemist is right, pandas is designed for exactly this.
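For comparison, a minimal sketch of the loadtxt() and vector math route might look like the following. It only computes the percent changes and does not write them back; the function name `pct_change_from_csv` is hypothetical, and it assumes a single header row, the "Close" column at index 4, and newest-first rows.

```python
import numpy as np

def pct_change_from_csv(path):
    """Return the percent change in close for every row except the oldest."""
    # skiprows=1 skips the header; usecols=(4,) reads only the "Close" column,
    # so the non-numeric date column is never parsed.
    close = np.loadtxt(path, delimiter=',', skiprows=1, usecols=(4,), ndmin=1)
    # Rows are assumed newest-first: close[:-1] is "today", close[1:] is "yesterday".
    return 100 * (close[:-1] - close[1:]) / close[1:]
```

The result has one element fewer than the number of data rows, since the oldest row has nothing to compare against.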