2017-07-27

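Reading a CSV file into multiple NumPy arrays in Python

I want to import a .csv file containing various stock prices into a Python script, inside a getData() function, but I'm running into an indexing problem and can't see how to solve it.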

I'm new to both CSV and NumPy, so I'm not sure exactly where the problem lies, but when I try to run this code I get the following:

File "../StockPlot.py", line 20, in getData
    date[i-1] = data[0]
IndexError: index 0 is out of bounds for axis 0 with size 0

import numpy as np 
import matplotlib.pyplot as plt 
import csv 

def getData(): 
    date = np.array([]) 
    openPrice = np.array([]) 
    closePrice = np.array([]) 
    volume = np.array([]) 

    i = 1 
    with open('aapl.csv', 'rb') as f:
        reader = csv.reader(open('aapl.csv'))
        data_as_list = list(reader)
        items = len(data_as_list)

        while i < items:
            data = data_as_list[i]
            date[i-1] = data[0]
            openPrice[i-1] = data[1]
            closePrice[i-1] = data[4]
            volume[i-1] = data[5]
            i += 1

    return date, openPrice, closePrice, volume 

getData() 

The file I'm trying to read, AAPL.csv, has lines of the form:

Date, Open, High, Low, Close, Volume

26-Jul-17,153.35,153.93,153.06,153.46,15415545

25-Jul-17,151.80,153.84,151.80,152.74,18853932

24-Jul-17,150.58,152.44,149.90,152.09,21493160

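I'd appreciate any help solving this. It seems that data_as_list is a list of lists, one per row, and after playing around with print, the loop does print data[0] and so on, but it won't let me assign the values to the arrays I created.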

Answer


IMO it's more convenient to use Pandas for this:

import pandas as pd 

fn = r'/path/to/AAPL.csv'  
df = pd.read_csv(fn, skipinitialspace=True, parse_dates=['Date']) 

Result:

In [83]: df
Out[83]:
        Date    Open    High     Low   Close    Volume
0 2017-07-26  153.35  153.93  153.06  153.46  15415545
1 2017-07-25  151.80  153.84  151.80  152.74  18853932
2 2017-07-24  150.58  152.44  149.90  152.09  21493160
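
Since the question asks for one array per column, here is a minimal sketch of pulling individual columns out of the DataFrame as typed NumPy arrays. It assumes the sample rows from the question and uses `io.StringIO` as a stand-in for the real file; the variable names `csv_text`, `open_price`, and `close_price` are only illustrative.

```python
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for AAPL.csv.
csv_text = """Date, Open, High, Low, Close, Volume
26-Jul-17,153.35,153.93,153.06,153.46,15415545
25-Jul-17,151.80,153.84,151.80,152.74,18853932
24-Jul-17,150.58,152.44,149.90,152.09,21493160
"""

# skipinitialspace=True strips the space after each comma in the header row.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True, parse_dates=['Date'])

# Each column becomes its own 1-D NumPy array with a column-specific dtype.
date = df['Date'].to_numpy()
open_price = df['Open'].to_numpy()
close_price = df['Close'].to_numpy()
volume = df['Volume'].to_numpy()
```

Extracting columns one at a time keeps the prices as floats and the volume as integers, rather than mixing every value into a single array.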

As a 2D NumPy array:

In [84]: df.values
Out[84]:
array([[Timestamp('2017-07-26 00:00:00'), 153.35, 153.93, 153.06, 153.46, 15415545],
       [Timestamp('2017-07-25 00:00:00'), 151.8, 153.84, 151.8, 152.74, 18853932],
       [Timestamp('2017-07-24 00:00:00'), 150.58, 152.44, 149.9, 152.09, 21493160]], dtype=object)
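
As for the original IndexError: `np.array([])` creates an array with zero elements, and NumPy arrays do not grow when you assign to an index, so `date[i-1] = data[0]` fails on the very first pass. If you'd rather stay with the csv module, one possible approach (a sketch, not the only fix) is to collect the values in plain lists and convert them at the end. The function name `get_data` and its file-object parameter are my own choices here, and the sample text stands in for the real file.

```python
import csv
import io

import numpy as np


def get_data(f):
    """Read the date, open, close, and volume columns from an open CSV file."""
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row

    dates, open_prices, close_prices, volumes = [], [], [], []
    for row in reader:
        if not row:  # csv.reader yields [] for blank lines; skip them
            continue
        dates.append(row[0])
        open_prices.append(float(row[1]))   # values arrive as strings
        close_prices.append(float(row[4]))
        volumes.append(int(row[5]))

    # Lists grow freely; convert to arrays once all rows are read.
    return (np.array(dates), np.array(open_prices),
            np.array(close_prices), np.array(volumes))


# Sample rows copied from the question, standing in for AAPL.csv.
sample = io.StringIO(
    "Date, Open, High, Low, Close, Volume\n"
    "26-Jul-17,153.35,153.93,153.06,153.46,15415545\n"
    "25-Jul-17,151.80,153.84,151.80,152.74,18853932\n"
    "24-Jul-17,150.58,152.44,149.90,152.09,21493160\n"
)
date, open_price, close_price, volume = get_data(sample)
```

With a real file you would call it as `with open('aapl.csv', newline='') as f: get_data(f)`, opening in text mode rather than `'rb'`.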