2016-09-20 166 views
4

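Python error when selecting a column from a pandas DataFrame

I am practicing importing Google Finance stock data into a pandas DataFrame, and I get an error when I select a column: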

import pandas as pd 
from pandas import Series 

path = 'http://www.google.com/finance/historical?cid=542029859096076&startdate=Sep+22%2C+2001&enddate=Sep+20%2C+2016&num=30&ei=3HvhV4n3D8XGmAGp4q74Ag&output=csv' 
df = pd.read_csv(path) 

So far so good: df displays the complete dataset I need.

However, when I select a specific column, such as

df['Date'] 

Python shows the following error:

Traceback (most recent call last): 

    File "<ipython-input-31-cb486dd31fbc>", line 1, in <module> 
    df['Date'] 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/frame.py", line 1997, in __getitem__ 
    return self._getitem_column(key) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/frame.py", line 2004, in _getitem_column 
    return self._get_item_cache(key) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/generic.py", line 1350, in _get_item_cache 
    values = self._data.get(item) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/core/internals.py", line 3290, in get 
    loc = self.items.get_loc(item) 

    File "/Users/Username/anaconda/lib/python3.5/site-packages/pandas/indexes/base.py", line 1947, in get_loc 
    return self._engine.get_loc(self._maybe_cast_indexer(key)) 

    File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:4154) 

    File "pandas/index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas/index.c:4018) 

    File "pandas/hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368) 

    File "pandas/hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322) 

KeyError: 'Date' 

On the other hand, other columns, such as df['High'], work fine. Is there any way I can solve this?

+1

When I try it, it works fine and parses correctly. – ayhan

+0

(Based on MaxU's answer, it may be working for me because I am using Python 3.5.) – ayhan

+0

@ayhan, did `df['Date']` work for you? It shouldn't work under Python 3.5 either... – MaxU

Answers

5

This CSV file contains a BOM (Byte Order Mark) signature, so try it this way:

df = pd.read_csv(path, encoding='utf-8-sig') 
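To see why the encoding matters, here is a minimal sketch that uses only the standard library. The byte string `raw` and the helper `header` are made up for illustration and are not the actual Google Finance response. A UTF-8 BOM is the three bytes EF BB BF at the start of the file. The plain `utf-8` codec decodes them to the character `\ufeff`, while `utf-8-sig` drops them.

```python
import csv
import io

# Hypothetical sample: the start of a CSV export that begins with a UTF-8 BOM.
raw = b"\xef\xbb\xbfDate,Open,High\n20-Sep-16,10.0,11.0\n"


def header(data, encoding):
    """Decode the bytes with the given encoding and return the CSV header row."""
    text = io.StringIO(data.decode(encoding))
    return next(csv.reader(text))


plain = header(raw, "utf-8")      # BOM survives as '\ufeff' glued to the first name
fixed = header(raw, "utf-8-sig")  # the codec consumes the BOM

print(plain)
print(fixed)
```

With plain `utf-8` the first header is `'\ufeffDate'` rather than `'Date'`, which is why looking up `'Date'` raises a KeyError while the other columns work.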

Here is how you can easily spot this problem (thanks to @jezrael's hint):

In [11]: print(df.columns.tolist()) 
['\ufeffDate', 'Open', 'High', 'Low', 'Close', 'Volume'] 

Note the `\ufeff` at the start of the first column name.
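If the data is already loaded and re-reading it is inconvenient, one possible workaround is to clean the column names in place. This is a sketch only: `strip_bom` is a hypothetical helper, not a pandas function, and the column list is copied from the output above.

```python
BOM = "\ufeff"


def strip_bom(names):
    """Return the column names with any leading BOM character removed."""
    return [name.lstrip(BOM) if isinstance(name, str) else name for name in names]


columns = ['\ufeffDate', 'Open', 'High', 'Low', 'Close', 'Volume']
cleaned = strip_bom(columns)
print(cleaned)

# With a DataFrame already in memory, the same helper could be applied as:
# df.columns = strip_bom(df.columns)
```

Re-reading with `encoding='utf-8-sig'` is still the cleaner fix, since it handles the BOM at the source.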

Note: as @ayhan has already noticed, starting from version 0.19.0 pandas will take care of it automatically:

Bug in pd.read_csv() which caused BOM files to be read in without the BOM being ignored (GH4793)

+0

Hey, thanks! That works well. Could you explain in more detail why it makes a difference, or point to some sources on the BOM signature? Thanks again. –

+3

It looks better if you use `print(df.columns.tolist())`, +1 – jezrael