2015-12-02

pandas cannot group by every column in a CSV file

Hour, L, Dr, Tag, Code, Vge 
0, L5, XI, PS, 4R,  15 
5, L5, XI, PS, 4R,  12 
2, L0, St, v2T, 4R,  11 
8, L2, TI, sst, 4R,  8 
12, L5, XI, PS, 4R,  18 

I am trying to wrap my head around the pandas groupby concept using the CSV file above (in.csv). I am using the following Python code.

#!/usr/bin/env python3.4
# -*- coding: utf-8 -*- 

import pandas as pd 
import pprint 

df = pd.read_csv('in.csv') 
gb = df.groupby('Hour') 
pprint.pprint(list(gb)) 

This is the output I get.

[(0, 
     Hour  L Dr Tag Code Vge 
0  0  L5 XI PS 4R 15), 
(2, 
    Hour  L Dr Tag Code  Vge 
2  2   L0 St v2T 4R  11), 
(5, 
    Hour  L Dr Tag Code  Vge 
1  5   L5 XI PS 4R  12), 
(8, 
    Hour  L Dr Tag Code  Vge 
3  8   L2 TI sst 4R  8), 
(12, 
    Hour  L Dr Tag Code  Vge 
4 12   L5 XI PS 4R  18)] 

The output above makes sense. However, if I use gb = df.groupby('Vge') instead of gb = df.groupby('Hour') in the code above, I get the following error:

Traceback (most recent call last): 
    File "C:/Test/python/concepts/pandas/test_pandas.py", line 12, in <module> 
    gb = df.groupby('Vge') 
    File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 3324, in groupby 
    sort=sort, group_keys=group_keys, squeeze=squeeze) 
    File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 1252, in groupby 
    return klass(obj, by, **kwds) 
    File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 416, in __init__ 
    level=level, sort=sort) 
    File "C:\Python34\lib\site-packages\pandas\core\groupby.py", line 2166, in _get_grouper 
    in_axis, name, gpr = True, gpr, obj[gpr] 
    File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1914, in __getitem__ 
    return self._getitem_column(key) 
    File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1921, in _getitem_column 
    return self._get_item_cache(key) 
    File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 1090, in _get_item_cache 
    values = self._data.get(item) 
    File "C:\Python34\lib\site-packages\pandas\core\internals.py", line 3102, in get 
    loc = self.items.get_loc(item) 
    File "C:\Python34\lib\site-packages\pandas\core\index.py", line 1692, in get_loc 
    return self._engine.get_loc(_values_from_object(key)) 
    File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:3979) 
    File "pandas\index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas\index.c:3843) 
    File "pandas\hashtable.pyx", line 668, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12265) 
    File "pandas\hashtable.pyx", line 676, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12216) 
KeyError: 'Vge' 

Can someone explain why this happens?

Answers

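Print df.columns to see the actual names of your columns. I suspect the column you call 'Vge' is not actually named 'Vge'. The header line in your file has a space after each comma and a trailing space at the end, and read_csv keeps that whitespace by default, so the name is most likely ' Vge ' rather than 'Vge'. 'Hour' works because it is the first field and has no surrounding spaces.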

If that is the case, rename the columns explicitly:

df.columns = ['Hour', 'L', 'Dr', 'Tag', 'Code', 'Vge'] 
gb = df.groupby('Vge') 
print(gb)
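
As an alternative to typing the names by hand, the sketch below reproduces the situation and strips the whitespace from every column name. It assumes the file really contains the spaces shown in the question; the data is read from an in-memory string (CSV_TEXT, a name made up for this example) instead of in.csv so the snippet is self-contained.

```python
import io

import pandas as pd

# Sample data from the question, including the space after each comma
# and the trailing space at the end of each line (assumed to match in.csv).
CSV_TEXT = (
    "Hour, L, Dr, Tag, Code, Vge \n"
    "0, L5, XI, PS, 4R,  15 \n"
    "5, L5, XI, PS, 4R,  12 \n"
    "2, L0, St, v2T, 4R,  11 \n"
    "8, L2, TI, sst, 4R,  8 \n"
    "12, L5, XI, PS, 4R,  18 \n"
)

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Printing the names as a list shows their repr, so stray spaces are visible.
raw_columns = df.columns.tolist()
print(raw_columns)

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# Grouping by the cleaned name no longer raises KeyError.
gb = df.groupby('Vge')
print(gb.size())
```

Passing skipinitialspace=True to read_csv is another option, but as far as I know it only drops spaces that follow a delimiter, so the trailing space in 'Vge ' would still be there. Stripping the names after loading handles both cases.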