我試圖複製一個例子了韋斯·麥金尼的書對大熊貓的代碼是在這裏(它假定名稱的文件夾下的所有名稱的數據文件都)熊貓集團示例錯誤
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
years = range(1880, 2011)
pieces = []
columns = ['name', 'sex', 'births']
for year in years:
path = 'names/yob%d.txt' % year
frame = pd.read_csv(path, names=columns)
frame['year'] = year
pieces.append(frame)
names = pd.concat(pieces, ignore_index=True)
names
def get_tops(group):
return group.sort_index(by='births', ascending=False)[:1000]
grouped = names.groupby(['year','sex'])
grouped.apply(get_tops)
我使用熊貓0.10 Python 2.7。我看到的錯誤是這樣的:
Traceback (most recent call last):
File "names.py", line 21, in <module>
grouped.apply(get_tops)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 321, in apply
return self._python_apply_general(f)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 324, in _python_apply_general
keys, values, mutated = self.grouper.apply(f, self.obj, self.axis)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 585, in apply
values, mutated = splitter.fast_apply(f, group_keys)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/groupby.py", line 2127, in fast_apply
results, mutated = lib.apply_frame_axis0(sdata, f, names, starts, ends)
File "reduce.pyx", line 421, in pandas.lib.apply_frame_axis0 (pandas/lib.c:24934)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2028, in __setattr__
self[name] = value
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2043, in __setitem__
self._set_item(key, value)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2078, in _set_item
value = self._sanitize_column(key, value)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.10.0-py2.7-linux-i686.egg/pandas/core/frame.py", line 2112, in _sanitize_column
raise AssertionError('Length of values does not match '
AssertionError: Length of values does not match length of index
任何想法?
對此我很抱歉:我對自己介紹0.10的這個bug非常惱火,它在git repo中得到了修復,我將在熊貓發佈過程中添加「測試所有書本代碼」。 –