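Strange results with groupby, transform, and NaNs
Edit (19-May-2015): I just confirmed that this has been fixed as of version 0.16.1, so it should not be an issue in the latest version.
These should all give the same result, right?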
df.groupby(level=0).transform('mean')
df.groupby(level=0)['x'].transform(np.nanmean)
df.groupby(level=0)['x'].transform('mean')
The first two work, but the third does not. Could this be a bug?
df = pd.DataFrame({ 'x':[1,np.nan,3,4] }, index=[1,1,2,2],)
df
Out[686]:
x
1 1
1 NaN
2 3
2 4
df.groupby(level=0).transform('mean')
Out[687]:
x
1 1.0
1 1.0
2 3.5
2 3.5
df.groupby(level=0)['x'].transform(np.nanmean)
Out[688]:
1 1.0
1 1.0
2 3.5
2 3.5
Name: x, dtype: float64
That is all fine, but this is not:
df.groupby(level=0)['x'].transform('mean')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-691-24761ee742fd> in <module>()
----> 1 df.groupby(level=0)['x'].transform('mean')
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\groupby.pyc in transform(self, func, *args, **kwargs)
2411 # if string function
2412 if isinstance(func, compat.string_types):
-> 2413 return self._transform_fast(lambda : getattr(self, func)(*args, **kwargs))
2414
2415 # do we have a cython function
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\groupby.pyc in _transform_fast(self, func)
2457 values = np.repeat(values, com._ensure_platform_int(counts))
2458
-> 2459 return self._set_result_index_ordered(Series(values))
2460
2461 def filter(self, func, dropna=True, *args, **kwargs):
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\groupby.pyc in _set_result_index_ordered(self, result)
495 result = result.sort_index()
496
--> 497 result.index = self.obj.index
498 return result
499
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\generic.pyc in __setattr__(self, name, value)
1978 try:
1979 object.__getattribute__(self, name)
-> 1980 return object.__setattr__(self, name, value)
1981 except AttributeError:
1982 pass
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\lib.pyd in pandas.lib.AxisProperty.__set__ (pandas\lib.c:38795)()
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\series.pyc in _set_axis(self, axis, labels, fastpath)
266 object.__setattr__(self, '_index', labels)
267 if not fastpath:
--> 268 self._data.set_axis(axis, labels)
269
270 def _set_subtyp(self, is_all_dates):
C:\Users\eilerj\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\internals.pyc in set_axis(self, axis, new_labels)
2209 if new_len != old_len:
2210 raise ValueError('Length mismatch: Expected axis has %d elements, '
-> 2211 'new values have %d elements' % (old_len, new_len))
2212
2213 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
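The "3 elements" in the error matches what you would get if the per-group means were repeated by each group's non-NaN count rather than by its full size; the `np.repeat(values, ...counts)` line in the traceback hints at this, but that reading is my assumption, not something confirmed from the pandas source. A minimal sketch of that arithmetic, plus a workaround that avoids the string shortcut (variable names are mine):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 3, 4]}, index=[1, 1, 2, 2])
grouped = df.groupby(level=0)['x']

group_means = grouped.mean().values      # one mean per group, NaN skipped
non_nan_counts = grouped.count().values  # count() ignores NaN
group_sizes = grouped.size().values      # size() includes NaN rows

# Assumed cause: repeating by non-NaN counts comes up one value short
# of the 4-row index, which would explain the length mismatch.
short = np.repeat(group_means, non_nan_counts)
full = np.repeat(group_means, group_sizes)
print(len(short), len(full))

# Workaround: pass a callable instead of the string 'mean'.
workaround = grouped.transform(lambda s: s.mean())
print(workaround.tolist())
```

On a version that includes the fix, `grouped.transform('mean')` should give the same values as `workaround`.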
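I think this is a bug, which I should have fixed here (https://github.com/pydata/pandas/pull/9699). Could you check against trunk pandas to confirm? – DSM 2015-03-31 20:52:53
@DSM Sorry, I don't know how to check trunk pandas. This is with the latest pandas (16.0), but it looks like you may have fixed it only a few days ago. I'll leave this up for now, but let me know if I should delete the question. – JohnE 2015-03-31 21:12:18
I'm voting to close this question as off-topic because it is a bug report and would be better as a GitHub issue (although this one happens to be fixed already!). :) – 2015-03-31 22:20:52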