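Python pandas: storing a Series subclass in a DataFrame column

I would like to create a DataFrame that contains a number of different Series subclasses I have defined. However, the subclass seems to be stripped from the Series when it is assigned to the DataFrame.
Here is a toy example that illustrates the problem: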
>>> import pandas as pd
>>> class SeriesSubclass(pd.Series):
...     @property
...     def _constructor(self):
...         return SeriesSubclass
...     def times_two(self):
...         """Method I need in this subclass."""
...         return self * 2
...
>>> subclass = SeriesSubclass([7, 8, 9])
>>> type(subclass)  # fine
<class '__main__.SeriesSubclass'>
>>> subclass.times_two()  # fine
0    14
1    16
2    18
dtype: int64
>>>
>>> data = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('ABC'))
>>> data['D'] = subclass
>>> type(data['D'])  # not good
<class 'pandas.core.series.Series'>
>>> data['D'].times_two()  # not good
Traceback (most recent call last):
  ...
AttributeError: 'Series' object has no attribute 'times_two'
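I have seen that this issue may have been raised before in #1713, but I cannot work out what the actual solution was. With a library this large, it is hard to follow the various PRs, documentation versions, and so on. As far as I can tell, the subclassing mechanics are also not described very well (this seems to be it).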
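
For reference, here is a minimal sketch of the one subclassing route I am aware of, assuming a reasonably recent pandas version. It pairs the Series subclass with a DataFrame subclass through the `_constructor_sliced` and `_constructor_expanddim` hooks. `FrameSubclass` is a hypothetical name introduced for this illustration. Note that this does not really do what I am after: every column of the frame comes back as `SeriesSubclass`, not only `D`, because the DataFrame does not keep the original Series objects.

```python
import pandas as pd


class SeriesSubclass(pd.Series):
    @property
    def _constructor(self):
        # Results of Series operations stay in the subclass.
        return SeriesSubclass

    @property
    def _constructor_expanddim(self):
        # Used when the Series is expanded to a frame, e.g. via to_frame().
        return FrameSubclass

    def times_two(self):
        """Method I need in this subclass."""
        return self * 2


class FrameSubclass(pd.DataFrame):
    # Hypothetical companion class, not part of the original example.
    @property
    def _constructor(self):
        return FrameSubclass

    @property
    def _constructor_sliced(self):
        # Columns selected from the frame are built with this class.
        return SeriesSubclass


data = FrameSubclass([[1, 2, 3], [4, 5, 6]], columns=list("ABC"))
data["D"] = SeriesSubclass([7, 8])

print(type(data["D"]))         # the column comes back as SeriesSubclass
print(data["D"].times_two())   # the custom method is available again
print(type(data["A"]))         # but so does every other column
```

So this keeps `times_two` reachable, at the cost of a single Series subclass per frame. It does not let different columns carry different subclasses.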