How do I resolve a pandas problem related to Series.fillna()?

I just upgraded from pandas 0.11 to 0.13.0rc1. The upgrade caused an error related to Series.fillna():

>>> df 
                   sales  net_pft
STK_ID RPT_Date
600809 20060331   5.8951   1.1241
       20060630   8.3031   1.5464
       20060930  11.9084   2.2990
       20061231      NaN   2.6060
       20070331   5.9129   1.3334

[5 rows x 2 columns] 
>>> type(df['sales']) 
<class 'pandas.core.series.Series'> 
>>> df['sales'] = df['sales'].fillna(df['net_pft']) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "D:\Python27\lib\site-packages\pandas\core\generic.py", line 1912, in fillna 
    obj.fillna(v, inplace=True) 
AttributeError: 'numpy.float64' object has no attribute 'fillna' 
>>>

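Why does df['sales'] become a 'numpy.float64' object when fillna() is called on it? What is the correct way to fill the NaN values of one column with the values of another column?

Source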

2013-12-17 bigbug

There is a recent discussion about this, and it is fixed in pandas master: https://github.com/pydata/pandas/issues/5703 – joris

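There is a recent discussion about this, and it is fixed in pandas master: https://github.com/pydata/pandas/issues/5703 (the fix landed after the 0.13rc1 release, so it will be included in the final 0.13).

Note: the behaviour changed! Passing a Series as the input to fillna was not supported behaviour in pandas <= 0.12, as @behzad.nouri pointed out. It did work, but apparently it filled by position, which is wrong. As long as the two Series (df['sales'] and df['net_pft'] in your case) have the same index, this does not matter.
In pandas 0.13 it will be supported, but the fill will be aligned on the index of the Series. See the comment here: https://github.com/pydata/pandas/issues/5703#issuecomment-30663525

Source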
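As a minimal sketch of the index-based behaviour described above, assuming a pandas release in which Series.fillna accepts a Series (0.13 final or later): the frame is rebuilt from the question's data, while `s` and `other` are hypothetical names used only to show that the fill follows labels rather than positions.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
index = pd.MultiIndex.from_product(
    [[600809], [20060331, 20060630, 20060930, 20061231, 20070331]],
    names=["STK_ID", "RPT_Date"],
)
df = pd.DataFrame(
    {
        "sales": [5.8951, 8.3031, 11.9084, np.nan, 5.9129],
        "net_pft": [1.1241, 1.5464, 2.2990, 2.6060, 1.3334],
    },
    index=index,
)

# Fill the missing 'sales' entry from 'net_pft'; both columns share one index.
df["sales"] = df["sales"].fillna(df["net_pft"])

# Hypothetical Series whose labels are ordered differently: the value for
# label "b" is taken from other["b"], not from the second position of other.
s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
other = pd.Series([20.0, 10.0, 30.0], index=["b", "a", "c"])
filled = s.fillna(other)
```

Because the two columns in the question share the same index, position-based and index-based filling give the same result there; the second pair of Series is the case where they would differ.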

2013-12-17 14:31:34 joris

Thanks for your answer. – bigbug

It seems that what you are trying to do is more like this:

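idx = df['sales'].isnull()
# Assign through .loc in one step; chained indexing (df['sales'][idx] = ...) may write to a copy.
df.loc[idx, 'sales'] = df.loc[idx, 'net_pft']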

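The error occurs because the value argument you are passing to fillna is a Series, so the code enters the branch quoted below and, for each key of the supplied Series, calls fillna on the matching item of self. If self were a DataFrame this would work correctly, that is, every column would be filled using the supplied Series. Since self here is a Series, each item looked up is a scalar numpy.float64, which has no fillna method, and the call breaks.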
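The failure described above can be reproduced in isolation. This is only a sketch of the lookup step, not the pandas implementation; the two short Series are hypothetical stand-ins for the question's columns.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row stand-ins for df['sales'] and df['net_pft'].
sales = pd.Series([5.8951, np.nan], index=[20060331, 20061231])
net_pft = pd.Series([1.1241, 2.6060], index=[20060331, 20061231])

# Mirror the loop in the quoted branch: for each key of the fill values,
# look up the matching item of the object being filled.
elements = [sales.loc[k] for k in net_pft.index]

# Each looked-up item is a NumPy scalar, not a Series, so it has no fillna.
has_fillna = [hasattr(e, "fillna") for e in elements]
```

When the object being filled is a DataFrame, the same lookup returns a column (a Series), which is why the branch works in that case.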

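According to the documentation for fillna on a DataFrame, the value argument can be

alternately a dict of values specifying which value to use for each column (columns not in the dict will not be filled).

From the source code below, if value is a Series, it works the same way as a dict: the index of the Series supplies the keys used to fillna the corresponding columns.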

else:  # value is not None
    if method is not None:
        raise ValueError('cannot specify both a fill method and value')

    if len(self._get_axis(axis)) == 0:
        return self
    if isinstance(value, (dict, com.ABCSeries)):
        if axis == 1:
            raise NotImplementedError('Currently only can fill '
                                      'with dict/Series column '
                                      'by column')

        result = self if inplace else self.copy()
        for k, v in compat.iteritems(value):
            if k not in result:
                continue
            obj = result[k]
            obj.fillna(v, inplace=True)
        return result
    else:
        new_data = self._data.fillna(value, inplace=inplace,
                                     downcast=downcast)
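The dict/Series equivalence for a DataFrame can be illustrated with a small sketch. The frame and the fill values here are hypothetical and are not taken from the question; the point is only that a Series indexed by column name behaves like a dict keyed by column name.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps in every column.
frame = pd.DataFrame(
    {"a": [1.0, np.nan], "b": [np.nan, 2.0], "c": [np.nan, np.nan]}
)

# Per-column fill values; column "c" is deliberately left out.
fill_values = {"a": 0.0, "b": -1.0}

by_dict = frame.fillna(fill_values)
by_series = frame.fillna(pd.Series(fill_values))
```

Column "c" has no entry in the fill values, so it stays NaN in both results, in line with the documentation sentence quoted above.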

Source

2013-12-17 12:06:06

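Note that it works fine in 0.12, so I'm not sure it is not a bug – alko

You are right that using a Series was not supported, although it did work (the error the OP hits was introduced during the development of 0.13). The documentation you cite is for DataFrame, though, not for a Series, and the OP has a Series. From 0.13 onwards, using a Series will be supported, and for a DataFrame it will work as you explain. – joris

@joris The Series documentation for 0.13, which is what the OP's question is about, is the same as the documentation linked above (see [here](http://pandas.pydata.org/pandas-docs/dev/generated/pandas.Series.fillna.html#pandas.Series.fillna)) –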
