2016-08-17 85 views

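How can I process data in pandas without affecting NaN?

I have a Series of strings that contains some NaN values. I want to replace some characters and then convert the result to int (float is fine too), while the NaN entries stay NaN. For example: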

In [1]: df = pd.DataFrame(["type 12", None, "type13"], columns=['A'])
Out[1]: 
    A 
0 12 
1 NaN 
2 13 

Is there a good way to do this?

Could you provide a code snippet? – zarak

Please check [How to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and add the desired output. Thanks. – jezrael

Possible duplicate of [NumPy or Pandas: Keeping array type as integer while having a NaN value](http://stackoverflow.com/questions/11548005/numpy-or-pandas-keeping-array-type-as-integer-while-having-a-nan-value) –

Answers

1

No, unfortunately. You will have to settle for floats:

>>> s = pd.Series(['1', '2', '3', '4', '5'], index=list('abcde')) 
>>> s 
a 1 
b 2 
c 3 
d 4 
e 5 
dtype: object 
>>> s = s.reindex(['a','b','c','f','u']) 
>>> s 
a  1 
b  2 
c  3 
f NaN 
u NaN 
dtype: object 
>>> s.astype(int) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 2947, in astype 
    raise_on_error=raise_on_error, **kwargs) 
    File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 2873, in astype 
    return self.apply('astype', dtype=dtype, **kwargs) 
    File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 2832, in apply 
    applied = getattr(b, f)(**kwargs) 
    File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 422, in astype 
    values=values, **kwargs) 
    File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 465, in _astype 
    values = com._astype_nansafe(values.ravel(), dtype, copy=True) 
    File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/common.py", line 2628, in _astype_nansafe 
    return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape) 
    File "pandas/lib.pyx", line 937, in pandas.lib.astype_intsafe (pandas/lib.c:16620) 
    File "pandas/src/util.pxd", line 60, in util.set_value_at (pandas/lib.c:67979) 
ValueError: cannot convert float NaN to integer 

From the pandas Caveats and Gotchas:

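The special value NaN (Not-A-Number) is used everywhere as the NA value, and there are API functions isnull and notnull which can be used across the dtypes to detect NA values.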

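However, it comes with it a couple of trade-offs which I most certainly have not ignored... In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays.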

So work with this instead:

>>> s.astype(float) 
a 1.0 
b 2.0 
c 3.0 
f NaN 
u NaN 
dtype: float64 
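
As a side note, newer pandas releases (0.24 and later) also ship a nullable integer extension dtype, spelled "Int64" with a capital I, that can hold missing values. It did not exist when this answer was written, so the following is only a sketch under that version assumption, not part of the original answer. It goes through float first, as above, and then casts:

```python
import pandas as pd

# Same Series as in the answer: three numeric strings plus two missing labels.
s = pd.Series(['1', '2', '3', '4', '5'], index=list('abcde'))
s = s.reindex(['a', 'b', 'c', 'f', 'u'])

# Step 1: convert to float, which can represent the missing entries as NaN.
as_float = s.astype(float)

# Step 2 (assumes pandas >= 0.24): cast to the nullable integer dtype,
# which keeps the missing entries as missing instead of raising.
as_nullable_int = as_float.astype('Int64')
print(as_nullable_int)
```

If the pandas version in use is older than that, float64 remains the practical choice.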

I need to do some preprocessing first. Is there a good way to do that? – modkzs

@modkzs So you're saying your column values literally contain "type 12" as a string, and you want to extract the "12", for example? –

@JonClements Yes – modkzs
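
For the preprocessing discussed in these comments, one possible approach is to pull the digits out with a regular expression and then convert to float. This is a minimal sketch, assuming every non-missing value contains a single run of digits as in the question's "type 12" and "type13"; the pattern `(\d+)` is an illustrative choice, not something given in the thread:

```python
import pandas as pd

df = pd.DataFrame(["type 12", None, "type13"], columns=['A'])

# Extract the first run of digits from each string.
# Missing rows (and rows with no digits at all) come back as NaN.
digits = df['A'].str.extract(r'(\d+)', expand=False)

# Convert to float, since a NumPy integer column cannot hold NaN.
df['A'] = digits.astype(float)
print(df)
```

Because the `.str` accessor methods skip missing values, the NaN row passes through untouched and no explicit masking is needed.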