2016-11-04 41 views
1

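Pandas: filter datetime columns, excluding NaN

I have a pandas DataFrame with two datetime columns, t1 and t2. I need to select all rows where t1 <= t2. t2 can be NaN.

Before pandas 0.19.0 I was able to do this.

Since pandas 0.19.0, this code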

import numpy as np
import pandas as pd
from datetime import datetime

dt = datetime.utcnow()
dt64 = np.datetime64(dt)
# t2 holds only None, so that column ends up with object dtype
df = pd.DataFrame([(dt64, None)], columns=['t1', 't2'])
df[(df.t1 <= df.t2)]

fails with:

Traceback (most recent call last): 
    File "workspace/python/MyTests/test1.py", line 87, in <module> 
    testDfTimeCompare() 
    File "workspace/python/MyTests/test1.py", line 80, in testDfTimeCompare 
    df[(df.t1<=df.t2)] 
    File "anaconda/lib/python2.7/site-packages/pandas/core/ops.py", line 813, in wrapper 
    return self._constructor(na_op(self.values, other.values), 
    File "anaconda/lib/python2.7/site-packages/pandas/core/ops.py", line 787, in na_op 
    y = y.view('i8') 
    File "anaconda/lib/python2.7/site-packages/numpy/core/_internal.py", line 367, in _view_is_safe 
    raise TypeError("Cannot change data-type for object array.") 
TypeError: Cannot change data-type for object array. 

What is the best way to achieve this?

Answers

2

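I think you need to convert the None values in column t2 to NaT with to_datetime. Then you can use the faster function Series.le, which does the same thing as <=: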

df.t2 = pd.to_datetime(df.t2)
print(df)
                          t1  t2
0 2016-11-04 07:24:53.372838 NaT

mask = df.t1.le(df.t2)
print(mask)
0    False
dtype: bool

mask = df.t1 <= df.t2
print(mask)
0    False
dtype: bool
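
To show the effect on more than one row, here is a minimal sketch of the same conversion approach. The sample timestamps are made up for illustration, and t2 is forced to object dtype only to mimic the situation in the question. After the conversion, rows where t2 is NaT compare as False and so drop out of the filtered result.

```python
import pandas as pd

# Hypothetical sample data: t2 is an object column mixing timestamps and None,
# which is the situation that triggered the TypeError in the question.
df = pd.DataFrame({
    "t1": pd.to_datetime(["2016-11-04 07:00", "2016-11-04 09:00", "2016-11-04 10:00"]),
    "t2": pd.Series(
        [pd.Timestamp("2016-11-04 08:00"), None, pd.Timestamp("2016-11-04 09:30")],
        dtype=object,
    ),
})

# Convert t2 to a real datetime column; None becomes NaT.
df["t2"] = pd.to_datetime(df["t2"])

# Any comparison against NaT evaluates to False, so those rows are excluded.
mask = df["t1"].le(df["t2"])
filtered = df[mask]
print(filtered)
```

If rows with a missing t2 should be kept instead, the mask could be combined with df["t2"].isna() using the | operator.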
0

Build a mask like this:

>>> mask = ((df <= 0).cumsum() > 0).any()
>>> mask
t1    False
t2     True
dtype: bool