2010-08-28 74 views
7

How do I filter a numpy.ndarray by date?

I have a 2D numpy.array whose first column contains datetime.datetime objects and whose second column contains integers:

A = array([[2002-03-14 19:57:38, 197], 
     [2002-03-17 16:31:33, 237], 
     [2002-03-17 16:47:18, 238], 
     [2002-03-17 18:29:31, 239], 
     [2002-03-17 20:10:11, 240], 
     [2002-03-18 16:18:08, 252], 
     [2002-03-23 23:44:38, 327], 
     [2002-03-24 09:52:26, 334], 
     [2002-03-25 16:04:21, 352], 
     [2002-03-25 18:53:48, 353]], dtype=object) 

What I would like to do is select all rows for a specific date, something like

A[first_column.date()==datetime.date(2002,3,17)] 
array([[2002-03-17 16:31:33, 237], 
      [2002-03-17 16:47:18, 238], 
      [2002-03-17 18:29:31, 239], 
      [2002-03-17 20:10:11, 240]], dtype=object) 

How can I do this?

Thanks for your insight :)

Answers

4

You can do this:

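import datetime

from_date=datetime.datetime(2002,3,17,0,0,0) 
to_date=from_date+datetime.timedelta(days=1) 
# half-open interval [from_date, to_date): keeps midnight on the 17th, drops midnight on the 18th
idx=(A[:,0]>=from_date) & (A[:,0]<to_date) 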
print(A[idx]) 
# array([[2002-03-17 16:31:33, 237], 
#  [2002-03-17 16:47:18, 238], 
#  [2002-03-17 18:29:31, 239], 
#  [2002-03-17 20:10:11, 240]], dtype=object) 

A[:,0] is the first column of A.

Unfortunately, comparing A[:,0] with a datetime.date object raises a TypeError. Comparing it with a datetime.datetime object works, however:

In [63]: A[:,0]>datetime.datetime(2002,3,17,0,0,0) 
Out[63]: array([False, True, True, True, True, True, True, True, True, True], dtype=bool) 
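
If the starting point is a datetime.date, one possible way around the TypeError is to turn it into a pair of datetime.datetime bounds first. This is only a minimal sketch; the names `day`, `start` and `end` are illustrative and do not appear in the original answer.

```python
import datetime

# The calendar day we want to select (illustrative value from the question)
day = datetime.date(2002, 3, 17)

# Midnight at the start of that day, as a datetime.datetime
start = datetime.datetime.combine(day, datetime.time.min)

# Midnight at the start of the following day
end = start + datetime.timedelta(days=1)

print(start, end)
```

Both `start` and `end` are datetime.datetime objects, so they can be compared against A[:,0] in the same way as in the session above.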

And, also unfortunately,

datetime.datetime(2002,3,17,0,0,0)<A[:,0]<=datetime.datetime(2002,3,18,0,0,0) 

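raises a TypeError as well, because it calls the __lt__ method of datetime.datetime rather than the __lt__ method of the numpy array. Perhaps this is a bug. (On Python 3 the same expression fails with a ValueError instead: a chained comparison is evaluated as two comparisons joined by `and`, and `and` needs a single truth value, which a multi-element array cannot provide.)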
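
To see where the chained form breaks, it can help to evaluate the two halves separately. The sketch below uses a small two-element column (`col`, `lo` and `hi` are illustrative names); it catches both TypeError and ValueError, on the assumption that which one is raised depends on the Python version in use.

```python
import datetime
import numpy as np

# A tiny object-dtype column of datetimes (illustrative data from the question)
col = np.array([datetime.datetime(2002, 3, 14, 19, 57, 38),
                datetime.datetime(2002, 3, 17, 16, 31, 33)], dtype=object)
lo = datetime.datetime(2002, 3, 17)
hi = datetime.datetime(2002, 3, 18)

# Each half of the comparison works elementwise on its own
left = lo < col
right = col <= hi
print(left, right)

# The chained form means "(lo < col) and (col <= hi)", which cannot be
# reduced to a single True/False for an array with several elements
try:
    lo < col <= hi
    failed = False
except (TypeError, ValueError) as exc:
    failed = True
    print(type(exc).__name__, exc)
```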

In any case, working around the problem is not hard; you can write

In [69]: (A[:,0]>datetime.datetime(2002,3,17,0,0,0)) & (A[:,0]<=datetime.datetime(2002,3,18,0,0,0)) 
Out[69]: array([False, True, True, True, True, False, False, False, False, False], dtype=bool) 

Since this gives you a boolean array, you can use it as a "fancy index" (a boolean mask) into A, which produces the desired result.
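
Putting the pieces together, here is a self-contained sketch that rebuilds the sample array from the question and applies a boolean mask. The `rows` list and the use of `strptime` to build the array are assumptions made for illustration; the question does not show how A was created.

```python
import datetime
import numpy as np

# Sample data copied from the question
rows = [
    ("2002-03-14 19:57:38", 197),
    ("2002-03-17 16:31:33", 237),
    ("2002-03-17 16:47:18", 238),
    ("2002-03-17 18:29:31", 239),
    ("2002-03-17 20:10:11", 240),
    ("2002-03-18 16:18:08", 252),
    ("2002-03-23 23:44:38", 327),
    ("2002-03-24 09:52:26", 334),
    ("2002-03-25 16:04:21", 352),
    ("2002-03-25 18:53:48", 353),
]

# Object-dtype 2D array: column 0 holds datetime.datetime, column 1 holds int
A = np.array(
    [[datetime.datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), value]
     for stamp, value in rows],
    dtype=object,
)

# Half-open interval covering the whole of 2002-03-17
from_date = datetime.datetime(2002, 3, 17)
to_date = from_date + datetime.timedelta(days=1)

# Elementwise comparisons combined with & give a boolean mask
idx = (A[:, 0] >= from_date) & (A[:, 0] < to_date)
selected = A[idx]
print(selected)
```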

2
from datetime import datetime as dt, timedelta as td 
import numpy as np 

# Create 2-d numpy array 
d1 = dt.now() 
d2 = dt.now() 
d3 = dt.now() - td(1) 
d4 = dt.now() - td(1) 
d5 = d1 + td(1) 
arr = np.array([[d1, 1], [d2, 2], [d3, 3], [d4, 4], [d5, 5]]) 

# Here we will extract all the data for today, so get date range in datetime 
dtx = d1.replace(hour=0, minute=0, second=0, microsecond=0) 
dty = dtx + td(hours=24) 

# Condition 
cond = np.logical_and(arr[:, 0] >= dtx, arr[:, 0] < dty) 

# Full array 
print(arr) 
# Extracted array for the range 
print(arr[cond, :]) 
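
For something closer to the `first_column.date() == ...` syntax in the question, another option is to call `.date()` on each element and build the mask from that. This is only a sketch: `rows_on_date` is a hypothetical helper, not part of NumPy or of either answer, and the per-element loop may be slow on large arrays.

```python
import datetime
import numpy as np


def rows_on_date(arr, day):
    """Return the rows of arr whose first-column datetime falls on `day`.

    Hypothetical helper: arr is assumed to be a 2D object array with
    datetime.datetime values in column 0, and day a datetime.date.
    """
    mask = np.array([stamp.date() == day for stamp in arr[:, 0]], dtype=bool)
    return arr[mask]


# Small illustrative array
sample = np.array(
    [[datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
     [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
     [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
     [datetime.datetime(2002, 3, 18, 16, 18, 8), 252]],
    dtype=object,
)

picked = rows_on_date(sample, datetime.date(2002, 3, 17))
print(picked)
```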

+1 for pointing me to datetime.replace() – 2011-12-10 18:38:39