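numpy: aggregate 4-D array by groups
I have a numpy array of shape [t, z, x, y] representing an hourly time series of three-dimensional data. The axes of the array are time, vertical coordinate, horizontal coordinate 1, and horizontal coordinate 2. There is also a t-element list of hourly datetime.datetime timestamps.
I want to calculate the midday mean for each day. The result would be an [nday, z, x, y] array.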
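To pin down the target shape, here is a minimal sketch for the special case where the series starts at 00:00 and covers whole days, so the time axis can be folded into (day, hour of day). The 15:00 to 23:00 window and the names `by_day` and `midday_mean` are assumptions made for illustration; it does not handle partial days or irregular timestamps.

```python
import numpy as np

# Assumption: 72 hourly steps starting at 00:00, i.e. exactly 3 whole days.
data = np.zeros((72, 3, 4, 4))
data += np.arange(72).reshape(-1, 1, 1, 1)  # each time step holds its own index

nday = data.shape[0] // 24
# Fold the time axis into (day, hour of day); the spatial axes are untouched.
by_day = data.reshape(nday, 24, *data.shape[1:])
# Average hours 15..23 inclusive (the assumed "midday" window) within each day.
midday_mean = by_day[:, 15:24].mean(axis=1)

print(midday_mean.shape)  # (3, 3, 4, 4)
```

This only works when every day contributes exactly 24 consecutive rows, which is why a timestamp-based grouping is still needed in general.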
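I am trying to find a Pythonic way to do this. I have written something with a bunch of for loops that works, but it seems slow, inflexible, and verbose.
It seems to me that pandas is not the solution here, because my time series data are three-dimensional. I would be happy to be proven wrong.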
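On the pandas point, one possible workaround is to flatten the three spatial axes into columns, let pandas do the grouping, and reshape afterwards. The following is only a sketch under that assumption; `frame`, `midday`, `daily`, and `days` are illustrative names, and the hourly frequency is passed as a `pd.Timedelta` just to sidestep differences in frequency aliases between pandas versions.

```python
import numpy as np
import pandas as pd

# Pseudo-data: 72 hourly steps, 3 levels, 4 by 4 grid; each step holds its index.
data = np.zeros((72, 3, 4, 4))
data += np.arange(72).reshape(-1, 1, 1, 1)
t = pd.date_range("2008-07-01", periods=72, freq=pd.Timedelta(hours=1))

# Flatten (z, x, y) into one column axis so each row is one time step.
frame = pd.DataFrame(data.reshape(data.shape[0], -1), index=t)

# Keep the assumed midday window (15:00 to 23:00 UTC), then average per date.
midday = frame[(frame.index.hour >= 15) & (frame.index.hour <= 23)]
daily = midday.groupby(midday.index.date).mean()

# Restore the spatial axes: the result is [nday, z, x, y].
midday_mean = daily.to_numpy().reshape(-1, *data.shape[1:])
days = list(daily.index)
```

The cost is one 2-D copy of the data, which may matter for large grids.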
I came up with this, using itertools, to find the midday timestamps and group them by date, and now I am trying to apply imap to find the means.
import itertools
from datetime import datetime

import numpy as np
import pandas as pd

# create 72 hours of pseudo-data with 3 vertical levels and a 4 by 4
# horizontal grid.
data = np.zeros((72, 3, 4, 4))
t = pd.date_range(datetime(2008, 7, 1), freq='h', periods=72)
for i in range(data.shape[0]):
    data[i, ...] = i

# find the timestamps that are "midday" in North America. We'll
# define midday as between 15:00 and 23:00 UTC, which is 10:00 EST to
# 15:00 PST.
def is_midday(this_t):
    return 15 <= this_t.hour <= 23

# group the midday timestamps by date
for dt, grp in itertools.groupby(filter(is_midday, t),
                                 key=lambda x: x.date()):
    print('date ' + str(dt))
    for g in grp:
        print(g)

# find means of mid-day data by date
data_list = np.split(data, data.shape[0])
grps = itertools.groupby(filter(is_midday, t),
                         key=lambda x: x.date())
# how to apply map (itertools.imap in Python 2), or something else, to
# data_list and grps? Or somehow split data along axis 0 according to grps?
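
One possible way to tie the groups back to the data, sketched here rather than offered as a definitive answer, is to group (index, timestamp) pairs instead of bare timestamps, so each group remembers which rows of the array it covers. The names `midday_pairs`, `days`, and `means` are illustrative, and the sketch assumes the timestamps are in chronological order, since itertools.groupby only merges adjacent items.

```python
import itertools

import numpy as np
import pandas as pd

# Pseudo-data: 72 hourly steps, 3 levels, 4 by 4 grid; each step holds its index.
data = np.zeros((72, 3, 4, 4))
data += np.arange(72).reshape(-1, 1, 1, 1)
t = pd.date_range("2008-07-01", periods=72, freq=pd.Timedelta(hours=1))

def is_midday(this_t):
    # Assumed midday window: 15:00 to 23:00 UTC inclusive.
    return 15 <= this_t.hour <= 23

# Pair each midday timestamp with its row index along axis 0.
midday_pairs = [(i, ts) for i, ts in enumerate(t) if is_midday(ts)]

days, means = [], []
for day, grp in itertools.groupby(midday_pairs, key=lambda p: p[1].date()):
    idx = [i for i, _ in grp]             # rows of data belonging to this date
    days.append(day)
    means.append(data[idx].mean(axis=0))  # average over time only

midday_mean = np.stack(means)             # shape [nday, z, x, y]
```

Indexing with the collected row numbers avoids splitting the array into a list of single-step pieces first.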