Python: daily data from minute data

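I have a DataFrame whose 'index' column holds minute-level datetime.datetime values. I want to run a loop in which each iteration works only on the data for one given day. Is there a better way to do this than the following?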

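import datetime

# Truncate every timestamp to midnight of its calendar day
data['index_date'] = data['index'].apply(lambda dt: datetime.datetime(dt.year, dt.month, dt.day, 0, 0))

days = data['index_date'].unique()

for day in days:
    # Keep only the rows that fall on this day
    data_day = data[data['index_date'] == day]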

Just a sample of what the 'data' DataFrame looks like:

>>> data
Out[8]:
                    index     90    180
2016-01-04 02:30:00-05:00  1.000  1.000
2016-01-04 02:31:00-05:00  1.000  1.000
2016-01-04 02:32:00-05:00  1.000  1.000
2016-01-04 02:33:00-05:00  1.000  1.000
2016-01-04 02:34:00-05:00  1.000  1.000
...                          ...    ...
2016-07-26 12:51:00-04:00  1.000  1.000
2016-07-26 12:52:00-04:00  1.000  1.000
2016-07-26 12:53:00-04:00  1.000  1.000
2016-07-26 12:54:00-04:00  1.000  1.000
2016-07-26 12:55:00-04:00  1.000  1.000
2016-07-26 12:56:00-04:00  1.000  1.000

2016-11-09 dayum

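What do you want to do with 'data_day'? Do you want the mean? All of the rows? If you want all of them, why group at all? Usually you group in order to aggregate or transform. What are you trying to do? – piRSquared

I will be doing several different things. For starters, I need to run a PCA on each day's data and get the first eigenvector. – dayum

Then that is what you pass to the apply function in my answer. – piRSquared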

Consider the DataFrame df:

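import numpy as np
import pandas as pd

n = 10000
# One row per minute; 'min' is the one-minute frequency alias (the older 'T' alias is deprecated)
df = pd.DataFrame({'index': pd.date_range('2010-01-01', periods=n, freq='min'),
                   90: np.random.rand(n) * 10,
                   100: np.random.randn(n) * 100})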

Then you can get a dictionary keyed by day:

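# pd.Grouper(freq='D') replaces pd.TimeGrouper('D'), which no longer exists in current pandas
g = df.set_index('index').groupby(pd.Grouper(freq='D'))
# Map each day (a Timestamp at midnight) to that day's rows
d = {k: v for k, v in g}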
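
If the dictionary is only needed for looping, the groupby object can also be iterated directly. This is a minimal sketch, assuming a df built like the one above; the fixed seed and the name day_sizes are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same shape of data as the example df, with a fixed seed (an assumption, for reproducibility)
rng = np.random.default_rng(0)
n = 10000
df = pd.DataFrame({'index': pd.date_range('2010-01-01', periods=n, freq='min'),
                   90: rng.random(n) * 10,
                   100: rng.standard_normal(n) * 100})

g = df.set_index('index').groupby(pd.Grouper(freq='D'))

day_sizes = {}  # hypothetical container: number of minute rows per day
for day, data_day in g:
    # day is a Timestamp at midnight; data_day holds only that day's rows
    day_sizes[day] = len(data_day)
```

With real data that skips calendar days (weekends, for example), a daily Grouper may yield empty frames for the missing days, so a check such as data_day.empty inside the loop can be useful.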

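Or a Panel (pd.Panel was deprecated in pandas 0.20 and removed in later releases, so this line only runs on old versions):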

p = pd.Panel(d)

Or a single DataFrame with the day as the outer index level:

# d.values must be called; the keys become the outer level of a MultiIndex
dfg = pd.concat(list(d.values()), keys=list(d.keys()))
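
Once the days are concatenated like this, a single day can be pulled back out by its key. This is a rough sketch under the same assumptions as the example df (fixed seed added for reproducibility; the name one_day is illustrative).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10000
df = pd.DataFrame({'index': pd.date_range('2010-01-01', periods=n, freq='min'),
                   90: rng.random(n) * 10,
                   100: rng.standard_normal(n) * 100})

# Dictionary of per-day frames, then one frame with a two-level index
d = {k: v for k, v in df.set_index('index').groupby(pd.Grouper(freq='D'))}
dfg = pd.concat(list(d.values()), keys=list(d.keys()))

# Outer level is the day, inner level is the original minute timestamp;
# selecting on the outer level returns that day's rows indexed by minute
one_day = dfg.loc[pd.Timestamp('2010-01-03')]
```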

2016-11-09 01:37:30 piRSquared

@dayum You have to do some work to show what you want. – piRSquared
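
For the per-day PCA mentioned in the comments, one possible sketch is below. It assumes PCA here means an eigendecomposition of each day's covariance matrix, which the thread does not confirm; the helper first_eigenvector and the name pcs are hypothetical, and the data is the example df with a fixed seed.

```python
import numpy as np
import pandas as pd


def first_eigenvector(day_df):
    """Hypothetical helper: leading eigenvector of one day's covariance matrix."""
    cov = np.cov(day_df.to_numpy(), rowvar=False)  # columns are the variables
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns eigenvalues in ascending order
    return pd.Series(eigvecs[:, -1], index=day_df.columns)


rng = np.random.default_rng(0)
n = 10000
df = pd.DataFrame({'index': pd.date_range('2010-01-01', periods=n, freq='min'),
                   90: rng.random(n) * 10,
                   100: rng.standard_normal(n) * 100})

g = df.set_index('index').groupby(pd.Grouper(freq='D'))

# One row per day, one column per original variable;
# days with fewer than two rows are skipped because a covariance needs at least two
pcs = pd.DataFrame({day: first_eigenvector(data_day)
                    for day, data_day in g if len(data_day) > 1}).T
```

The sign of each eigenvector is arbitrary, so results from different days may need a consistent sign convention before they are compared. The same helper could presumably be passed to g.apply, as the comment above suggests.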
