Indexing a pandas DataFrame fails, but a Series succeeds

Given the US market hours:

In [220]: market_hours = pandas.date_range(date + ' 09:30:00', date + ' 16:00:00', freq='15min', tz='US/Eastern').tz_convert('UTC') 

In [221]: market_hours 
Out[221]: 
<class 'pandas.tseries.index.DatetimeIndex'> 
[2014-04-29 13:30:00+00:00, ..., 2014-04-29 20:00:00+00:00] 
Length: 27, Freq: 15T, Timezone: UTC

I can resample() a single field and restrict the result to those market hours:

In [222]: df.set_index('localtime')['size'].resample('15min', how='sum')[market_hours] 
Out[222]: 
2014-04-29 13:30:00+00:00 1093142 
2014-04-29 13:45:00+00:00  556664 
2014-04-29 14:00:00+00:00  467662 
2014-04-29 14:15:00+00:00  460966 
2014-04-29 14:30:00+00:00  275805 
2014-04-29 14:45:00+00:00  192709 
2014-04-29 15:00:00+00:00  226375 
2014-04-29 15:15:00+00:00  175065 
2014-04-29 15:30:00+00:00  181047 
2014-04-29 15:45:00+00:00  129644 
2014-04-29 16:00:00+00:00  193330 
2014-04-29 16:15:00+00:00  170046 
2014-04-29 16:30:00+00:00  130674 
2014-04-29 16:45:00+00:00  107118 
2014-04-29 17:00:00+00:00  156699 
2014-04-29 17:15:00+00:00  153912 
2014-04-29 17:30:00+00:00  180449 
2014-04-29 17:45:00+00:00  223318 
2014-04-29 18:00:00+00:00  211324 
2014-04-29 18:15:00+00:00  152374 
2014-04-29 18:30:00+00:00  121876 
2014-04-29 18:45:00+00:00  90891 
2014-04-29 19:00:00+00:00  138222 
2014-04-29 19:15:00+00:00  167571 
2014-04-29 19:30:00+00:00  264658 
2014-04-29 19:45:00+00:00  492528 
2014-04-29 20:00:00+00:00  8354 
Freq: 15T, Name: size, dtype: int64

However, if I try to resample() a list of fields, I get an error:

In [223]: df.set_index('localtime')[['size']].resample('15min', how='sum')[market_hours] 
... 

KeyError: "['2014-04-29T09:30:00.000000000-0400' '2014-04-29T09:45:00.000000000-0400'\n '2014-04-29T10:00:00.000000000-0400' '2014-04-29T10:15:00.000000000-0400'\n '2014-04-29T10:30:00.000000000-0400' '2014-04-29T10:45:00.000000000-0400'\n '2014-04-29T11:00:00.000000000-0400' '2014-04-29T11:15:00.000000000-0400'\n '2014-04-29T11:30:00.000000000-0400' '2014-04-29T11:45:00.000000000-0400'\n '2014-04-29T12:00:00.000000000-0400' '2014-04-29T12:15:00.000000000-0400'\n '2014-04-29T12:30:00.000000000-0400' '2014-04-29T12:45:00.000000000-0400'\n '2014-04-29T13:00:00.000000000-0400' '2014-04-29T13:15:00.000000000-0400'\n '2014-04-29T13:30:00.000000000-0400' '2014-04-29T13:45:00.000000000-0400'\n '2014-04-29T14:00:00.000000000-0400' '2014-04-29T14:15:00.000000000-0400'\n '2014-04-29T14:30:00.000000000-0400' '2014-04-29T14:45:00.000000000-0400'\n '2014-04-29T15:00:00.000000000-0400' '2014-04-29T15:15:00.000000000-0400'\n '2014-04-29T15:30:00.000000000-0400' '2014-04-29T15:45:00.000000000-0400'\n '2014-04-29T16:00:00.000000000-0400'] not in index"

Is there a way to access the date range in the resulting DataFrame? This does not appear to have anything to do with the timezones.

Source

2014-04-30 chrisaycock

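In the first case, you are indexing a Series. In the second case (using df[['size']].resample(.., note the double square brackets), you are working with a DataFrame.
Basic indexing on a DataFrame (df[labels]) indexes the columns, not the rows (see http://pandas.pydata.org/pandas-docs/stable/indexing.html#basics). For that reason, you get the error that the labels are not in the (column) index.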
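To make the difference concrete, here is a minimal sketch on a toy frame. The frame, its row labels 'a', 'b', 'c' and all variable names are made up for illustration and are not taken from the question's data:

```python
import pandas as pd

# Hypothetical toy data: one column named 'size', rows labelled 'a', 'b', 'c'.
frame = pd.DataFrame({"size": [10, 20, 30]}, index=["a", "b", "c"])

single = frame["size"]      # single brackets -> Series
double = frame[["size"]]    # double brackets -> one-column DataFrame

# On a Series, [] with a list of labels looks them up in the row index.
rows_from_series = single[["a", "c"]]

# On a DataFrame, [] with a list of labels looks them up in the columns,
# so row labels are not found and a KeyError is raised.
try:
    double[["a", "c"]]
    raised = False
except KeyError:
    raised = True

print(type(single).__name__, type(double).__name__, raised)
```

The same lookup rule applies when the labels are timestamps, which is why the market_hours lookup works on the Series but not on the DataFrame.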

To get around this, you can use loc (assuming result is the outcome of the resampling):

result.loc[market_hours, :]
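As a rough end-to-end sketch of the above, the snippet below rebuilds a similar situation with synthetic data. The one-trade-per-minute frame is an assumption standing in for the question's df, which is not shown in the post, and the snippet uses the .resample(...).sum() spelling because newer pandas releases no longer accept how='sum':

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's df: one row per minute, UTC timestamps.
date = "2014-04-29"
times = pd.date_range(date + " 13:00:00", date + " 21:00:00", freq="1min", tz="UTC")
df = pd.DataFrame({"localtime": times, "size": np.arange(len(times))})

# Same market-hours index as in the question.
market_hours = pd.date_range(
    date + " 09:30:00", date + " 16:00:00", freq="15min", tz="US/Eastern"
).tz_convert("UTC")

# Double brackets keep the result a DataFrame after resampling.
result = df.set_index("localtime")[["size"]].resample("15min").sum()

# Series: [] looks the timestamps up in the row index, so this works.
series_result = result["size"][market_hours]

# DataFrame: [] looks the timestamps up in the columns, so this raises KeyError.
try:
    result[market_hours]
    frame_lookup_failed = False
except KeyError:
    frame_lookup_failed = True

# .loc with an explicit row selector selects the rows on the DataFrame.
frame_result = result.loc[market_hours, :]
print(frame_lookup_failed, len(frame_result))
```

Writing result.loc[market_hours] without the trailing ", :" should select the same rows; the explicit colon just makes it obvious that the first position addresses rows.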

Source

2014-04-30 20:11:16 joris
