2015-10-27 235 views

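Pandas TimeSeries resample produces NaN

I am resampling a pandas TimeSeries. The series consists of binary values (it is a categorical variable) and has no missing values, yet NaN values appear after resampling. How is this possible?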

I can't post any sample data here because it is sensitive information, but I create and resample the series as follows:

series = pd.Series(data, ts) 
series_rs = series.resample('60T', how='mean') 

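If you are upsampling, the default behaviour is to introduce NaN values. Beyond that, it is hard to comment further without representative sample code. – EdChum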

Answer

Upsampling converts the series to a regular time interval, so any interval that contains no samples becomes NaN.

You can fill the missing values backward with fill_method='bfill', or forward with fill_method='ffill' or fill_method='pad':

import pandas as pd 

ts = pd.date_range('1/1/2015', periods=10, freq='100T') 
data = range(10) 
series = pd.Series(data, ts) 
print series 
#2015-01-01 00:00:00 0 
#2015-01-01 01:40:00 1 
#2015-01-01 03:20:00 2 
#2015-01-01 05:00:00 3 
#2015-01-01 06:40:00 4 
#2015-01-01 08:20:00 5 
#2015-01-01 10:00:00 6 
#2015-01-01 11:40:00 7 
#2015-01-01 13:20:00 8 
#2015-01-01 15:00:00 9 
#Freq: 100T, dtype: int64 
# Resample to 60-minute bins; bins that contain no observation become NaN
series_rs = series.resample('60T', how='mean') 
print series_rs 
#2015-01-01 00:00:00  0 
#2015-01-01 01:00:00  1 
#2015-01-01 02:00:00 NaN 
#2015-01-01 03:00:00  2 
#2015-01-01 04:00:00 NaN 
#2015-01-01 05:00:00  3 
#2015-01-01 06:00:00  4 
#2015-01-01 07:00:00 NaN 
#2015-01-01 08:00:00  5 
#2015-01-01 09:00:00 NaN 
#2015-01-01 10:00:00  6 
#2015-01-01 11:00:00  7 
#2015-01-01 12:00:00 NaN 
#2015-01-01 13:00:00  8 
#2015-01-01 14:00:00 NaN 
#2015-01-01 15:00:00  9 
#Freq: 60T, dtype: float64 
# Same resample, but back-fill each empty bin from the next observed value
series_rs = series.resample('60T', how='mean', fill_method='bfill') 
print series_rs 
#2015-01-01 00:00:00 0 
#2015-01-01 01:00:00 1 
#2015-01-01 02:00:00 2 
#2015-01-01 03:00:00 2 
#2015-01-01 04:00:00 3 
#2015-01-01 05:00:00 3 
#2015-01-01 06:00:00 4 
#2015-01-01 07:00:00 5 
#2015-01-01 08:00:00 5 
#2015-01-01 09:00:00 6 
#2015-01-01 10:00:00 6 
#2015-01-01 11:00:00 7 
#2015-01-01 12:00:00 8 
#2015-01-01 13:00:00 8 
#2015-01-01 14:00:00 9 
#2015-01-01 15:00:00 9 
#Freq: 60T, dtype: float64 
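
For readers on a newer pandas, here is a rough sketch of the same example using the method-chaining form of resample. It assumes a pandas version in which the `how` and `fill_method` arguments are no longer accepted, and it uses the `'min'` frequency alias in place of `'T'`. The variable names `hourly`, `hourly_bfill` and `hourly_ffill` are only illustrative. Filling after the aggregation is, as far as I can tell, what `fill_method` did here, since it reproduces the back-filled output shown above.

```python
import pandas as pd

# Same toy series as in the answer: one observation every 100 minutes.
ts = pd.date_range('2015-01-01', periods=10, freq='100min')
series = pd.Series(range(10), index=ts)

# Chained counterpart of resample('60T', how='mean'):
# hourly bins that contain no observation come back as NaN.
hourly = series.resample('60min').mean()

# Assumed counterpart of fill_method='bfill' / 'ffill':
# aggregate first, then fill the empty bins.
hourly_bfill = hourly.bfill()   # take the next observed value
hourly_ffill = hourly.ffill()   # carry the previous observed value forward

print(hourly)
print(hourly_bfill)
```

If the NaN values should stay visible instead, skip the fill step and count them with `hourly.isna().sum()`.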

Thanks, that solved it. –

Great. You can upvote or accept it - [info](http://stackoverflow.com/tour) – jezrael

What do the different fill methods do? The pandas documentation on them is rather limited. ffill and bfill are self-explanatory, but what about pad? –
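
On the pad question: as far as I know, 'pad' is simply another name for 'ffill' (carry the last valid observation forward). A small sketch that checks this with `reindex`, which accepts both names for its `method` argument. The series `s` and the index `hourly_index` are invented for illustration and are not from the thread.

```python
import pandas as pd

# Two observations, two hours apart (made-up values).
s = pd.Series(
    [1.0, 2.0],
    index=pd.to_datetime(['2015-01-01 00:00', '2015-01-01 02:00']),
)

# Target index with an extra timestamp at 01:00 that has no observation.
hourly_index = pd.date_range('2015-01-01 00:00', periods=3, freq='60min')

padded = s.reindex(hourly_index, method='pad')     # forward fill
ffilled = s.reindex(hourly_index, method='ffill')  # same thing, other name
bfilled = s.reindex(hourly_index, method='bfill')  # backward fill

print(padded.equals(ffilled))
print(padded.tolist(), bfilled.tolist())
```

With 'pad' the 01:00 slot takes the value from 00:00, while with 'bfill' it takes the value from 02:00.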