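First value of each week in a pd.Series/DataFrame

Suppose I have a pd.Series of daily S&P 500 values, and I want to filter it to get the first business day of each week together with its value. For example, my filtered series would contain 2017-09-05 (a Tuesday, because there is no value for the Monday), followed by 2017-09-11 (a Monday).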

Source series: 
2017-09-01 2476.55 
2017-09-05 2457.85 
2017-09-06 2465.54 
2017-09-07 2465.10 
2017-09-08 2461.43 
2017-09-11 2488.11 
2017-09-12 2496.48 

Filtered series 
2017-09-01 2476.55 
2017-09-05 2457.85 
2017-09-11 2488.11

My current solution is:

mask  = SP500.apply(lambda row: SP500[row.name - datetime.timedelta(days=row.name.weekday()):].index[0], axis=1).unique() 
filtered = SP500.loc[mask]

However, this feels suboptimal and un-Pythonic. Is there a better, faster, or cleaner solution?

2017-10-20 David Schenck

Why isn't '2017-09-01' included? – Wen

Maybe you could create a pandas DataFrame, use groupby, and take the first element of each week? – Michal

@Wen Yes, 2017-09-01 would be included.

Use resample on pd.Series.index.to_series:

s[s.index.to_series().resample('W').first()] 

2017-09-01 2476.55 
2017-09-05 2457.85 
2017-09-11 2488.11 
dtype: float64
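
For reference, here is a self-contained sketch of the same idea that rebuilds the sample series from the question. The name `first_dates` and the `dropna()` call are additions for illustration, not part of the answer above: a week with no observations at all would otherwise yield `NaT` and make the lookup fail.

```python
import pandas as pd

# Sample data copied from the question.
s = pd.Series(
    [2476.55, 2457.85, 2465.54, 2465.10, 2461.43, 2488.11, 2496.48],
    index=pd.to_datetime([
        "2017-09-01", "2017-09-05", "2017-09-06", "2017-09-07",
        "2017-09-08", "2017-09-11", "2017-09-12",
    ]),
)

# 'W' bins are weekly and end on Sunday; take the first date present in each bin.
# dropna() skips weeks that contain no observations (an added safeguard).
first_dates = s.index.to_series().resample("W").first().dropna()

# Look up those dates in the original series.
filtered = s.loc[first_dates.to_numpy()]
print(filtered)
```

Passing the dates as a plain array keeps the lookup a pure label selection, independent of the week-end labels that `resample` puts on `first_dates`.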

2017-10-20 23:04:26 piRSquared

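Since a Series' .apply method has no access to the index and takes no axis argument, I'm not sure the solution you posted works on a Series. It would work on a DataFrame, but if you have a DataFrame, this is simpler: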

from datetime import date, timedelta
import pandas as pd

# Make some fake data
x = pd.DataFrame(pd.date_range(date(2017, 10, 9), date(2017, 10, 23)), columns=['date'])
x['value'] = x.index
print(x)
         date  value
0  2017-10-09      0
1  2017-10-10      1
2  2017-10-11      2
3  2017-10-12      3
4  2017-10-13      4
5  2017-10-14      5
6  2017-10-15      6
7  2017-10-16      7
8  2017-10-17      8
9  2017-10-18      9
10 2017-10-19     10
11 2017-10-20     11
12 2017-10-21     12
13 2017-10-22     13
14 2017-10-23     14

# Filter: group rows by the Monday of their week and keep the first row of each group
week_start = x['date'].apply(lambda d: d - timedelta(d.weekday())).rename('week_start')
filtered = x.groupby(week_start).first().reset_index(drop=True)
print(filtered)
        date  value
0 2017-10-09      0
1 2017-10-16      7
2 2017-10-23     14
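
If the dates live in the index of a Series, as in the question, rather than in a column, the same group-by-week idea might be sketched as below. The helper name `first_of_week` is hypothetical, and the sketch assumes a `DatetimeIndex` and Monday-to-Sunday weeks.

```python
import pandas as pd

def first_of_week(s: pd.Series) -> pd.Series:
    """Return the first observation of each Monday-to-Sunday week (illustrative helper)."""
    s = s.sort_index()
    # Weekly periods ending on Sunday; every date in the same week maps to the same period.
    weeks = s.index.to_period("W")
    # head(1) keeps the first row of each group and preserves the original date index.
    return s.groupby(weeks).head(1)

# Sample data copied from the question.
sp500 = pd.Series(
    [2476.55, 2457.85, 2465.54, 2465.10, 2461.43, 2488.11, 2496.48],
    index=pd.to_datetime([
        "2017-09-01", "2017-09-05", "2017-09-06", "2017-09-07",
        "2017-09-08", "2017-09-11", "2017-09-12",
    ]),
)
result = first_of_week(sp500)
print(result)
```

Using `head(1)` instead of `first()` keeps the actual trading dates as the index rather than replacing them with the group key.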

2017-10-20 18:38:34

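# DatetimeIndex.week and the positional axis argument of drop() were removed in pandas 2.0;
# isocalendar().week and drop(columns=...) are the current equivalents.
df.sort_index().assign(week=lambda d: d.index.isocalendar().week).drop_duplicates('week', keep='first').drop(columns='week')
Out[774]: 
              price
2017-09-01  2476.55
2017-09-05  2457.85
2017-09-11  2488.11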
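
One caveat with keying on the week number alone: it repeats every year, so on data spanning several years only one row per week number survives. A minimal sketch that keys on ISO year and week together follows; the helper name `first_per_iso_week` is hypothetical, and it assumes a `DatetimeIndex` and pandas 1.1 or later (for `isocalendar()`).

```python
import pandas as pd

def first_per_iso_week(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first row of every (ISO year, ISO week) pair (illustrative helper)."""
    df = df.sort_index()
    # isocalendar() returns a DataFrame with 'year', 'week' and 'day' columns.
    iso = df.index.isocalendar()
    # Mark rows that are not the first occurrence of their (year, week) pair.
    is_repeat = iso.duplicated(subset=["year", "week"], keep="first")
    return df[~is_repeat.to_numpy()]

# Two different years that share ISO week 36, plus a second day in the later week.
demo = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2016-09-05", "2017-09-05", "2017-09-06"]),
)
out = first_per_iso_week(demo)
print(out)
```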

2017-10-20 19:39:40 Wen
