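python (pandas): restructuring a groupby statement
I have some data representing results over time for many different sites. I want to find the quartiles of the results, as well as the maximum and minimum date for each site.
Finding each of these on its own is easy: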
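# quartiles (restricted to the numeric 'result' column, since quantile fails on the string 'date' column)
q = df.groupby(['site_id', 'datum'])[['result']].quantile([0.25, 0.5, 0.75])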
# max and min values
d_max = df.groupby(['site_id', 'datum']).max()
d_min = df.groupby(['site_id', 'datum']).min()
The results are MultiIndex DataFrames. How can I join them together to get all three values for each site_id and datum combination?
Some sample data:
from io import StringIO
import pandas as pd
TESTDATA=StringIO(u'''date site_id datum result
1968-01-10 RN004481 SWL 61.23
1977-06-07 RN004481 SWL 60.16
1979-12-12 RN004481 SWL 58.76
1971-04-24 RN004482 SWL 79.93
1971-09-29 RN004482 SWL 79.97
1995-09-19 RN004482 SWL 92.91
1996-02-08 RN004482 SWL 93.15
1964-10-29 RN00448411 SWL 67.87
1965-03-04 RN004687 SWL 74.90
1993-03-16 RN02528611 SWL 7.50
2011-10-24 RN029429 SWL 2.59
2011-11-05 RN029429 SWL 2.68
1992-06-24 RN004464 SWL 52.24
1986-08-11 RN004482 SWL 86.84
1998-01-29 RN004482 SWL 94.33
1966-11-24 RN004687 DTW 75.16
1978-08-30 RN004687 SWL 78.24
1983-02-22 RN004687 DTW 81.00
1984-07-24 RN004687 SWL 81.26
1993-07-07 RN004687 SWL 87.18
1994-04-08 RN004687 DTW 87.53
1994-08-11 RN004687 SWL 87.41
2001-01-10 RN004687 SWL 92.04
2010-11-15 RN004687 SWL 97.06
1964-10-01 RN004693 SWL 59.56
1965-06-03 RN004693 SWL 59.74
1967-05-19 RN004693 SWL 59.58
1967-06-23 RN004693 RSWL 59.61
1967-09-22 RN004693 RSWL 59.69
1970-12-16 RN004693 DTW 59.54
''')
# sep=r'\s+' splits on whitespace; delim_whitespace is deprecated in recent pandas releases
df = pd.read_csv(TESTDATA, sep=r'\s+')
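One possible way to combine the pieces, offered as a minimal sketch rather than a definitive answer: unstack the quantile level so that each quartile becomes a column, then join the per-group minimum and maximum dates on the shared (site_id, datum) index. The column names q25, q50, q75, date_min and date_max are made up for illustration. Parsing the date column as datetimes and using only a few of the sample rows are assumptions made to keep the example short.

```python
from io import StringIO
import pandas as pd

# A small subset of the sample rows from the question
data = StringIO('''date site_id datum result
1968-01-10 RN004481 SWL 61.23
1977-06-07 RN004481 SWL 60.16
1979-12-12 RN004481 SWL 58.76
1966-11-24 RN004687 DTW 75.16
1983-02-22 RN004687 DTW 81.00
1994-04-08 RN004687 DTW 87.53
''')
# Assumption: dates are parsed as datetimes so min/max compare chronologically
df = pd.read_csv(data, sep=r'\s+', parse_dates=['date'])

grouped = df.groupby(['site_id', 'datum'])

# Quartiles of 'result': unstack() moves the quantile level into columns
q = grouped['result'].quantile([0.25, 0.5, 0.75]).unstack()
q.columns = ['q25', 'q50', 'q75']  # hypothetical column names

# Earliest and latest date per (site_id, datum) group
dates = grouped['date'].agg(date_min='min', date_max='max')

# Both frames share the (site_id, datum) index, so a plain join lines them up
summary = q.join(dates)
print(summary)
```

The result has one row per site_id and datum combination. If the minimum and maximum of result are wanted instead of the dates, the same agg call on grouped['result'] should work in the same way.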
Nice, unstack was the thing I was missing. – jprockbelly