Concatenation in pandas
Say I have the following data on users staying at a hotel:
end start uid
0 2014-01-02 00:00:00 2014-01-01 00:00:00 1
1 2014-01-04 00:00:00 2014-01-02 00:00:00 1
2 2014-02-02 00:00:00 2014-02-01 00:00:00 1
3 2014-01-02 00:00:00 2014-01-01 00:00:00 3
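I want to concatenate, per user, the stays that are 1 day or less apart (separate but consecutive stays by the same user), effectively creating the following DataFrame: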
end start uid
0 2014-01-04 00:00:00 2014-01-01 00:00:00 1
2 2014-02-02 00:00:00 2014-02-01 00:00:00 1
3 2014-01-02 00:00:00 2014-01-01 00:00:00 3
The first step would be groupby("uid"). But how do I iterate over the rows of each group so that I can do the concatenation with the pandas toolbox?
For convenience, here is a minimal initialization of the DataFrame:
import pandas as pd
from datetime import datetime
# One dict per stay: uid is the guest, start/end are check-in and check-out
data = pd.DataFrame([
    {"uid": 1, "start": datetime(year=2014, month=1, day=1), "end": datetime(year=2014, month=1, day=2)},
    {"uid": 1, "start": datetime(year=2014, month=1, day=2), "end": datetime(year=2014, month=1, day=4)},
    {"uid": 1, "start": datetime(year=2014, month=2, day=1), "end": datetime(year=2014, month=2, day=2)},
    {"uid": 3, "start": datetime(year=2014, month=1, day=1), "end": datetime(year=2014, month=1, day=2)},
])
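To make the goal concrete, here is a rough sketch of one possible approach that avoids iterating over rows by hand. It is only an illustration, not necessarily the best way. The names merge_stays, max_gap and block are made up for this example, and it assumes that a user's stays never nest inside one another, so comparing each stay with the previous one is enough.

```python
import pandas as pd
from datetime import datetime

data = pd.DataFrame([
    {"uid": 1, "start": datetime(2014, 1, 1), "end": datetime(2014, 1, 2)},
    {"uid": 1, "start": datetime(2014, 1, 2), "end": datetime(2014, 1, 4)},
    {"uid": 1, "start": datetime(2014, 2, 1), "end": datetime(2014, 2, 2)},
    {"uid": 3, "start": datetime(2014, 1, 1), "end": datetime(2014, 1, 2)},
])


def merge_stays(df, max_gap=pd.Timedelta(days=1)):
    """Merge consecutive stays of the same user that are at most max_gap apart.

    Hypothetical helper; assumes stays of one user do not nest.
    """
    df = df.sort_values(["uid", "start"])
    # Check-out of the same user's previous stay (NaT for the user's first stay)
    prev_end = df.groupby("uid")["end"].shift()
    # A new block begins at a user's first stay or when the gap exceeds max_gap
    new_block = prev_end.isna() | ((df["start"] - prev_end) > max_gap)
    # Running count of block starts gives every merged stay its own label
    df = df.assign(block=new_block.cumsum())
    merged = df.groupby(["uid", "block"], as_index=False).agg(
        start=("start", "min"), end=("end", "max")
    )
    return merged[["end", "start", "uid"]]


result = merge_stays(data)
print(result)
```

On the sample data this gives three rows: one merged stay for uid 1 from 2014-01-01 to 2014-01-04, the February stay for uid 1, and the single stay for uid 3. The index is renumbered 0, 1, 2 instead of keeping 0, 2, 3 as in the desired output above.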