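Applying multiple rolling functions to multiple columns of a pandas groupby via a rolling object?
I am looking to do the following:
Group a dataframe.
For each group, generate time windows (of a given time unit).
On the resulting structure, take all columns and apply multiple rolling summary statistics functions, so that the result has summary statistics for each group/time window combination.
Here is an example dataset: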
gps_time,name,val_x,val_y
2017-07-04 11:20:23.423,bob,0.963,0.201
2017-07-04 11:20:24.492,bob,0.964,0.203
2017-07-04 11:20:24.499,bob,0.962,0.210
2017-07-04 11:20:25.627,sarah,0.893,0.010
2017-07-04 11:20:28.627,sarah,0.894,0.012
2017-07-04 11:20:29.613,sarah,0.895,0.014
2017-07-04 11:20:29.630,larry,-0.423,0.231
2017-07-04 11:20:30.423,larry,-0.431,0.22
2017-07-04 11:20:30.428,larry,-0.432,0.222
And the desired output for the data above, grouped by name with a 1-second window:
name,gps_time,val_x_mean,val_x_med,val_y_mean,val_y_med
bob,2017-07-04 11:20:23.423,0.963,0.963,0.201,0.201
bob,2017-07-04 11:20:24.492,0.963,0.963,0.2065,0.2065
sarah,2017-07-04 11:20:25.627,0.893,0.89,0.010,0.010
sarah,2017-07-04 11:20:28.627,0.8945,0.8945,0.013,0.013
larry,2017-07-04 11:20:30.423,-0.4287,-0.431,0.336,0.222
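I have tried using list comprehensions to generate a bunch of dataframes, but this is very slow, and I have to call it separately for each column.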
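For reference, here is a minimal sketch of one way the three steps could be done in a single pass, assuming fixed, non-overlapping 1-second bins are acceptable. It combines `groupby` with `pd.Grouper` and named aggregation. The helper name `summarize` and the constant `RAW` are hypothetical, introduced only for illustration. Note that `pd.Grouper` aligns bins to whole seconds, so the window labels will not be the first timestamp of each window as in the desired output.

```python
import io

import pandas as pd

# The sample data from the question, inlined so the sketch is self-contained.
RAW = """gps_time,name,val_x,val_y
2017-07-04 11:20:23.423,bob,0.963,0.201
2017-07-04 11:20:24.492,bob,0.964,0.203
2017-07-04 11:20:24.499,bob,0.962,0.210
2017-07-04 11:20:25.627,sarah,0.893,0.010
2017-07-04 11:20:28.627,sarah,0.894,0.012
2017-07-04 11:20:29.613,sarah,0.895,0.014
2017-07-04 11:20:29.630,larry,-0.423,0.231
2017-07-04 11:20:30.423,larry,-0.431,0.22
2017-07-04 11:20:30.428,larry,-0.432,0.222
"""


def summarize(df, freq="1s"):
    """Hypothetical helper: mean and median of val_x/val_y per name and time bin."""
    # Group by name and by fixed-width time bins on the gps_time column.
    grouped = df.groupby(["name", pd.Grouper(key="gps_time", freq=freq)])
    # Named aggregation yields flat column names in one call for all columns.
    return grouped.agg(
        val_x_mean=("val_x", "mean"),
        val_x_med=("val_x", "median"),
        val_y_mean=("val_y", "mean"),
        val_y_med=("val_y", "median"),
    ).reset_index()


df = pd.read_csv(io.StringIO(RAW), parse_dates=["gps_time"])
summary = summarize(df)
print(summary)
```

This avoids building one dataframe per window and per column. It is a different technique from a true rolling window: each row falls into exactly one bin, so it would not cover overlapping windows.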
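This is perfect! How would I specify a particular percentage of overlap between intervals?
Can you explain the overlap percentage? How would I compute it?
A 50% overlap means that, given two consecutive intervals, the last 50% of the k-th interval is the first 50% of the (k+1)-th interval. For example, given the list [1,2,3,4,5,6,7,8], windows of 4 observations with 50% overlap would give [1,2,3,4], [3,4,5,6], [5,6,7,8].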
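The overlap definition above can be sketched in plain Python as follows. This is only an illustration on a list of observations, not a pandas solution; the function name `overlapping_windows` and its parameters are hypothetical.

```python
def overlapping_windows(seq, size, overlap):
    """Split seq into windows of `size` items that share a fraction `overlap`.

    `overlap` is a fraction in [0, 1); 0.5 means consecutive windows share half
    of their observations.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between window starts; round() guards against float error.
    step = round(size * (1 - overlap))
    if step < 1:
        raise ValueError("overlap is too large for this window size")
    # Only full windows are kept; a trailing partial window is dropped.
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


windows = overlapping_windows([1, 2, 3, 4, 5, 6, 7, 8], size=4, overlap=0.5)
print(windows)  # [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
```

With a window of 4 and 50% overlap the step between window starts is 2, which reproduces the three intervals from the example.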