2017-06-13 75 views

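Split a dataframe into multiple (consecutive) time series

I am running an experiment in which I measure a valve while it opens and closes. I have limit switches that indicate fully open and fully closed. I am only interested in the data recorded while the valve is closing or opening. My pandas dataset looks like this (simplified):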

Time      Flow_A Flow_B  Open closed    
2017-06-12 09:46:31.068 0.000933 295.933070 1 0 
2017-06-12 09:46:31.660 0.287122 292.727820 1 0 
2017-06-12 09:46:32.252 0.256170 288.869600 0 0 
2017-06-12 09:46:32.844 0.052523 284.265850 0 0 
2017-06-12 09:46:33.437 0.367495 278.394200 0 1 
2017-06-12 09:46:34.029 1.956472 270.846450 0 1 
2017-06-12 09:46:34.621 5.265860 260.768250 0 0 
2017-06-12 09:46:35.214 12.328835 248.132450 0 0 
2017-06-12 09:46:35.807 22.592590 232.688620 1 0 
2017-06-12 09:46:36.400 35.768205 214.997420 1 0 
2017-06-12 09:46:36.992 51.623265 195.298150 1 0 
2017-06-12 09:46:37.584 70.855590 174.048000 1 0 

I have already worked out how to select the region of interest in Python:

mask = (data['Open'] == 0) & (data['closed'] == 0)
data.loc[mask]

This gives me:

Time      Flow_A Flow_B  Open closed 
2017-06-12 09:46:32.252 0.256170 288.869600 0 0 
2017-06-12 09:46:32.844 0.052523 284.265850 0 0 
2017-06-12 09:46:34.621 5.265860 260.768250 0 0 
2017-06-12 09:46:35.214 12.328835 248.132450 0 0 

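The question is how to split/divide/group/subset this into two consecutive datasets. The time periods are not known in advance, and the interval between log entries is not exactly constant. I expect the consecutive runs can be found from the mask, but I don't know how to do it.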


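I'm not sure I understand what you mean by consecutive time series. Do you need to split the rows selected by the mask into runs of consecutive rows, e.g. with a new column as in my answer? Or something else? – jezrael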

Answer


I think you need:

mask = (data['Open']==0) & (data['closed'] == 0) 
data.loc[mask, 'groups'] = mask.ne(mask.shift())[mask].cumsum() 
print (data) 
        Time  Flow_A  Flow_B Open closed groups 
2017-06-12 09:46:31.068 0.000933 295.93307  1  0  NaN 
2017-06-12 09:46:31.660 0.287122 292.72782  1  0  NaN 
2017-06-12 09:46:32.252 0.256170 288.86960  0  0  1.0 
2017-06-12 09:46:32.844 0.052523 284.26585  0  0  1.0 
2017-06-12 09:46:33.437 0.367495 278.39420  0  1  NaN 
2017-06-12 09:46:34.029 1.956472 270.84645  0  1  NaN 
2017-06-12 09:46:34.621 5.265860 260.76825  0  0  2.0 
2017-06-12 09:46:35.214 12.328835 248.13245  0  0  2.0 
2017-06-12 09:46:35.807 22.592590 232.68862  1  0  NaN 
2017-06-12 09:46:36.400 35.768205 214.99742  1  0  NaN 
2017-06-12 09:46:36.992 51.623265 195.29815  1  0  NaN 
2017-06-12 09:46:37.584 70.855590 174.04800  1  0  NaN 

print (data[mask]) 
        Time  Flow_A  Flow_B Open closed groups 
2017-06-12 09:46:32.252 0.256170 288.86960  0  0  1.0 
2017-06-12 09:46:32.844 0.052523 284.26585  0  0  1.0 
2017-06-12 09:46:34.621 5.265860 260.76825  0  0  2.0 
2017-06-12 09:46:35.214 12.328835 248.13245  0  0  2.0 
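To see why the `mask.ne(mask.shift())[mask].cumsum()` expression labels each run, it can help to break it into steps. The following is only an illustrative sketch on a small hypothetical boolean mask (not the valve data); the intermediate names `changed` and `run_id` are introduced here for explanation and do not appear in the answer.

```python
import pandas as pd

# Hypothetical mask: True where both limit switches read 0 (valve in transit).
mask = pd.Series([False, False, True, True, False, False, True, True, False])

# Step 1: True wherever the mask differs from the previous row,
# i.e. at the first row of every run (the very first row always counts).
changed = mask.ne(mask.shift())

# Step 2: keep only the rows selected by the mask, so the remaining
# True values mark the first row of each selected run.
starts = changed[mask]

# Step 3: a cumulative sum turns those start markers into run numbers.
run_id = starts.cumsum()

print(run_id)
```

Rows belonging to the same uninterrupted stretch of `True` values end up with the same number, and the numbering is independent of how long each stretch is or how the timestamps are spaced.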

Also, if you need integer labels starting from 0:

data.loc[mask, 'groups'] = mask.ne(mask.shift())[mask].cumsum() 
data['groups'] = data['groups'].fillna(0).astype(int) - 1 
print (data) 
        Time  Flow_A  Flow_B Open closed groups 
2017-06-12 09:46:31.068 0.000933 295.93307  1  0  -1 
2017-06-12 09:46:31.660 0.287122 292.72782  1  0  -1 
2017-06-12 09:46:32.252 0.256170 288.86960  0  0  0 
2017-06-12 09:46:32.844 0.052523 284.26585  0  0  0 
2017-06-12 09:46:33.437 0.367495 278.39420  0  1  -1 
2017-06-12 09:46:34.029 1.956472 270.84645  0  1  -1 
2017-06-12 09:46:34.621 5.265860 260.76825  0  0  1 
2017-06-12 09:46:35.214 12.328835 248.13245  0  0  1 
2017-06-12 09:46:35.807 22.592590 232.68862  1  0  -1 
2017-06-12 09:46:36.400 35.768205 214.99742  1  0  -1 
2017-06-12 09:46:36.992 51.623265 195.29815  1  0  -1 
2017-06-12 09:46:37.584 70.855590 174.04800  1  0  -1 

print (data[mask]) 
        Time  Flow_A  Flow_B Open closed groups 
2017-06-12 09:46:32.252 0.256170 288.86960  0  0  0 
2017-06-12 09:46:32.844 0.052523 284.26585  0  0  0 
2017-06-12 09:46:34.621 5.265860 260.76825  0  0  1 
2017-06-12 09:46:35.214 12.328835 248.13245  0  0  1 
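If separate DataFrame objects are wanted rather than a label column, one possible follow-up is to pass the run labels to `groupby`. This is a minimal sketch, assuming the column names from the question and rebuilding the sample rows inline; the names `run_id` and `segments` are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Sample rows copied from the question.
data = pd.DataFrame({
    "Time": pd.to_datetime([
        "2017-06-12 09:46:31.068", "2017-06-12 09:46:31.660",
        "2017-06-12 09:46:32.252", "2017-06-12 09:46:32.844",
        "2017-06-12 09:46:33.437", "2017-06-12 09:46:34.029",
        "2017-06-12 09:46:34.621", "2017-06-12 09:46:35.214",
        "2017-06-12 09:46:35.807", "2017-06-12 09:46:36.400",
        "2017-06-12 09:46:36.992", "2017-06-12 09:46:37.584",
    ]),
    "Flow_A": [0.000933, 0.287122, 0.256170, 0.052523, 0.367495, 1.956472,
               5.265860, 12.328835, 22.592590, 35.768205, 51.623265, 70.855590],
    "Flow_B": [295.933070, 292.727820, 288.869600, 284.265850, 278.394200,
               270.846450, 260.768250, 248.132450, 232.688620, 214.997420,
               195.298150, 174.048000],
    "Open":   [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "closed": [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
})

# Rows where neither limit switch is active (valve in transit).
mask = (data["Open"] == 0) & (data["closed"] == 0)

# Run number for each selected row, as in the answer above.
run_id = mask.ne(mask.shift())[mask].cumsum()

# One DataFrame per consecutive run; groupby aligns run_id on the index.
segments = [segment for _, segment in data[mask].groupby(run_id)]

for segment in segments:
    print(segment, end="\n\n")
```

Each element of `segments` keeps its original index and timestamps, so irregular logging intervals do not matter; the split depends only on where the mask changes.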