2016-05-16 76 views
2

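R: filter/subset data frame by a threshold of change

I have the following data frame containing changes of angle values in degrees, in multiple rows: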

'data.frame': 712801 obs. of 4 variables: 
$ time_passed: int 1 2 3 4 5 6 7 8 9 10 ... 
$ dRoll  : num 0.9798 -0.5099 -0.0974 -0.4985 0.1719 ... 
$ dPitch  : num -0.175 -0.0655 0.0653 0.8907 -1.0893 ... 
$ dYaw  : num 0.33232 0.06875 -0.00573 0.59588 -0.55577 ... 

> myData[1:20,] 
time_passed  dRoll  dPitch  dYaw 
     1   0.97975783 -0.17498131 0.332315521 
     2   -0.50993244 -0.06548908 0.068754935 
     3   -0.09740283 0.06531719 -0.005729578 
     4   -0.49847328 0.89072019 0.595876107 
     5   0.17188734 -1.08930736 -0.555769061 
     6   0.68181978 0.36852645 0.492743704 
     7   1.07143108 0.15206300 -0.635983153 
     8   -1.43812407 -0.76638835 -0.509932438 
     9   0.43544792 0.41241502 0.767763445 
     10   0.25210143 0.61375239 0.509932438 
     11   0.38961130 0.01203211 -0.360963411 
     12   0.03437747 -0.29633377 -0.315126787 
     13   -0.33804510 -0.40639896 -0.177616916 
     14   0.68181978 0.32446600 0.435447924 
     15   -1.12872686 -0.37752189 -0.275019742 
     16   0.75057471 0.33907642 0.464095814 
     17   -0.25783101 0.11310187 0.309397209 
     18   -0.01718873 -0.13435860 -0.521391594 
     19   0.12605071 0.12817066 -0.085943669 
     20   0.02291831 -0.59856901 -0.120321137 

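How would I write something like

"if the sum of consecutive negative (or positive) values is smaller than my threshold (e.g. a 5° change), then remove them from the data set"

in R code?

I would like this criterion to apply to any of the columns, i.e. dRoll, dPitch and dYaw.

In this case, applied to the dRoll column, the output would be: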

time_passed  dRoll  dPitch  dYaw 
     1   0.97975783 -0.17498131 0.332315521 
     5   0.17188734 -1.08930736 -0.555769061 
     6   0.68181978 0.36852645 0.492743704 
     7   1.07143108 0.15206300 -0.635983153 
     9   0.43544792 0.41241502 0.767763445 
     10   0.25210143 0.61375239 0.509932438 
     11   0.38961130 0.01203211 -0.360963411 
     12   0.03437747 -0.29633377 -0.315126787 
     14   0.68181978 0.32446600 0.435447924 
     16   0.75057471 0.33907642 0.464095814 
     19   0.12605071 0.12817066 -0.085943669 
     20   0.02291831 -0.59856901 -0.120321137 

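In dRoll all negative runs are thrown out, because the absolute sum of the consecutive negative values is less than 5 degrees:

  • First negative run in dRoll: sum(myData[2:4,2]) = -1.105809
  • The second, third and fourth runs are only a single number each: -1.43812, -0.33804, -1.12872
  • Last run in dRoll: sum(myData[17:18,2]) = -0.2750197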
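
The run sums listed above can be checked with base R's `rle()`. This is only a minimal sketch for verification, assuming the 20 dRoll values printed in the sample; the names `runs`, `run_id`, `run_sum` and `neg_sums` are made up for illustration.

```r
# dRoll values for time_passed 1..20, copied from the printed sample
dRoll <- c(0.97975783, -0.50993244, -0.09740283, -0.49847328, 0.17188734,
           0.68181978, 1.07143108, -1.43812407, 0.43544792, 0.25210143,
           0.38961130, 0.03437747, -0.33804510, 0.68181978, -1.12872686,
           0.75057471, -0.25783101, -0.01718873, 0.12605071, 0.02291831)

# rle() on the sign gives the length of each run of consecutive
# positive (TRUE) or non-positive (FALSE) values
runs <- rle(dRoll > 0)

# one integer id per run, repeated so it lines up with dRoll
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# total change within each run, in run order
run_sum <- tapply(dRoll, run_id, sum)

# totals of the negative runs only, as a plain numeric vector
neg_sums <- as.numeric(run_sum[!runs$values])
print(neg_sums)
```

Every total printed here has an absolute value below 5, which is why all negative runs drop out of the desired output.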

How would one do this in R?

+0

Could you post your desired output? –

+0

You just filtered out the rows with negative values in dRoll. Maybe you can elaborate on this, e.g. with a step-by-step calculation? –

+0

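@M.D, I tried to do that; I hope it is now clearer what I am trying to do. The point is that if the sum of one of the negative runs exceeded my threshold, that run would have to stay in the data frame. – Joris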

Answers

3

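My suggestion would be to first melt the data frame into long format. After that, grouping operations become much easier.

Using the data.table package (which we need for the melt and rleid functions):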

# load the package 
library(data.table) 

# convert the data frame from the question ('myData') to a data.table 
DT <- as.data.table(myData) 

# melt into long format 
DT2 <- melt(DT, id.vars = 'time_passed') 

# create a cumulative sum for each run 
# 'rleid(value > 0)' creates a grouping variable for runs of consecutive positive/negative values 
# adding '[.N]' to 'cumsum(value)' takes the last element of the cumulative sum, 
# i.e. the total of the run, and assigns it to every row of that run in 'csum', 
# which we can use to filter the data 
DT2[, csum := cumsum(value)[.N], by = .(variable, rleid(value > 0))] 

# filter the data according to a rule 
# in this case the rows of runs whose total lies between -1.2 and -0.2 are filtered out 
DT2[csum < -1.2 | csum > -0.2] 

This gives (a snapshot of the result):

time_passed variable  value   csum 
1:   1 dRoll 0.979757830 0.979757830 
2:   5 dRoll 0.171887340 1.925138200 
3:   6 dRoll 0.681819780 1.925138200 
4:   7 dRoll 1.071431080 1.925138200 
5:   8 dRoll -1.438124070 -1.438124070 
6:   9 dRoll 0.435447920 1.111538120 
.... 
.... 
14:   3 dPitch 0.065317190 0.956037380 
15:   4 dPitch 0.890720190 0.956037380 
16:   6 dPitch 0.368526450 0.520589450 
17:   7 dPitch 0.152063000 0.520589450 
18:   9 dPitch 0.412415020 1.038199520 
19:   10 dPitch 0.613752390 1.038199520 
.... 
.... 
26:   1  dYaw 0.332315521 0.401070456 
27:   2  dYaw 0.068754935 0.401070456 
28:   3  dYaw -0.005729578 -0.005729578 
29:   4  dYaw 0.595876107 0.595876107 
30:   6  dYaw 0.492743704 0.492743704 
31:   9  dYaw 0.767763445 1.277695883 
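
To get the desired output from the question instead, the same grouping idea can be applied directly to one column with a named threshold. The following is a sketch rather than a definitive implementation. It assumes only the 20 dRoll values from the sample, it assumes (as in the desired output) that positive runs are always kept, and the names `threshold`, `run_sum` and `kept` are made up for illustration.

```r
library(data.table)

# first 20 rows of the question's data; dPitch and dYaw omitted for brevity
DT <- data.table(
  time_passed = 1:20,
  dRoll = c(0.97975783, -0.50993244, -0.09740283, -0.49847328, 0.17188734,
            0.68181978, 1.07143108, -1.43812407, 0.43544792, 0.25210143,
            0.38961130, 0.03437747, -0.33804510, 0.68181978, -1.12872686,
            0.75057471, -0.25783101, -0.01718873, 0.12605071, 0.02291831)
)

threshold <- 5  # degrees; the example value from the question

# total of each run of consecutive same-sign values, repeated on every row of the run
DT[, run_sum := sum(dRoll), by = rleid(dRoll > 0)]

# assumption: keep all positive runs, and negative runs only if their
# total change reaches the threshold
kept <- DT[run_sum > 0 | run_sum <= -threshold]
print(kept$time_passed)
```

With these 20 rows no negative run reaches 5 degrees, so only the rows belonging to positive runs remain. To treat positive runs the same way, the filter could be written as `abs(run_sum) >= threshold`.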
+0

Thanks, this is what I needed! – Joris