2017-10-04

SQL query to ignore similar rows from the same table

I have a table containing logs and timestamps, for example:

timestmp log_error 
1507031197631 Er7 
1507031197621 Er8 
1507031197409 Er9 
1506888444602 Er10 
1506880074401 Er10 
1506880047684 Er10 
1506880030996 Er10 
1506879980929 Er10 
1506879977580 Er10 
1506879974250 Er10 
1506879970901 Er10 
1506879964241 Er10 
1506879954212 Er10 
1506879900817 Er10 

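I want to write an SQL query that ignores identical consecutive errors (in this case, Er10) that fall within a certain timestamp interval (5 minutes). How can I do this? With a self inner join? The result I want looks like this: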

timestmp log_error 
1507031197631 Er7 
1507031197621 Er8 
1507031197409 Er9 
1506888444602 Er10 /* The last one from this example, based on the difference in timestmp */ 
1506879900817 Er10 /* The first Er10 registry */ 
Tag the DBMS you are using (MySQL, MS SQL Server, Oracle). –

@YogeshSharma, tagged, thank you. –

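Post proper sample data so that the timestmp column can be converted to datetime, or post data with a datetime column; only then can the rows within a 5-minute interval be found. – KumarHarsh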

Answers

0

You can do this with lag(), a cumulative sum, and group by:

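select log_error, min(timestmp), max(timestmp) 
from (select l.*, 
      -- a row starts a new group unless it repeats the previous error within 5 minutes; 
      -- the running sum of those "new group" flags is the group number 
      sum(case when prev_le = log_error and 
          prev_timestmp > timestmp - 5 * 60 * 1000 
         then 0 else 1 
       end) over (order by timestmp) as grp 
     from (select l.*, 
        lag(log_error) over (order by timestmp) as prev_le, 
        lag(timestmp) over (order by timestmp) as prev_timestmp 
      from logs l 
      ) l 
    ) l 
group by grp, log_error; 

Note: 5 * 60 * 1000 stands for "5 minutes" in whatever unit timestmp uses. Presumably that is 5 * 60 (seconds) or 5 * 60 * 1000 (milliseconds); the sample values look like milliseconds.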
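To try the lag-and-running-sum idea on the sample data from the question, here is a minimal sketch that loads the rows into an in-memory SQLite database through Python's standard sqlite3 module. The table name logs comes from the answer; the helper name collapse_runs, the aliases first_ts and last_ts, and the choice of SQLite are illustrative assumptions, and the 5-minute gap assumes timestmp is in epoch milliseconds. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows copied from the question: (timestmp, log_error).
ROWS = [
    (1507031197631, "Er7"), (1507031197621, "Er8"), (1507031197409, "Er9"),
    (1506888444602, "Er10"), (1506880074401, "Er10"), (1506880047684, "Er10"),
    (1506880030996, "Er10"), (1506879980929, "Er10"), (1506879977580, "Er10"),
    (1506879974250, "Er10"), (1506879970901, "Er10"), (1506879964241, "Er10"),
    (1506879954212, "Er10"), (1506879900817, "Er10"),
]

# Assumption: timestmp is in milliseconds, so 5 minutes = 300000.
FIVE_MINUTES_MS = 5 * 60 * 1000

QUERY = """
SELECT log_error, MIN(timestmp) AS first_ts, MAX(timestmp) AS last_ts
FROM (
    SELECT timestmp, log_error,
           -- flag is 0 when the row repeats the previous error within the gap,
           -- 1 otherwise; the running sum of flags numbers the groups
           SUM(CASE WHEN prev_le = log_error AND prev_ts > timestmp - ?
                    THEN 0 ELSE 1 END) OVER (ORDER BY timestmp) AS grp
    FROM (
        SELECT timestmp, log_error,
               LAG(log_error) OVER (ORDER BY timestmp) AS prev_le,
               LAG(timestmp)  OVER (ORDER BY timestmp) AS prev_ts
        FROM logs
    ) AS with_prev
) AS grouped
GROUP BY grp, log_error
ORDER BY first_ts DESC
"""


def collapse_runs(rows, gap_ms=FIVE_MINUTES_MS):
    """Return (log_error, first_ts, last_ts) for each run, newest run first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE logs (timestmp INTEGER, log_error TEXT)")
        conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
        return conn.execute(QUERY, (gap_ms,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in collapse_runs(ROWS):
        print(row)
```

On the sample data this gives five groups, and the first_ts column matches the timestamps in the desired result. Note that each row is compared with the row immediately before it, so a long chain of errors spaced under 5 minutes apart stays in one group even if the whole chain lasts longer than 5 minutes.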

1

You can use row_number to create groups of consecutive log_error values. This approach is known as the "Tabibitosan method".

select log_error, min(timestmp), max(timestmp) 
from (
    select t.*, 
     row_number() over (order by timestmp) 
     - row_number() over (partition by log_error order by timestmp) as grp 
    from your_table t 
    ) t 
group by log_error, grp; 

I admit the format of the result is not exactly what you wanted, but it has the information you need.
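The row_number difference works because, inside a run of consecutive rows with the same log_error, the overall row number and the per-error row number both grow by one per row, so their difference stays constant. Once a different error intervenes, only the overall number advances and the difference changes. The sketch below runs the query on the question's sample data using Python's standard sqlite3 module (SQLite 3.25 or newer for window functions). The table name logs, the helper name tabibitosan_groups, and the aliases first_ts and last_ts are illustrative assumptions.

```python
import sqlite3

# Sample rows copied from the question: (timestmp, log_error).
ROWS = [
    (1507031197631, "Er7"), (1507031197621, "Er8"), (1507031197409, "Er9"),
    (1506888444602, "Er10"), (1506880074401, "Er10"), (1506880047684, "Er10"),
    (1506880030996, "Er10"), (1506879980929, "Er10"), (1506879977580, "Er10"),
    (1506879974250, "Er10"), (1506879970901, "Er10"), (1506879964241, "Er10"),
    (1506879954212, "Er10"), (1506879900817, "Er10"),
]

QUERY = """
SELECT log_error, MIN(timestmp) AS first_ts, MAX(timestmp) AS last_ts
FROM (
    SELECT timestmp, log_error,
           -- constant within a run of consecutive identical errors
           ROW_NUMBER() OVER (ORDER BY timestmp)
         - ROW_NUMBER() OVER (PARTITION BY log_error ORDER BY timestmp) AS grp
    FROM logs
) AS t
GROUP BY log_error, grp
ORDER BY first_ts DESC
"""


def tabibitosan_groups(rows):
    """Return (log_error, first_ts, last_ts) per run of consecutive errors."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE logs (timestmp INTEGER, log_error TEXT)")
        conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in tabibitosan_groups(ROWS):
        print(row)
```

On the sample data this yields four groups, with every Er10 row in a single group running from 1506879900817 to 1506888444602. The query groups by consecutiveness only, so the 5-minute interval from the question would still have to be applied as an extra condition.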