2010-07-15 55 views
3

MySQL query: group similar timestamps

SELECT project_id, 
     COUNT(*) AS count, 
     MIN(date_added) AS date_start, 
     MAX(date_added) AS date_end 
    FROM my_table 
GROUP BY project_id, TIMESTAMPDIFF(MINUTE, date_added) < 5 
WHERE user_id = 1 LIMIT 10 

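How can I make this work? I want to group the items so that two consecutive items in a group are no more than 5 minutes apart, while the start and end times of a group can be any distance apart. Is there a way to do this in the database, or do I need to fetch all the data and work it out in the program?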

+0

An interesting question - I look forward to seeing the answer. – 2010-07-15 18:07:09

Answers

3

OK, here goes:

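-- Walk the rows in (project_id, date_added) order and start a new group whenever
-- the project changes or the gap to the previous row exceeds 5 minutes.
-- Caveat: this depends on the user variables being evaluated in select-list order,
-- which MySQL does not guarantee.
SELECT MIN(id) AS id, project_id, MIN(start_time) AS start_time, MAX(end_time) AS end_time FROM (
    SELECT 
    @new_group := 
     ((UNIX_TIMESTAMP(date_added) - @prev_second) > (5 * 60)) OR 
     (project_id <> @prev_project_id) AS new_group, 
    @date_added_group := @date_added_group + @new_group AS date_added_group, 
    @start_time := IF(@new_group, date_added, @start_time) AS start_time, 
    id, 
    project_id, 
    date_added AS end_time, 
    -- UNIX_TIMESTAMP rather than TIME_TO_SEC, so gaps that span midnight are measured correctly
    @prev_second := UNIX_TIMESTAMP(date_added) AS prev_sec, 
    @prev_project_id := project_id AS prev_project 
    FROM my_table, 
    (SELECT 
    @new_group := 0, 
    @date_added_group := 0, 
    @start_time := 0, 
    @prev_second := 0, 
    @prev_project_id := 0) AS vars 
    ORDER BY project_id, date_added 
) AS grouped GROUP BY project_id, date_added_group; 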

Given data like this:

+----+------------+---------------------+
| id | project_id | date_added          |
+----+------------+---------------------+
|  1 |          1 | 2010-07-15 19:00:00 | < new project
|  2 |          1 | 2010-07-15 19:01:00 |
|  3 |          1 | 2010-07-15 19:02:00 |
|  4 |          2 | 2010-07-15 19:03:00 | < new project
|  5 |          2 | 2010-07-15 19:04:00 |
|  6 |          2 | 2010-07-15 19:25:00 | < new interval
|  7 |          2 | 2010-07-15 19:26:00 |
|  8 |          2 | 2010-07-15 19:27:00 |
|  9 |          2 | 2010-07-15 19:48:00 | < new interval
| 10 |          2 | 2010-07-15 19:49:00 |
| 11 |          3 | 2010-07-15 19:50:00 | < new project
| 12 |          3 | 2010-07-15 20:11:00 | < new interval
| 13 |          4 | 2010-07-15 20:12:00 | < new project
| 14 |          4 | 2010-07-15 20:13:00 |
| 15 |          4 | 2010-07-15 20:14:00 |
| 16 |          5 | 2010-07-15 20:15:00 | < new project
| 17 |          5 | 2010-07-15 20:16:00 |
| 18 |          5 | 2010-07-15 21:27:00 | < new interval
| 19 |          5 | 2010-07-15 21:28:00 |
| 20 |          6 | 2010-07-15 21:29:00 | < new project
| 21 |          6 | 2010-07-15 21:30:00 |
| 22 |          6 | 2010-07-15 21:31:00 |
+----+------------+---------------------+

the query returns this result set:

+----+------------+---------------------+---------------------+
| id | project_id | start_time          | end_time            |
+----+------------+---------------------+---------------------+
|  1 |          1 | 2010-07-15 19:00:00 | 2010-07-15 19:02:00 |
|  4 |          2 | 2010-07-15 19:03:00 | 2010-07-15 19:04:00 |
|  6 |          2 | 2010-07-15 19:25:00 | 2010-07-15 19:27:00 |
|  9 |          2 | 2010-07-15 19:48:00 | 2010-07-15 19:49:00 |
| 11 |          3 | 2010-07-15 19:50:00 | 2010-07-15 19:50:00 |
| 12 |          3 | 2010-07-15 20:11:00 | 2010-07-15 20:11:00 |
| 13 |          4 | 2010-07-15 20:12:00 | 2010-07-15 20:14:00 |
| 16 |          5 | 2010-07-15 20:15:00 | 2010-07-15 20:16:00 |
| 18 |          5 | 2010-07-15 21:27:00 | 2010-07-15 21:28:00 |
| 20 |          6 | 2010-07-15 21:29:00 | 2010-07-15 21:31:00 |
+----+------------+---------------------+---------------------+
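
For anyone who would rather do the grouping in the program, as the question suggests, here is a minimal Python sketch of the same rule: a new group starts when the project changes or the gap to the previous row is more than 5 minutes. The names `group_rows` and `ts` and the tuple layout are my own assumptions for illustration, and only the first eight sample rows are used.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=5)


def group_rows(rows, max_gap=MAX_GAP):
    """rows: iterable of (id, project_id, date_added) tuples.

    Returns a list of (first_id, project_id, start_time, end_time) tuples.
    """
    groups = []
    prev_project = None
    prev_time = None
    # Same ordering as the query: ORDER BY project_id, date_added.
    for row_id, project_id, added in sorted(rows, key=lambda r: (r[1], r[2])):
        new_group = (
            prev_time is None
            or project_id != prev_project
            or added - prev_time > max_gap
        )
        if new_group:
            groups.append([row_id, project_id, added, added])
        else:
            # Still inside the current group: push its end time forward.
            groups[-1][3] = added
        prev_project, prev_time = project_id, added
    return [tuple(g) for g in groups]


def ts(hhmm):
    """Hypothetical helper: build a timestamp on the sample date."""
    return datetime.strptime("2010-07-15 " + hhmm, "%Y-%m-%d %H:%M")


rows = [
    (1, 1, ts("19:00")), (2, 1, ts("19:01")), (3, 1, ts("19:02")),
    (4, 2, ts("19:03")), (5, 2, ts("19:04")), (6, 2, ts("19:25")),
    (7, 2, ts("19:26")), (8, 2, ts("19:27")),
]

for group in group_rows(rows):
    print(group)
```

On these rows the groups start at ids 1, 4 and 6, which lines up with the first three rows of the result set above. Because it subtracts full datetimes, the sketch also measures gaps that cross midnight correctly.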
0

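Since you want to link together the longest chain of records in which each consecutive pair is less than 5 minutes apart, I don't think you can do this with a plain GROUP BY.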

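Try writing a stored procedure with a WHILE loop. Start with a cursor that selects the records ORDER BY date_added. You can create a TEMPORARY TABLE and insert a row into it at the end of each group. Running this as a stored procedure avoids pulling all the records back into your program (usually across a network), which can make it faster.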

2

Off the top of my head, and I haven't tried it:

SELECT project_id, 
     COUNT(*) AS count, 
     MIN(date_added) AS date_start, 
     MAX(date_added) AS date_end 
    FROM my_table 
    WHERE user_id = 1 
GROUP BY project_id, ROUND(date_added/(5 * 60)) 
    LIMIT 10 

Assuming `date_added` is in seconds, of course.

In other words, items are grouped according to the 5-minute segment they fall into.
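
To see how fixed 5-minute segments behave compared with the "no more than 5 minutes between neighbours" rule in the question, here is a small Python sketch of the bucketing. The names `bucket_of` and `bucket_rows` are assumptions made up for this example, and `math.floor(x + 0.5)` is used as a stand-in for `ROUND` on positive values.

```python
import math
from collections import defaultdict
from datetime import datetime

EPOCH = datetime(1970, 1, 1)
BUCKET_SECONDS = 5 * 60


def bucket_of(added, bucket_seconds=BUCKET_SECONDS):
    """Index of the 5-minute segment, mirroring ROUND(seconds / (5 * 60))."""
    seconds = (added - EPOCH).total_seconds()
    return math.floor(seconds / bucket_seconds + 0.5)


def bucket_rows(rows):
    """rows: iterable of (project_id, date_added).

    Returns a list of (project_id, count, date_start, date_end) tuples.
    """
    grouped = defaultdict(list)
    for project_id, added in rows:
        grouped[(project_id, bucket_of(added))].append(added)
    return [
        (project_id, len(times), min(times), max(times))
        for (project_id, _bucket), times in sorted(grouped.items())
    ]


# Three rows for one project, each one minute after the previous one.
rows = [
    (1, datetime(2010, 7, 15, 19, 7)),
    (1, datetime(2010, 7, 15, 19, 8)),
    (1, datetime(2010, 7, 15, 19, 9)),
]

for group in bucket_rows(rows):
    print(group)
```

The three rows are only a minute apart, yet 19:07 lands in one segment and 19:08 and 19:09 in the next, so they come back as two groups. Fixed segments are therefore an approximation of what the question asks for: close neighbours can be split at a segment boundary, and a chain of rows can never extend beyond one segment.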