2014-04-14

MySQL timestamp query: select all but the first record for a given minute

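I have a table that stores data about the snapshots taken by my IP cameras. Typically the cameras take multiple snapshots per minute, but one of my cameras is configured to take only one snapshot per minute.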

I purge items from the table (and from disk), but retain some of them according to the following rules:

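  1. For the first 7 days, all images are kept.
  2. For anything older than 7 days, keep only the first snapshot of each hour of the day.
  3. For anything older than 4 weeks, keep only the first snapshot at the 6th, 12th and 18th hour of the day.
  4. For anything older than 3 months, keep only one snapshot, at the 12th hour of the day.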
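
To make the tiers concrete, here is a minimal Python sketch of the rules above. It is only an illustration under stated assumptions: "3 months" is approximated as 90 days, "first snapshot" is taken to mean the earliest timestamp per camera per hour, and the names `allowed_hours` and `select_retained` are hypothetical.

```python
from datetime import datetime, timedelta


def allowed_hours(age):
    """Hours of the day eligible for retention at this age.

    Returns None when every snapshot is kept (rule 1).
    """
    if age <= timedelta(days=7):
        return None                    # rule 1: keep everything
    if age <= timedelta(weeks=4):
        return set(range(24))          # rule 2: first snapshot of every hour
    if age <= timedelta(days=90):      # assumption: 3 months ~ 90 days
        return {6, 12, 18}             # rule 3
    return {12}                        # rule 4


def select_retained(snapshots, now):
    """Return the set of (camera_id, timestamp) pairs to keep.

    snapshots is an iterable of (camera_id, datetime) pairs.
    """
    keep = set()
    first_in_slot = {}  # (camera_id, date, hour) -> earliest timestamp seen
    for camera_id, ts in snapshots:
        hours = allowed_hours(now - ts)
        if hours is None:
            keep.add((camera_id, ts))
        elif ts.hour in hours:
            slot = (camera_id, ts.date(), ts.hour)
            if slot not in first_in_slot or ts < first_in_slot[slot]:
                first_in_slot[slot] = ts
    keep.update((slot[0], ts) for slot, ts in first_in_slot.items())
    return keep
```

Everything not returned by `select_retained` would be a candidate for purging. The SQL queries in this thread approximate "first snapshot of the hour" with `minute(timestamp) = 0`, which is not quite the same thing.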

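Below is my current query, which selects the rows to purge. It works, but it retains every snapshot taken within the first minute of any hour.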

SELECT camera_id, 
     timestamp, 
     frame, 
     filename 
FROM snapshot_frame 
WHERE ((timestamp < subdate(now(), INTERVAL 7 DAY) 
     AND minute(timestamp) != 0) 
     OR (timestamp < subdate(now(), INTERVAL 4 WEEK) 
      AND (hour(timestamp) NOT IN (6, 
             12, 
             18) 
       OR minute(timestamp) != 0)) 
     OR (timestamp < subdate(now(), INTERVAL 3 MONTH) 
      AND (hour(timestamp) != 12 
       OR minute(timestamp) != 0))) 

How can I retain only the first snapshot of any such minute for data older than 7 days, as per the rules above?

In case it helps, here is the table/index structure:

mysql> describe snapshot_frame; 
+-----------+--------------+------+-----+---------+-------+ 
| Field     | Type         | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| camera_id | int(11)      | NO   |     | NULL    |       |
| timestamp | datetime     | NO   | MUL | NULL    |       |
| frame     | int(11)      | YES  |     | NULL    |       |
| filename  | varchar(100) | YES  | UNI | NULL    |       |
+-----------+--------------+------+-----+---------+-------+ 
4 rows in set (0.04 sec) 

mysql> show index from snapshot_frame; 
+----------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| Table          | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| snapshot_frame |          0 | filename        |            1 | filename    | A         |     3052545 |     NULL | NULL   | YES  | BTREE      |         |               |
| snapshot_frame |          1 | idx_time_camera |            1 | timestamp   | A         |     3052545 |     NULL | NULL   |      | BTREE      |         |               |
| snapshot_frame |          1 | idx_time_camera |            2 | camera_id   | A         |     3052545 |     NULL | NULL   |      | BTREE      |         |               |
+----------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
3 rows in set (0.42 sec) 

mysql> select count(*) from snapshot_frame; 
+----------+ 
| count(*) | 
+----------+ 
|  3030214 |
+----------+ 
1 row in set (18.47 sec) 

Update: I have managed to create a query that returns all the snapshots I want to retain, as per my rules:

SELECT camera_id, 
     TIMESTAMP, 
     frame, 
     filename 
FROM snapshot_frame 
WHERE TIMESTAMP >= subdate(now(), INTERVAL 7 DAY) 
UNION 
    (SELECT camera_id, 
      TIMESTAMP, 
      frame, 
      filename 
    FROM snapshot_frame 
    WHERE TIMESTAMP < subdate(now(), INTERVAL 7 DAY) 
    AND TIMESTAMP >= subdate(now(), INTERVAL 4 WEEK) 
    AND minute(TIMESTAMP) = 0 
    GROUP BY camera_id, 
      year(TIMESTAMP), 
      month(TIMESTAMP), 
      date(TIMESTAMP), 
      hour(TIMESTAMP), 
      minute(TIMESTAMP)) 
UNION 
    (SELECT camera_id, 
      TIMESTAMP, 
      frame, 
      filename 
    FROM snapshot_frame 
    WHERE TIMESTAMP < subdate(now(), INTERVAL 4 WEEK) 
    AND TIMESTAMP >= subdate(now(), INTERVAL 3 MONTH) 
    AND hour(TIMESTAMP) IN (6, 
          12, 
          18) 
    AND minute(TIMESTAMP) = 0 
    GROUP BY camera_id, 
      year(TIMESTAMP), 
      month(TIMESTAMP), 
      date(TIMESTAMP), 
      hour(TIMESTAMP), 
      minute(TIMESTAMP)) 
UNION 
    (SELECT camera_id, 
      TIMESTAMP, 
      frame, 
      filename 
    FROM snapshot_frame 
    WHERE TIMESTAMP < subdate(now(), INTERVAL 3 MONTH) 
    AND hour(TIMESTAMP) = 12 
    AND minute(TIMESTAMP) = 0 
    GROUP BY camera_id, 
      year(TIMESTAMP), 
      month(TIMESTAMP), 
      date(TIMESTAMP), 
      hour(TIMESTAMP), 
      minute(TIMESTAMP)) 

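Now I'm just trying to figure out how to invert this, so that I get a result set containing all the rows from snapshot_frame that are not returned by the query above.

Any pointers?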

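I have a solution, but not one I'm ecstatic about. Basically, I use the query above to create a temporary table of the rows I wish to retain, then run a query against all snapshots to determine which ones are not in the temporary table. I'll update my original question, since I can't answer my own question. – WhyTey

Oh boy. That's a lot of stuff. – Strawberry

Do you mean it's one big query? Or a lot of information? Either way, I'll take any suggestions on how to make it more efficient ;) – WhyTey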

Answer

The solution I'm using now is to create a temporary table of the rows I want to retain:

CREATE TEMPORARY TABLE IF NOT EXISTS retain_frames (INDEX idx_time_camera (timestamp, camera_id)) AS
SELECT camera_id, 
     timestamp, 
     frame, 
     filename 
FROM (
     (SELECT camera_id, 
       timestamp, 
       frame, 
       filename 
     FROM snapshot_frame a 
     WHERE timestamp >= subdate(now(), INTERVAL 7 DAY)) 
     UNION 
     (SELECT camera_id, 
       timestamp, 
       frame, 
       filename 
     FROM snapshot_frame b 
     WHERE timestamp < subdate(now(), INTERVAL 7 DAY) 
      AND timestamp >= subdate(now(), INTERVAL 4 WEEK) 
      AND minute(timestamp) = 0 
     GROUP BY camera_id, 
        date(timestamp), 
        hour(timestamp), 
        minute(timestamp)) 
     UNION 
     (SELECT camera_id, 
       timestamp, 
       frame, 
       filename 
     FROM snapshot_frame c 
     WHERE timestamp < subdate(now(), INTERVAL 4 WEEK) 
      AND timestamp >= subdate(now(), INTERVAL 3 MONTH) 
      AND hour(timestamp) IN (6, 
            12, 
            18) 
      AND minute(timestamp) = 0 
     GROUP BY camera_id, 
        date(timestamp), 
        hour(timestamp), 
        minute(timestamp)) 
     UNION 
     (SELECT camera_id, 
       timestamp, 
       frame, 
       filename 
     FROM snapshot_frame d 
     WHERE timestamp < subdate(now(), INTERVAL 3 MONTH) 
      AND hour(timestamp) = 12 
      AND minute(timestamp) = 0 
     GROUP BY camera_id, 
        date(timestamp), 
        hour(timestamp), 
        minute(timestamp))) e 

and then to select the snapshots that are no longer valid with the following query:

SELECT camera_id, 
     timestamp, 
     frame, 
     filename 
FROM snapshot_frame a 
WHERE NOT EXISTS 
    (SELECT camera_id, 
      timestamp, 
      frame, 
      filename 
    FROM retain_frames b 
    WHERE a.camera_id = b.camera_id 
     AND a.timestamp = b.timestamp 
     AND a.frame = b.frame) 

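The only problem is that creating the temporary table takes about 2 minutes and seems to lock the database, causing an occasional OperationalError: (1205, 'Lock wait timeout exceeded; try restarting transaction') to be thrown in my Python code by another thread that is trying to insert into the same table.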
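
One possible way to make the inserting thread tolerate this is to retry the insert when error 1205 comes back. The following is a minimal sketch rather than a tested fix: `run_with_retry` and `is_lock_wait_timeout` are hypothetical helper names, and it assumes the driver's exception carries the error code as its first argument, as in the message quoted above.

```python
import time

LOCK_WAIT_TIMEOUT = 1205  # MySQL error code from the message quoted above


def is_lock_wait_timeout(exc):
    """True if exc carries 1205 as its first argument (assumed driver behaviour)."""
    return bool(exc.args) and exc.args[0] == LOCK_WAIT_TIMEOUT


def run_with_retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on a lock wait timeout, back off and try again.

    Any other exception, or a timeout on the final attempt, is re-raised.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_lock_wait_timeout(exc):
                raise
            sleep(base_delay * 2 ** attempt)  # exponential backoff
```

The insert itself would be wrapped in a function and passed as `operation`. This only hides the symptom; shortening the time the temporary table takes to build would address the cause.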

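You've already accepted this, so there's unlikely to be any follow-up. – Strawberry

I've just unaccepted it, in case anyone has anything they'd like to add. Thanks for the tip! – WhyTey

Without any aggregate functions, GROUP BY is not only meaningless here, it is also not guaranteed to return the rows you expect. At its heart, this is a groupwise-maximum problem. It is asked so frequently that the manual devotes a whole page to it. – Strawberry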
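
As a rough illustration of the groupwise pattern mentioned in the last comment (here a groupwise minimum, since the earliest snapshot is wanted), the sketch below uses Python's sqlite3 module in place of MySQL so that it runs anywhere. That substitution is an assumption: `strftime` is SQLite's function, and in MySQL the grouping would be written with something like `date(timestamp), hour(timestamp)`. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshot_frame ("
    "camera_id INTEGER, timestamp TEXT, frame INTEGER, filename TEXT)"
)
# Made-up sample rows: camera 1 has two snapshots in the 12:00 hour.
conn.executemany(
    "INSERT INTO snapshot_frame VALUES (?, ?, ?, ?)",
    [
        (1, "2014-03-01 12:00:05", 1, "a.jpg"),
        (1, "2014-03-01 12:00:35", 2, "b.jpg"),
        (1, "2014-03-01 13:00:02", 1, "c.jpg"),
        (2, "2014-03-01 12:00:10", 1, "d.jpg"),
    ],
)

# Earliest timestamp per camera per hour; the aggregate makes the choice explicit.
FIRST_PER_HOUR = """
    SELECT camera_id, MIN(timestamp) AS first_ts
    FROM snapshot_frame
    GROUP BY camera_id, strftime('%Y-%m-%d %H', timestamp)
"""

# Rows to keep: join back to fetch the full row for each earliest timestamp.
keep = conn.execute(f"""
    SELECT s.filename
    FROM snapshot_frame AS s
    JOIN ({FIRST_PER_HOUR}) AS f
      ON f.camera_id = s.camera_id AND f.first_ts = s.timestamp
    ORDER BY s.filename
""").fetchall()

# Rows to purge: the anti-join, i.e. rows with no match in the keep set.
purge = conn.execute(f"""
    SELECT s.filename
    FROM snapshot_frame AS s
    LEFT JOIN ({FIRST_PER_HOUR}) AS f
      ON f.camera_id = s.camera_id AND f.first_ts = s.timestamp
    WHERE f.first_ts IS NULL
    ORDER BY s.filename
""").fetchall()

print(keep)   # [('a.jpg',), ('c.jpg',), ('d.jpg',)]
print(purge)  # [('b.jpg',)]
```

If two frames from one camera share the same timestamp, the join keeps both, so a tiebreak on `frame` would be needed in that case. The age tiers from the question are left out here to keep the pattern visible.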
