2016-03-21 91 views
4

Query that works only in some cases

I have a table with a structure similar to the following:

City    start_date           end_date
Paris   1995-01-01 00:00:00  1997-10-01 23:59:59
Paris   1997-10-02 00:00:00  0001-01-01 00:00:00
Paris   2013-01-25 00:00:00  0001-01-01 00:00:00
Paris   2015-04-25 00:00:00  0001-01-01 00:00:00
Berlin  2014-11-01 00:00:00  0001-01-01 00:00:00
Berlin  2014-06-01 00:00:00  0001-01-01 00:00:00
Berlin  2015-09-11 00:00:00  0001-01-01 00:00:00
Berlin  2015-10-01 00:00:00  0001-01-01 00:00:00
Milan   2001-01-01 00:00:00  0001-01-01 00:00:00
Milan   2005-10-02 00:00:00  2006-10-02 23:59:59
Milan   2006-10-03 00:00:00  2015-04-24 23:59:59
Milan   2015-04-25 00:00:00  0001-01-01 00:00:00

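The data holds a history for each city, based on start and end dates. The latest record for a city should be the one with the highest start date, and an end date of '0001-01-01 00:00:00' means that there is no end date yet.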

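I need to clean this data so that every historical record of a city ends one second before the start date of that city's next record, but only where end_date is set to '0001-01-01 00:00:00'. Where end_date already holds an actual date, the record is left alone. Also, the record with the most recent start_date for a city does not need its end_date modified.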
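To make the rule concrete: the Paris row that starts on 1997-10-02 is followed by a Paris row that starts on 2013-01-25, so its cleaned end_date is one second before that. A minimal check of the arithmetic, using the same DATE_SUB call that appears in the queries in this thread:

```sql
-- One second before the next row's start_date.
SELECT DATE_SUB('2013-01-25 00:00:00', INTERVAL 1 SECOND) AS new_end_date;
-- new_end_date: 2013-01-24 23:59:59
```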

The resulting table should look like this:

City    start_date           end_date
Paris   1995-01-01 00:00:00  1997-10-01 23:59:59
Paris   1997-10-02 00:00:00  2013-01-24 23:59:59
Paris   2013-01-25 00:00:00  2015-04-24 23:59:59
Paris   2015-04-25 00:00:00  0001-01-01 00:00:00
Berlin  2014-11-01 00:00:00  2014-05-31 23:59:59
Berlin  2014-06-01 00:00:00  2015-09-10 23:59:59
Berlin  2015-09-11 00:00:00  2015-09-30 23:59:59
Berlin  2015-10-01 00:00:00  0001-01-01 00:00:00
Milan   2001-01-01 00:00:00  2005-10-01 23:59:59
Milan   2005-10-02 00:00:00  2006-10-02 23:59:59
Milan   2006-10-03 00:00:00  2015-04-24 23:59:59
Milan   2015-04-25 00:00:00  0001-01-01 00:00:00

I tried the following script, which a user suggested in this question.

update test join 
     (select t.*, 
       (select min(start_date) 
       from test t2 
       where t2.city = t.city and 
         t2.start_date > t.start_date 
       order by t2.start_date 
       limit 1 
       ) as next_start_date 
     from test t 
     ) tt 
     on tt.city = test.city and tt.start_date = test.start_date 
    set test.end_date = date_sub(tt.next_start_date, interval 1 second) 
where test.end_date = '0001-01-01' and 
     next_start_date is not null; 

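Unfortunately, starting with the Berlin records, some end_date values are not what I expected (for example, ids 5 and 6), while the others come out as they should. This is shown in the image below:

[image]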
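One way to see where the result and the expectation diverge is to list, for every row, the two candidates for "the next record": the next row of the same city by start_date (which is what the script above computes) and the next row of the same city by id (insertion order). This is only a diagnostic sketch against the test table defined below; the column aliases next_by_start_date and next_by_id are made up for illustration.

```sql
SELECT t.id, t.city, t.start_date,
       -- next row of the same city when ordering by start_date
       (SELECT MIN(t2.start_date)
          FROM test t2
         WHERE t2.city = t.city
           AND t2.start_date > t.start_date) AS next_by_start_date,
       -- next row of the same city when ordering by id (insertion order)
       (SELECT t3.start_date
          FROM test t3
         WHERE t3.city = t.city
           AND t3.id > t.id
         ORDER BY t3.id
         LIMIT 1) AS next_by_id
FROM test t
ORDER BY t.id;
```

On the sample data the two columns differ only for the two Berlin rows that were inserted out of start_date order (2014-11-01 before 2014-06-01). The expected table above appears to match next_by_id for those rows, while the script uses next_by_start_date, which may account for the whole discrepancy.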

Here are the CREATE and INSERT statements needed to reproduce this:

CREATE TABLE `test` (
    `id` int(11) NOT NULL AUTO_INCREMENT, 
    `city` varchar(50) DEFAULT NULL, 
    `start_date` datetime DEFAULT NULL, 
    `end_date` datetime DEFAULT NULL, 
    PRIMARY KEY (`id`) 
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8; 

INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1995-01-01 00:00:00','1997-10-01 23:59:59'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1997-10-02 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2013-01-25 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2015-04-25 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-11-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-06-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-09-11 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-10-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2001-01-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2005-10-02 00:00:00','2006-10-02 23:59:59'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2006-10-03 00:00:00','2015-04-24 23:59:59'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2015-04-25 00:00:00','0001-01-01 00:00:00'); 
+0

Your `UPDATE` statement works fine with the sample data you provided. Please check [this](http://sqlfiddle.com/#!9/d879f/2) demo.

Answer

-1
-- query wanted 
UPDATE test t1 INNER JOIN 
    (SELECT *, @id := @id + 1 AS new_id 
    FROM test CROSS JOIN (SELECT @id := 0) param 
    ORDER BY city, start_date) t2 
    ON t1.city = t2.city AND t1.start_date = t2.start_date 
    INNER JOIN 
    (SELECT *, @id2 := @id2 + 1 AS new_id 
    FROM test CROSS JOIN (SELECT @id2 := 0) param 
    ORDER BY city, start_date) t3 
    ON t2.new_id + 1 = t3.new_id AND t2.city = t3.city 
SET t1.end_date = DATE_SUB(t3.start_date, INTERVAL 1 SECOND) 
WHERE t1.end_date = '0001-01-01 00:00:00'; 

Here is a complete demo.

SQL:

-- data 
CREATE TABLE `test` (
    `id` int(11) NOT NULL AUTO_INCREMENT, 
    `city` varchar(50) DEFAULT NULL, 
    `start_date` datetime DEFAULT NULL, 
    `end_date` datetime DEFAULT NULL, 
    PRIMARY KEY (`id`) 
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8; 

INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1995-01-01 00:00:00','1997-10-01 23:59:59'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','1997-10-02 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2013-01-25 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Paris','2015-04-25 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-11-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2014-06-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-09-11 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Berlin','2015-10-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2001-01-01 00:00:00','0001-01-01 00:00:00'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2005-10-02 00:00:00','2006-10-02 23:59:59'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2006-10-03 00:00:00','2015-04-24 23:59:59'); 
INSERT INTO test (city, start_date, end_date) VALUES ('Milan','2015-04-25 00:00:00','0001-01-01 00:00:00'); 
select * from test; 

-- query wanted 
UPDATE test t1 INNER JOIN 
    (SELECT *, @id := @id + 1 AS new_id 
    FROM test CROSS JOIN (SELECT @id := 0) param 
    ORDER BY city, start_date) t2 
    ON t1.city = t2.city AND t1.start_date = t2.start_date 
    INNER JOIN 
    (SELECT *, @id2 := @id2 + 1 AS new_id 
    FROM test CROSS JOIN (SELECT @id2 := 0) param 
    ORDER BY city, start_date) t3 
    ON t2.new_id + 1 = t3.new_id AND t2.city = t3.city 
SET t1.end_date = DATE_SUB(t3.start_date, INTERVAL 1 SECOND) 
WHERE t1.end_date = '0001-01-01 00:00:00'; 

select * from test; 

Output:

mysql> -- query wanted 
mysql> UPDATE test t1 INNER JOIN 
    -> (SELECT *, @id := @id + 1 AS new_id 
    -> FROM test CROSS JOIN (SELECT @id := 0) param 
    -> ORDER BY city, start_date) t2 
    -> ON t1.city = t2.city AND t1.start_date = t2.start_date 
    -> INNER JOIN 
    -> (SELECT *, @id2 := @id2 + 1 AS new_id 
    -> FROM test CROSS JOIN (SELECT @id2 := 0) param 
    -> ORDER BY city, start_date) t3 
    -> ON t2.new_id + 1 = t3.new_id AND t2.city = t3.city 
    -> SET t1.end_date = DATE_SUB(t3.start_date, INTERVAL 1 SECOND) 
    -> WHERE t1.end_date = '0001-01-01 00:00:00'; 
Query OK, 6 rows affected (0.00 sec)
Rows matched: 6 Changed: 6 Warnings: 0 

mysql> select * from test; 
+----+--------+---------------------+---------------------+
| id | city   | start_date          | end_date            |
+----+--------+---------------------+---------------------+
| 13 | Paris  | 1995-01-01 00:00:00 | 1997-10-01 23:59:59 |
| 14 | Paris  | 1997-10-02 00:00:00 | 2013-01-24 23:59:59 |
| 15 | Paris  | 2013-01-25 00:00:00 | 2015-04-24 23:59:59 |
| 16 | Paris  | 2015-04-25 00:00:00 | 0001-01-01 00:00:00 |
| 17 | Berlin | 2014-11-01 00:00:00 | 2014-05-31 23:59:59 |
| 18 | Berlin | 2014-06-01 00:00:00 | 2015-09-10 23:59:59 |
| 19 | Berlin | 2015-09-11 00:00:00 | 2015-09-30 23:59:59 |
| 20 | Berlin | 2015-10-01 00:00:00 | 0001-01-01 00:00:00 |
| 21 | Milan  | 2001-01-01 00:00:00 | 2005-10-01 23:59:59 |
| 22 | Milan  | 2005-10-02 00:00:00 | 2006-10-02 23:59:59 |
| 23 | Milan  | 2006-10-03 00:00:00 | 2015-04-24 23:59:59 |
| 24 | Milan  | 2015-04-25 00:00:00 | 0001-01-01 00:00:00 |
+----+--------+---------------------+---------------------+
12 rows in set (0.00 sec) 
+0

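Does this query assume that the id values for each city are consecutive? I noticed the `id + 1` join condition. If so, it unfortunately will not work in all cases, because a city's records may not have been inserted directly one after another.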
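A sketch of an alternative that depends neither on consecutive ids nor on row numbering, assuming MySQL 8.0 or later (window functions are not available in older versions): LEAD() picks the next start_date within each city under an explicit ORDER BY. The alias nxt is hypothetical. Which ORDER BY is right depends on what "next record" should mean; the version below orders by start_date.

```sql
UPDATE test
JOIN (
    -- For every row, the start_date of the following row of the same city.
    -- The last row of each city gets NULL.
    SELECT id,
           LEAD(start_date) OVER (PARTITION BY city
                                  ORDER BY start_date) AS next_start_date
    FROM test
) AS nxt ON nxt.id = test.id
SET test.end_date = DATE_SUB(nxt.next_start_date, INTERVAL 1 SECOND)
WHERE test.end_date = '0001-01-01 00:00:00'   -- only rows without a real end date
  AND nxt.next_start_date IS NOT NULL;        -- leave each city's latest row alone
```

Writing ORDER BY id in the OVER clause instead would follow insertion order. If two rows of one city share the same start_date, the order between them is not determined by this ORDER BY alone.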

+0

Do the parameters need to be filled in manually for each record?

+0

So the @id value does not need to be filled in?
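On the @id questions: in the answer's query the counter is not filled in by hand. The `CROSS JOIN (SELECT @id := 0)` derived table sets it to 0 once, and `@id := @id + 1` then increments it for each row that is read. A stripped-down sketch of just that numbering step (the variable name @rn and the alias init are made up here):

```sql
SELECT t.id, t.city, t.start_date,
       @rn := @rn + 1 AS new_id       -- incremented once per row read
FROM test t
CROSS JOIN (SELECT @rn := 0) AS init  -- initializes the counter to 0
ORDER BY t.city, t.start_date;
```

One caveat: the MySQL manual describes the order of evaluation for expressions involving user variables as undefined, so new_id is not guaranteed to follow the ORDER BY; it may simply follow the order in which rows are read. That could explain why the output above pairs the Berlin rows in id order rather than in start_date order.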
