Try applying this code to your task -
CREATE TABLE visits(
  user_id INT(11) NOT NULL,
  dt DATETIME DEFAULT NULL
);
INSERT INTO visits VALUES
  (1, '2011-06-30 12:11:46'),
  (1, '2011-07-01 13:16:34'),
  (1, '2011-07-01 15:22:45'),
  (1, '2011-07-01 22:35:00'),
  (1, '2011-07-02 13:45:12'),
  (1, '2011-08-01 00:11:45'),
  (1, '2011-08-05 17:14:34'),
  (1, '2011-08-05 18:11:46'),
  (1, '2011-08-06 20:22:12'),
  (2, '2011-08-30 16:13:34'),
  (2, '2011-08-31 16:13:41');
SET @i = 0;
SET @last_dt = NULL;
SET @last_user = NULL;
SELECT v.user_id,
  COUNT(DISTINCT(DATE(dt))) number_of_days,  -- distinct calendar days with a visit
  MAX(days) number_of_visits                 -- highest visit rank = number of visits
FROM
  (SELECT user_id, dt,
    -- restart at 1 for a new user; add 1 when the gap to the previous row is more than one day
    @i := IF(@last_user IS NULL OR @last_user <> user_id, 1, IF(@last_dt IS NULL OR (DATE(dt) - INTERVAL 1 DAY) > DATE(@last_dt), @i + 1, @i)) AS days,
    @last_dt := DATE(dt),
    @last_user := user_id
  FROM
    visits
  ORDER BY
    user_id, dt
  ) v
GROUP BY
  v.user_id;
----------------
Output:
+---------+----------------+------------------+
| user_id | number_of_days | number_of_visits |
+---------+----------------+------------------+
| 1 | 6 | 3 |
| 2 | 2 | 1 |
+---------+----------------+------------------+
Explanation:
To understand how it works, let's look at the subquery on its own. Here it is.
SET @i = 0;
SET @last_dt = NULL;
SET @last_user = NULL;
SELECT user_id, dt,
  @i := IF(@last_user IS NULL OR @last_user <> user_id, 1, IF(@last_dt IS NULL OR (DATE(dt) - INTERVAL 1 DAY) > DATE(@last_dt), @i + 1, @i)) AS days,
  @last_dt := DATE(dt) lt,
  @last_user := user_id lu
FROM
  visits
ORDER BY
  user_id, dt;
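As you can see, the query returns all rows and ranks them by visit number. This is a well-known ranking method based on user variables; note that the rows are ordered by the user and date fields. The counter restarts at 1 for each new user and increases by 1 whenever a row's date is more than one day after the previous row's date. The query outputs the following data set, where the days column holds the rank of the visit -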
+---------+---------------------+------+------------+----+
| user_id | dt | days | lt | lu |
+---------+---------------------+------+------------+----+
| 1 | 2011-06-30 12:11:46 | 1 | 2011-06-30 | 1 |
| 1 | 2011-07-01 13:16:34 | 1 | 2011-07-01 | 1 |
| 1 | 2011-07-01 15:22:45 | 1 | 2011-07-01 | 1 |
| 1 | 2011-07-01 22:35:00 | 1 | 2011-07-01 | 1 |
| 1 | 2011-07-02 13:45:12 | 1 | 2011-07-02 | 1 |
| 1 | 2011-08-01 00:11:45 | 2 | 2011-08-01 | 1 |
| 1 | 2011-08-05 17:14:34 | 3 | 2011-08-05 | 1 |
| 1 | 2011-08-05 18:11:46 | 3 | 2011-08-05 | 1 |
| 1 | 2011-08-06 20:22:12 | 3 | 2011-08-06 | 1 |
| 2 | 2011-08-30 16:13:34 | 1 | 2011-08-30 | 2 |
| 2 | 2011-08-31 16:13:41 | 1 | 2011-08-31 | 2 |
+---------+---------------------+------+------------+----+
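The same ranking can be sketched without user variables, which may be easier to check. This is only an illustration and assumes MySQL 8.0 or later (window functions and WITH are not available in 5.x). The names visit_days, flagged, numbered, visit_day, is_new_visit, visit_no, first_day and last_day are made up for this example and do not appear in the original query.

```sql
-- Sketch, assuming MySQL 8.0+ and the visits table defined above.
WITH visit_days AS (
    -- one row per user and calendar day
    SELECT DISTINCT user_id, DATE(dt) AS visit_day
    FROM visits
),
flagged AS (
    SELECT user_id, visit_day,
           -- 0 when the previous day for this user is exactly one day earlier;
           -- 1 when there is a longer gap or no previous day (LAG returns NULL)
           CASE
               WHEN DATEDIFF(visit_day,
                             LAG(visit_day) OVER (PARTITION BY user_id ORDER BY visit_day)) <= 1
               THEN 0
               ELSE 1
           END AS is_new_visit
    FROM visit_days
),
numbered AS (
    SELECT user_id, visit_day,
           -- running total of streak starts = rank of the visit
           SUM(is_new_visit) OVER (PARTITION BY user_id ORDER BY visit_day) AS visit_no
    FROM flagged
)
SELECT user_id, visit_no,
       MIN(visit_day) AS first_day,
       MAX(visit_day) AS last_day
FROM numbered
GROUP BY user_id, visit_no
ORDER BY user_id, visit_no;
```

For the sample data this should return four rows: visits 1 to 3 for user 1 (2011-06-30 to 2011-07-02, 2011-08-01 alone, and 2011-08-05 to 2011-08-06) and one visit for user 2 (2011-08-30 to 2011-08-31), which matches the days column above.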
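Then we group the subquery's data set by user and use two aggregate functions: 'COUNT(DISTINCT(DATE(dt)))' counts the number of days, and 'MAX(days)' gives the number of visits, since it is the maximum value of the days field from our subquery.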
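One caveat worth checking against your server version: the MySQL manual describes the evaluation order of expressions that involve user variables as undefined, and MySQL 8.0 deprecates assigning user variables inside expressions. If you are on MySQL 8.0 or later, the same two numbers can probably be obtained with LAG() instead. The following is a sketch under that assumption, not a drop-in replacement; visit_day, is_new_visit, d and f are names invented for the example.

```sql
-- Sketch, assuming MySQL 8.0+ and the visits table defined above.
SELECT user_id,
       COUNT(*)          AS number_of_days,    -- one input row per distinct day
       SUM(is_new_visit) AS number_of_visits   -- each streak start counts once
FROM (
    SELECT user_id, visit_day,
           -- 0 when the previous day for this user is exactly one day earlier;
           -- 1 when there is a longer gap or no previous day (LAG returns NULL)
           CASE
               WHEN DATEDIFF(visit_day,
                             LAG(visit_day) OVER (PARTITION BY user_id ORDER BY visit_day)) <= 1
               THEN 0
               ELSE 1
           END AS is_new_visit
    FROM (SELECT DISTINCT user_id, DATE(dt) AS visit_day FROM visits) d
) f
GROUP BY user_id;
```

On the sample rows this should give 6 days and 3 visits for user 1, and 2 days and 1 visit for user 2, the same as the output of the variable-based query.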
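That's all :)
Devart
The final result set is correct... but I can't seem to fully understand your suggestion. Could you give some more details? Thanks! As for the second point, my question is correct, as long as you don't count per user and city, as stated in my question. – linkyndy
Sorry, I thought that for "how many days a user has been in a city" the result should look like (USER_ID, COUNT_OF_DAYS). – Simon
Thank you for the details. After a few tweaks to fit my actual database tables, your query works like a charm. Thanks again! – linkyndy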