2017-05-08
1

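Select each unique ID with the latest timestamp

I have a table in BigQuery that contains an ID, a timestamp, and a distance, and I want to select one record per ID: the one with the latest timestamp.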

For example, the table looks like this:

ID|timestamp|distance 
A|100|2 
A|90|3 
B|110|5 
D|100|4 
A|80|2 
B|10|2 

The query should return something like:

A|100|2 
B|110|5 
D|100|4 

A query that works in PostgreSQL looks like this, but is there no DISTINCT ON in BigQuery?

SELECT * FROM (
SELECT DISTINCT ON (ID) 
id, timestamp, distance 
FROM ranking 
ORDER BY ID, timestamp DESC 
) AS latest_dtg 
ORDER BY distance 

Answers

0

How about this?

SELECT a.* 
FROM yourtable AS a 
INNER JOIN (
SELECT id, MAX(timestamp) AS newesttimestamp 
FROM yourtable 
GROUP BY id 
) AS b 
ON a.id = b.id AND a.timestamp = b.newesttimestamp 
ORDER BY a.id 
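
One thing to watch with the join above: if two rows for the same id share the maximum timestamp, both come back. A rough alternative, assuming BigQuery standard SQL and the same placeholder table name `yourtable`, is to number the rows per id and keep only the first. The column alias `rn` is made up for this sketch, and the final ORDER BY distance just mirrors the outer query in the question.

```sql
#standardSQL
-- Sketch only: `yourtable` is the placeholder name used in the answer above,
-- and `rn` is an illustrative alias, not something from the question.
SELECT id, timestamp, distance
FROM (
  SELECT
    id,
    timestamp,
    distance,
    -- Number the rows within each id, newest timestamp first.
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY timestamp DESC) AS rn
  FROM yourtable
)
WHERE rn = 1        -- keep exactly one row per id
ORDER BY distance
```

If timestamps can tie within an id, the row that gets rn = 1 is arbitrary unless you add a tie-breaker to the window's ORDER BY (for example `timestamp DESC, distance`).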
1

Here's an idea:

#standardSQL 
WITH ranking AS 
(SELECT 'A' id, 100 ts, 2 distance UNION ALL 
SELECT 'A', 90, 3 UNION ALL 
SELECT 'B', 110, 5 UNION ALL 
SELECT 'D', 100, 4 UNION ALL 
SELECT 'B', 10, 2 UNION ALL 
SELECT 'A', 80, 2) 
SELECT id, ARRAY_AGG(STRUCT(ts, distance) ORDER BY ts DESC LIMIT 1)[SAFE_OFFSET(0)] 
FROM ranking 
GROUP BY id 
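
The query above returns the latest row as a single unnamed STRUCT column next to id. If flat columns are easier to work with, something like the following should do it, assuming the same sample data. The alias `latest` is introduced here purely for illustration.

```sql
#standardSQL
WITH ranking AS (
  SELECT 'A' AS id, 100 AS ts, 2 AS distance UNION ALL
  SELECT 'A', 90, 3 UNION ALL
  SELECT 'B', 110, 5 UNION ALL
  SELECT 'D', 100, 4 UNION ALL
  SELECT 'B', 10, 2 UNION ALL
  SELECT 'A', 80, 2
)
-- Unpack the struct fields into ordinary columns.
SELECT id, latest.ts, latest.distance
FROM (
  SELECT
    id,
    -- `latest` is an illustrative alias for the newest (ts, distance) pair per id.
    ARRAY_AGG(STRUCT(ts, distance) ORDER BY ts DESC LIMIT 1)[SAFE_OFFSET(0)] AS latest
  FROM ranking
  GROUP BY id
)
ORDER BY distance
```

For the sample rows this should give one row each for A (100, 2), D (100, 4) and B (110, 5), in that order.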
2

Below is for BigQuery Standard SQL:

#standardSQL 
SELECT row.* FROM (
    SELECT ARRAY_AGG(r ORDER BY timestamp DESC LIMIT 1)[OFFSET(0)] AS row 
    FROM ranking AS r 
    GROUP BY id 
) 

You can play with and test this using the dummy data from your question, as below:

#standardSQL 
WITH ranking AS (
    SELECT 'A' AS id, 100 AS timestamp, 2 AS distance UNION ALL 
    SELECT 'A', 90, 3 UNION ALL 
    SELECT 'B', 110, 5 UNION ALL 
    SELECT 'D', 100, 4 UNION ALL 
    SELECT 'B', 10, 2 UNION ALL 
    SELECT 'A', 80, 2 
) 
SELECT row.* FROM (
    SELECT ARRAY_AGG(r ORDER BY timestamp DESC LIMIT 1)[OFFSET(0)] AS row 
    FROM ranking AS r 
    GROUP BY id 
) 