Query to get the record with the closest timestamp for each unique combination of two columns

+-------+----------------------+----------+------------------+ 
| isbn | book_container_id | shelf_id | update_time | 
+-------+----------------------+----------+------------------+ 
| 555 |     6 | shelf100 | 11/15/2015 19:10 | 
| 123 |     1 | shelf1 | 11/28/2015 8:00 | 
| 555 |     4 | shelf5 | 11/28/2015 9:10 | 
| 212 |     2 | shelf2 | 11/29/2015 8:10 | 
| 555 |     6 | shelf9 | 11/30/2015 22:10 | 
| 321 |     8 | shelf7 | 11/30/2015 8:10 | 
| 555 |     4 | shelf33 | 12/1/2015 7:00 | 
+-------+----------------------+----------+------------------+

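Suppose I have a table like the one above (PostgreSQL) named bookshelf_configuration. Given an ISBN and a timestamp, I want to find the closest record (only at or before that timestamp) for each unique combination of isbn and book_container_id.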

So, if I am looking at isbn '555' with a timestamp of '12/1/2015 7:00', I should get back:

+-------+----------------------+----------+------------------+ 
| isbn | book_container_id | shelf_id | update_time | 
+-------+----------------------+----------+------------------+ 
| 555 |     6 | shelf9 | 11/30/2015 22:10 | 
| 555 |     4 | shelf33 | 12/1/2015 7:00 | 
+-------+----------------------+----------+------------------+

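My knowledge of SQL is very basic. I have a query that works if I only need to consider isbn, but I need some help understanding how to do this for the combination (isbn, book_container_id).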


2015-12-02 ryoaska

A typical use case for DISTINCT ON:

SELECT DISTINCT ON (book_container_id) 
     isbn, book_container_id, shelf_id, update_time 
FROM bookshelf_configuration 
WHERE isbn = 555 
AND update_time <= '2015-12-01 07:00' -- ISO 8601 format 
ORDER BY book_container_id, update_time DESC;

This assumes update_time is defined NOT NULL; otherwise you have to add NULLS LAST. Detailed explanation:

Select first row in each GROUP BY group?

Depending on cardinalities and value frequencies, there may be faster query styles:

Optimize GROUP BY query to retrieve latest record per user
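As a rough illustration of what an alternative query style can look like, here is a minimal sketch using a LATERAL subquery (PostgreSQL 9.3 or later). It assumes a separate table book_container with one row per book_container_id, which is hypothetical and does not appear in the question. Whether this is actually faster depends on the data distribution and the available indexes, so it is best treated as something to benchmark, not as a recommendation.

```sql
-- Sketch only: book_container is a hypothetical lookup table
-- holding one row per book_container_id.
SELECT b.isbn, b.book_container_id, b.shelf_id, b.update_time
FROM   book_container c
CROSS  JOIN LATERAL (
   -- For each container, fetch only the newest qualifying row.
   SELECT isbn, book_container_id, shelf_id, update_time
   FROM   bookshelf_configuration
   WHERE  isbn = 555
   AND    book_container_id = c.book_container_id
   AND    update_time <= '2015-12-01 07:00'
   ORDER  BY update_time DESC
   LIMIT  1
   ) b;
```

Containers without any matching row for the given isbn produce no output, because CROSS JOIN LATERAL drops rows for which the subquery returns nothing.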

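Either way, a multicolumn index on (isbn, book_container_id, update_time DESC) is the key to making this fast for tables of non-trivial size. The sort order should match the query (or be its complete inversion). If you add NULLS LAST to the query, add it to the index as well.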
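The index described above could be created roughly as follows. The index names are made up for this sketch; only the table and column names come from the question.

```sql
-- Multicolumn index whose sort order matches the query's ORDER BY.
-- The index name is hypothetical.
CREATE INDEX bookshelf_configuration_isbn_container_time_idx
ON bookshelf_configuration (isbn, book_container_id, update_time DESC);

-- Variant for a query that orders by update_time DESC NULLS LAST:
-- CREATE INDEX bookshelf_configuration_isbn_container_time_nl_idx
-- ON bookshelf_configuration (isbn, book_container_id, update_time DESC NULLS LAST);
```

Only one of the two is needed, whichever matches the ORDER BY actually used.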

Aside: it is better to use ISO 8601 format for all date/time constants, since that is unambiguous with any locale or datestyle setting. Related reading:

PostgreSQL: between with datetime
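A small sketch of the difference, for illustration: an ISO 8601 literal is read the same way under any DateStyle setting, while a literal such as '12/1/2015 7:00' is read as month/day or day/month depending on that setting.

```sql
-- ISO 8601: always 1 December 2015, 07:00, whatever DateStyle is set to.
SELECT timestamp '2015-12-01 07:00';

-- Shows the current setting (for example "ISO, MDY" or "ISO, DMY"),
-- which decides how a literal like '12/1/2015 7:00' would be interpreted.
SHOW datestyle;
```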


2015-12-03 05:38:30

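I tried this; unfortunately, it looks like Redshift doesn't support DISTINCT ON (I should have mentioned that, but I didn't realize it would make a difference). I used a version of JamieD77's solution, but I am accepting this answer because of the full explanation and the links with information about both solutions. Thanks for the links, they helped me understand things better! Re: the ISO 8601 suggestion, we do use ISO 8601 format for date strings - I originally wrote my example table in Excel, so I think it got mangled there (or it was just user error...) – ryoaska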

There is a thing called Row_Number that can help you here.

Select * 
From (
    Select *, 
      row_number() OVER (partition by isbn, book_container_id order by update_time desc) rn 
    From bookshelf_configuration 
    Where isbn = 555 and update_time <= '12/1/2015 7:00' 
) q 
Where q.rn = 1
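For comparison, here is a sketch of the same row_number() approach with an ISO 8601 timestamp literal and an explicit column list, so that the helper column rn does not appear in the result. It uses only the table and column names from the question.

```sql
-- Number the rows within each (isbn, book_container_id) group,
-- newest first, then keep only the newest row of every group.
SELECT isbn, book_container_id, shelf_id, update_time
FROM  (
   SELECT isbn, book_container_id, shelf_id, update_time,
          row_number() OVER (PARTITION BY isbn, book_container_id
                             ORDER BY update_time DESC) AS rn
   FROM   bookshelf_configuration
   WHERE  isbn = 555
   AND    update_time <= '2015-12-01 07:00'
   ) q
WHERE  rn = 1;
```

Against the sample table in the question, this keeps the shelf9 row for container 6 and the shelf33 row for container 4.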


2015-12-02 20:22:17 JamieD77

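Thanks! This worked for me - but I feel I should accept the other answer, because its links contain more information and compare this with other approaches. I did want to thank you, though - this unblocked me and let me move forward :) – ryoaska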
