Exclude subsequent duplicate records in Redshift

I have a simple SQL problem that I can't solve (I'm using Amazon Redshift).

Say I have the following example:

id, type, channel, date, column1, column2, column3, column4 
1, visit, seo, 07/08/2017: 11:11:22 
1, hit, seo, 07/08/2017: 11:12:34 
1, hit, seo, 07/08/2017: 11:13:22 
1, visit, sem, 07/08/2017: 11:15:11 
1, scarf, display, 07/08/2017: 11:15:45 
1, hit, display, 07/08/2017: 11:15:37 
1, hit, seo, 07/08/2017: 11:18:22 
1, hit, display, 07/08/2017: 11:18:23 
1, hit, referal, 07/08/2017: 11:19:55

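I want to select all visits (in my table's logic, a visit marks the start of each sequence of rows for a given id) and exclude rows that repeat the channel of the row directly before them. For my example, the query should return: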

1, visit, seo, 07/08/2017: 11:11:22 
**1, hit, seo, 07/08/2017: 11:12:34** (exclude because it follows seo and it's not a visit) 
**1, hit, seo, 07/08/2017: 11:13:22** (exclude because it follows seo and it's not a visit) 
1, visit, sem, 07/08/2017: 11:15:11 (include, new channel) 
1, scarf, display, 07/08/2017: 11:15:45 (include, new channel) 
**1, hit, display, 07/08/2017: 11:15:37** (exclude because it follows display and it's not a visit) 
1, hit, seo, 07/08/2017: 11:18:22 (include because it doesn't follow seo directly, even if seo is already present) 
1, hit, display, 07/08/2017: 11:18:23 (include because it doesn't follow display directly, even if display is already present) 
1, hit, referal, 07/08/2017: 11:19:55 (include, new channel)

I tried using a row number (since I'm working on Redshift):

select type, date, id, ROW_NUMBER() OVER (PARTITION BY id, channel ORDER BY date) as rn

and then added a filter:

Where type='visit' or rn=1

However, this doesn't solve the problem, because it won't return rows 7 and 8:

1, hit, seo, 07/08/2017: 11:18:22 (will be rn=4 for 'id=1, channel=seo' combination) 
1, hit, display, 07/08/2017: 11:18:23 (will be rn=3 for 'id=1, channel=display' combination)

Can anyone give me some pointers so that I can solve this?


2017-08-07 Amine

You can use lag and select only the rows where the previous channel is different or the type is visit:

select * from (
    select * , 
     -- channel of the previous row for the same id, in date order
     lag(channel) over (partition by id order by date) prev_channel 
    from mytable 
) t where prev_channel <> channel or type = 'visit' or prev_channel is null
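
To check this kind of query against the sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for Redshift here, which is an assumption (window functions need SQLite 3.25 or later). The function name `kept_rows` and the choice to store the timestamps as the sample's text strings are illustrative, not part of the original post.

```python
import sqlite3

# Sample rows copied from the question. Timestamps are kept as the original
# strings; they all share one date, so text ordering matches time ordering.
ROWS = [
    (1, "visit", "seo", "07/08/2017: 11:11:22"),
    (1, "hit", "seo", "07/08/2017: 11:12:34"),
    (1, "hit", "seo", "07/08/2017: 11:13:22"),
    (1, "visit", "sem", "07/08/2017: 11:15:11"),
    (1, "scarf", "display", "07/08/2017: 11:15:45"),
    (1, "hit", "display", "07/08/2017: 11:15:37"),
    (1, "hit", "seo", "07/08/2017: 11:18:22"),
    (1, "hit", "display", "07/08/2017: 11:18:23"),
    (1, "hit", "referal", "07/08/2017: 11:19:55"),
]

# Keep a row when its channel differs from the previous row's channel,
# when it is a visit, or when it is the first row for its id.
QUERY = """
SELECT id, type, channel, date FROM (
    SELECT *,
           LAG(channel) OVER (PARTITION BY id ORDER BY date) AS prev_channel
    FROM mytable
) t
WHERE prev_channel <> channel OR type = 'visit' OR prev_channel IS NULL
ORDER BY date
"""


def kept_rows(rows):
    """Load rows into a throwaway SQLite table and return the filtered rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER, type TEXT, channel TEXT, date TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in kept_rows(ROWS):
        print(row)
```

On the sample this keeps six rows. Note that the 11:15:37 display row sorts before the 11:15:45 one, so when ordered by date it is the 11:15:37 row that is kept and the 11:15:45 row that is dropped, the reverse of the annotation in the question, which lists those two rows in the opposite order.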


2017-08-07 22:35:37 FuzzyTree

@FuzzyTree Hi, I didn't know about this window function, but it clearly solves my problem. Thanks a lot (y) – Amine
