2016-04-23 80 views
0

SQL/Hive query to find the third distinct item bought by all users?

I have a table structured as follows:

UserID  itemName  action
------------------------
1       a         bought
2       b         viewed
3       c         bought
1       b         bought
2       c         bought
1       c         bought
3       b         viewed

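Now I want to find the third item (ranked by purchase count) among the distinct items that were bought (action = 'bought') by all users. Could you help me solve this? Sorry for the poor table formatting.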

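How do you define the purchase order of the items bought? Do you want to identify the "third" among all the items that every user bought, or do you want to know, for each user, the "third" item they bought? – collapsar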

You need another field containing a timestamp. This may be a duplicate of [this question](http://stackoverflow.com/questions/400712/how-to-do-equivalent-of-limit-distinct) – 4castle

@collapsar By count, i.e. the item with the third-highest purchase count among the items that everyone bought – user3396729

Answers

1

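From your description, I think something like this should be about right. First select the top 3 by count, grouping by item and sorting by count descending. Then select the top 1 from that set, sorting by count ascending. Keep in mind that I'm not 100% familiar with HiveSQL, but this SQL code should be very close to standard: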

SELECT TOP 1 itemName 
FROM (
     -- Top 3 items by number of purchases 
     SELECT TOP 3 itemName, COUNT(*) AS boughtCount 
     FROM MyTable 
     WHERE action = 'bought' 
     GROUP BY itemName 
     ORDER BY boughtCount DESC 
    ) AS TopThree 
ORDER BY boughtCount 
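
As a rough check of the two-step idea above, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same logic with `LIMIT` in place of `TOP`. SQLite is only an assumed stand-in for Hive or MSSQL, the helper name `third_most_bought` is hypothetical, and the `itemName` tie-breaker is an addition: `a` and `b` are both bought once, so without it the result would be ambiguous.

```python
import sqlite3

# Sample rows copied from the question: (UserID, itemName, action).
ROWS = [
    (1, "a", "bought"), (2, "b", "viewed"), (3, "c", "bought"),
    (1, "b", "bought"), (2, "c", "bought"), (1, "c", "bought"),
    (3, "b", "viewed"),
]

def third_most_bought(rows):
    """Return the last of the top three bought items (None if nothing was bought)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (UserID INTEGER, itemName TEXT, action TEXT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    # Inner query: top 3 items by purchase count, ties broken by itemName.
    # Outer query: reverse that order and keep the first row, i.e. the
    # lowest-ranked of the three.
    query = """
        SELECT itemName
        FROM (
            SELECT itemName, COUNT(*) AS boughtCount
            FROM MyTable
            WHERE action = 'bought'
            GROUP BY itemName
            ORDER BY boughtCount DESC, itemName ASC
            LIMIT 3
        ) AS TopThree
        ORDER BY boughtCount ASC, itemName DESC
        LIMIT 1
    """
    row = conn.execute(query).fetchone()
    conn.close()
    return row[0] if row else None

print(third_most_bought(ROWS))
```

Note that when fewer than three items have been bought, this approach still returns the lowest-ranked item it finds rather than nothing.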

Edit: following the clarification in the comments:

Edit 2: this was tested and works in MSSQL; some of the syntax may need adjusting for HiveSQL.

SELECT TOP 1 itemName 
FROM (
     -- Get the top 3 items that have as many ItemCountByUser entries as there are distinct 
     -- userIds in the table, grouped by item and sorted by total bought descending. 
     SELECT TOP 3 itemName, SUM(boughtCount) AS totalBought 
     FROM (
       -- Count the purchases for each (item, user) pair 
       SELECT itemName, userId, COUNT(*) AS boughtCount 
       FROM MyTable 
       WHERE action = 'bought' 
       GROUP BY itemName, userId 
      ) AS ItemCountByUser 
     GROUP BY itemName 
     HAVING COUNT(*) = (SELECT COUNT(*) FROM (SELECT DISTINCT userId FROM MyTable) AS UserCount) 
     ORDER BY totalBought DESC 
    ) AS MostBought 
ORDER BY totalBought 
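
The "bought by every user" filter can also be sketched with `COUNT(DISTINCT ...)` in a single `HAVING` clause. The example below is an illustration under assumptions, not the answer's tested MSSQL query: it runs on in-memory SQLite, the helper name `third_item_bought_by_all` and the `EXTRA_ROWS` data set are hypothetical, and `LIMIT 1 OFFSET 2` is used so that nothing is returned when fewer than three items qualify.

```python
import sqlite3

# Sample rows copied from the question: (UserID, itemName, action).
QUESTION_ROWS = [
    (1, "a", "bought"), (2, "b", "viewed"), (3, "c", "bought"),
    (1, "b", "bought"), (2, "c", "bought"), (1, "c", "bought"),
    (3, "b", "viewed"),
]

# Hypothetical extra data: a, b and c are bought by both users, d by only one.
EXTRA_ROWS = (
    [(1, "a", "bought")] * 3 + [(1, "b", "bought")] * 2
    + [(1, "c", "bought"), (1, "d", "viewed")]
    + [(2, "a", "bought"), (2, "b", "bought"), (2, "c", "bought"), (2, "d", "bought")]
)

def third_item_bought_by_all(rows):
    """Third most-bought itemName among items every user bought, else None."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (UserID INTEGER, itemName TEXT, action TEXT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    query = """
        SELECT itemName
        FROM MyTable
        WHERE action = 'bought'
        GROUP BY itemName
        -- keep only items whose distinct buyers cover every user in the table
        HAVING COUNT(DISTINCT UserID) = (SELECT COUNT(DISTINCT UserID) FROM MyTable)
        ORDER BY COUNT(*) DESC, itemName ASC
        -- skip the two best sellers and take the next one, if any
        LIMIT 1 OFFSET 2
    """
    row = conn.execute(query).fetchone()
    conn.close()
    return row[0] if row else None

print(third_item_bought_by_all(QUESTION_ROWS))
print(third_item_bought_by_all(EXTRA_ROWS))
```

With the question's sample rows only `c` is bought by all three users, so no third item exists there; the hypothetical data set is included to show a case where one does.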

You don't need to order the outer select, because it only contains one record. – 4castle

But this doesn't guarantee that the item it returns was bought by everyone. I want an item that every user has bought. – user3396729

True. It is mainly there to make the logical process clear for the OP. –

0

My understanding is that you want to show the itemNames that have been bought 3 or more times by any user...?

SELECT totals.itemName FROM 
    (SELECT 
     flags.itemName AS itemName, 
     SUM(flags.bought) AS boughtCount 
    FROM 
     -- Flag each row: 1 if it is a purchase, 0 otherwise 
     (SELECT 
      itemName, 
      CASE 
       WHEN (action = 'bought') 
        THEN (1) 
       ELSE (0) 
      END AS bought 
     FROM yourTableName) AS flags 
    GROUP BY 
     flags.itemName) AS totals 
WHERE totals.boughtCount > 2; 
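
The conditional-count idea above can be tried out on the question's sample rows with the sketch below. It assumes an in-memory SQLite database rather than Hive, folds the nested subqueries into a single `HAVING` clause, and uses a hypothetical helper name `items_bought_at_least`; the placeholder table name `yourTableName` is kept from the answer.

```python
import sqlite3

# Sample rows copied from the question: (UserID, itemName, action).
ROWS = [
    (1, "a", "bought"), (2, "b", "viewed"), (3, "c", "bought"),
    (1, "b", "bought"), (2, "c", "bought"), (1, "c", "bought"),
    (3, "b", "viewed"),
]

def items_bought_at_least(rows, min_count=3):
    """Return itemNames with at least min_count 'bought' rows, sorted by name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTableName (UserID INTEGER, itemName TEXT, action TEXT)")
    conn.executemany("INSERT INTO yourTableName VALUES (?, ?, ?)", rows)
    # The CASE expression turns each row into 1 (purchase) or 0 (anything else),
    # so SUM gives the number of purchases per item.
    query = """
        SELECT itemName
        FROM yourTableName
        GROUP BY itemName
        HAVING SUM(CASE WHEN action = 'bought' THEN 1 ELSE 0 END) >= ?
        ORDER BY itemName
    """
    names = [row[0] for row in conn.execute(query, (min_count,))]
    conn.close()
    return names

print(items_bought_at_least(ROWS))
```

Here `>= 3` is the same condition as the answer's `> 2`; note that this counts purchases across all users and does not check that every user bought the item.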

I haven't tested this...

Please let me know if this isn't the solution you need, so I can explore other options.

0

Please try the following query, which lists the items bought by all users and picks the one in the 3rd-highest position:

from (
  select a.itemname, count(a.action) as boughtcount
  from data a
  join (select distinct userid as id from data where action = 'bought') b
    on a.userid = b.id
  where a.action = 'bought'
  group by a.itemname
  order by boughtcount desc
  limit 3
) t
select t.itemname, t.boughtcount
order by t.boughtcount
limit 1;
