2013-04-18 47 views

Overlap in SQL

I have the following data table:

User#  App 
1  A 
1  B 
2  A 
2  B 
3  A 

I want to know the overlap between apps in terms of distinct users, with a result that looks like this:

App1 App2 DistinctUseroverlapped 
A  A  3 
A  B  2 
B  B  2 

So the final result means that 3 distinct users use App A, 2 users use both App A and App B, and 2 users use App B.

Keep in mind that there are many apps and many users. How can I do this in SQL?

Answers


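My solution starts by generating all pairs of apps that are of interest. This is the driver subquery.

It then joins in the original data for each app of the pair.

Finally, it uses count(distinct) to count the distinct users that match between the two lists.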

    select pairs.app1, pairs.app2,
           COUNT(distinct case when tleft.user = tright.user then tleft.user end) as NumCommonUsers
    from (select t1.app as app1, t2.app as app2
          from (select distinct app
                from t
               ) t1 cross join
               (select distinct app
                from t
               ) t2
          where t1.app <= t2.app
         ) pairs left outer join
         t tleft
         on tleft.app = pairs.app1 left outer join
         t tright
         on tright.app = pairs.app2
    group by pairs.app1, pairs.app2
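
To see the three steps on the question's sample rows, here is a minimal sketch that runs the driver subquery and then the full query. It uses SQLite through Python's sqlite3 module purely as a convenient stand-in (an assumption; the thread is about Vertica), and the names PAIRS_SQL and OVERLAP_SQL are only for illustration.

```python
import sqlite3

# Sample rows from the question, loaded into a table named t as in the query.
# SQLite is an assumption here; it only serves as a quick test bed.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (user integer, app text)")
conn.executemany(
    "insert into t (user, app) values (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "B"), (3, "A")],
)

# Step 1: the driver subquery, one row per unordered pair of apps.
PAIRS_SQL = """
select t1.app as app1, t2.app as app2
from (select distinct app from t) t1 cross join
     (select distinct app from t) t2
where t1.app <= t2.app
"""

# Steps 2 and 3: join the users of each app onto the pair, then count the
# distinct users that appear on both sides.
OVERLAP_SQL = f"""
select pairs.app1, pairs.app2,
       count(distinct case when tleft.user = tright.user then tleft.user end) as NumCommonUsers
from ({PAIRS_SQL}) pairs left outer join
     t tleft
     on tleft.app = pairs.app1 left outer join
     t tright
     on tright.app = pairs.app2
group by pairs.app1, pairs.app2
order by pairs.app1, pairs.app2
"""

pairs = conn.execute(PAIRS_SQL + " order by app1, app2").fetchall()
overlap = conn.execute(OVERLAP_SQL).fetchall()
print(pairs)
print(overlap)
```

On this data the second result matches the table in the question: 3 for (A, A), 2 for (A, B) and 2 for (B, B).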

You can also move the conditional comparison out of the count and into the join, and then just use count(distinct):

    select pairs.app1, pairs.app2,
           -- count the right-hand column: tleft.user stays non-null
           -- even when no tright row matches
           COUNT(distinct tright.user) as NumCommonUsers
    from (select t1.app as app1, t2.app as app2
          from (select distinct app
                from t
               ) t1 cross join
               (select distinct app
                from t
               ) t2
          where t1.app <= t2.app
         ) pairs left outer join
         t tleft
         on tleft.app = pairs.app1 left outer join
         t tright
         on tright.app = pairs.app2 and
            tright.user = tleft.user
    group by pairs.app1, pairs.app2
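
One detail is worth checking when the user comparison sits in the join: a left outer join keeps tleft rows that have no partner in tright, so it matters which column is counted. The sketch below computes both counts side by side on the question's sample rows. SQLite via Python's sqlite3 is an assumption used only as a test bed, and the column names left_count and right_count are hypothetical.

```python
import sqlite3

# Sample rows from the question in a table named t (SQLite is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (user integer, app text)")
conn.executemany(
    "insert into t (user, app) values (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "B"), (3, "A")],
)

# Same join shape as the query above, with both candidate counts selected.
JOIN_SQL = """
select pairs.app1, pairs.app2,
       count(distinct tleft.user)  as left_count,
       count(distinct tright.user) as right_count
from (select t1.app as app1, t2.app as app2
      from (select distinct app from t) t1 cross join
           (select distinct app from t) t2
      where t1.app <= t2.app
     ) pairs left outer join
     t tleft
     on tleft.app = pairs.app1 left outer join
     t tright
     on tright.app = pairs.app2 and
        tright.user = tleft.user
group by pairs.app1, pairs.app2
order by pairs.app1, pairs.app2
"""

rows = conn.execute(JOIN_SQL).fetchall()
for row in rows:
    print(row)
```

For the pair (A, B), user 3 has app A but not app B, so the row survives the outer join with tright.user set to NULL. Counting the left column therefore gives 3, while counting the right column gives the overlap of 2.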

I prefer the first approach, because it is more explicit about what is being counted.

This is standard SQL, so it should work on Vertica.


Great, Gordon.... I will try it and let you know. Thanks – user1570210


This works in Vertica 6:

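    with tab as
        (select 1 as user, 'A' as App
        union select 1 as user, 'B' as App
        union select 2 as user, 'A' as App
        union select 2 as user, 'B' as App
        union select 3 as user, 'A' as App
        )
        , apps as
        (select distinct App from tab)
        -- tab1 restricts the count to users who also use APP1;
        -- without it the query only counts the users of APP2
        select apps.app as APP1, tab.app as APP2, count(distinct tab.user)
        from tab, apps, tab tab1
        where tab.app >= apps.app
          and tab1.app = apps.app
          and tab1.user = tab.user
        group by 1, 2
        order by 1, 2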
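
A quick way to sanity-check any overlap query is to run it on data where the apps do not share all their users. The sketch below is not the query above: it expresses the overlap as a plain self-join on user, a different formulation, and compares the result with a brute-force count over Python sets. SQLite via sqlite3 and the extra rows for users 4 and 5 are assumptions made for the test. Note that the self-join omits pairs with no common users instead of reporting 0.

```python
import sqlite3
from itertools import combinations_with_replacement

# Question's sample rows plus two hypothetical users: 4 has only app B,
# 5 has only app C, so the per-app user sets genuinely differ.
rows = [(1, "A"), (1, "B"), (2, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "C")]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (user integer, app text)")
conn.executemany("insert into t (user, app) values (?, ?)", rows)

# Self-join on user: each joined row is one user who has both app1 and app2.
SELF_JOIN_SQL = """
select a.app as app1, b.app as app2, count(distinct a.user) as num_common
from t a join t b
  on b.user = a.user and a.app <= b.app
group by a.app, b.app
order by a.app, b.app
"""
result = conn.execute(SELF_JOIN_SQL).fetchall()

# Brute-force reference: intersect the user sets of every pair of apps.
users_by_app = {}
for user, app in rows:
    users_by_app.setdefault(app, set()).add(user)

expected = []
for app1, app2 in combinations_with_replacement(sorted(users_by_app), 2):
    common = users_by_app[app1] & users_by_app[app2]
    if common:  # the self-join never emits empty pairs
        expected.append((app1, app2, len(common)))

print(result)
print(result == expected)
```

Here (A, B) is 2 even though app B has 3 users, which is the case that distinguishes a true overlap count from a count of the second app's users.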