2013-01-23 237 views
2

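Check whether two tables are equal

I have two tables with the same structure.
How can I check whether all the rows in the two tables are equal?
That is, every row in the first table exists in the other one, and vice versa.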

Answers

0

This is an interesting one. I don't know whether there is a better or simpler way to do this, but something like this might work:

Suppose you have two tables, t1 and t2, and each of them has two columns, c1 and c2:

create view t1_counts 
as select c1, c2, count(*) as num 
from t1 
group by c1, c2; 

create view t2_counts 
as select c1, c2, count(*) as num 
from t2 
group by c1, c2; 

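-- Report rows whose counts differ, plus rows present in only one table.
-- After the outer join the missing side is NULL, and NULL != x is never true, so test for NULL explicitly.
select t1_counts.c1, t1_counts.c2, t1_counts.num as t1_num, t2_counts.c1, t2_counts.c2, t2_counts.num as t2_num 
from t1_counts full outer join t2_counts on (t1_counts.c1 = t2_counts.c1 and t1_counts.c2 = t2_counts.c2) 
where t1_counts.num is null or t2_counts.num is null or t1_counts.num != t2_counts.num; 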

If the two tables are equal, the output will be empty.
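To make the idea easier to see outside SQL, here is a minimal Python sketch of the same comparison. It assumes both tables fit in memory as lists of tuples; `count_mismatches` and the sample rows are hypothetical names made up for illustration. `Counter` stands in for the "group by all columns, count(*)" views, and the union of keys stands in for the full outer join.

```python
from collections import Counter


def count_mismatches(rows_a, rows_b):
    """Map each differing row to (count in rows_a, count in rows_b)."""
    # Counter does the "group by all columns, count(*)" step.
    counts_a, counts_b = Counter(rows_a), Counter(rows_b)
    # The union of keys mimics the full outer join; a missing row counts as 0.
    return {
        row: (counts_a[row], counts_b[row])
        for row in counts_a.keys() | counts_b.keys()
        if counts_a[row] != counts_b[row]
    }


# Hypothetical sample data: same rows in a different order, then a changed copy.
t1 = [(1, "a"), (1, "a"), (2, "b")]
t2 = [(2, "b"), (1, "a"), (1, "a")]
t3 = [(1, "a"), (2, "b"), (3, "c")]

same = count_mismatches(t1, t2)       # empty dict: the tables are equal
different = count_mismatches(t1, t3)  # rows whose counts disagree
```

An empty result means the two tables hold the same rows the same number of times, which matches the empty output of the query above.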

1

The solution from Jeff's blog also applies to Hive: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx

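"The basic idea is: if we GROUP the union of the two tables on all columns, then if the two tables are identical, all groups will result in a COUNT(*) of 2. But for any row that is not completely matched on every column in the GROUP BY clause, the COUNT(*) will be 1, and those are the ones we want. We also need to add a column to each part of the UNION to indicate which table each row comes from, otherwise there is no way to tell which row comes from which table."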
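A small runnable sketch of that basic idea, using Python's built-in sqlite3 module rather than Hive so that it can be tried anywhere. The table names `table_a` and `table_b`, the columns, and the sample rows are assumptions made up for illustration, and the sketch assumes neither table contains duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, col1 TEXT);
    CREATE TABLE table_b (id INTEGER, col1 TEXT);
    INSERT INTO table_a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO table_b VALUES (1, 'x'), (2, 'CHANGED'), (4, 'w');
""")

# Tag each row with its source table, union both tables, group on all data
# columns. A group of size 2 matched in both tables; size 1 is a difference.
diff_sql = """
    SELECT MIN(table_name) AS source, id, col1
    FROM (
        SELECT 'A' AS table_name, id, col1 FROM table_a
        UNION ALL
        SELECT 'B' AS table_name, id, col1 FROM table_b
    ) AS tmp
    GROUP BY id, col1
    HAVING COUNT(*) = 1
    ORDER BY id, source
"""
differences = conn.execute(diff_sql).fetchall()
for row in differences:
    print(row)
```

Each returned row names the table it was found in, so a changed row shows up twice: once with its value in A and once with its value in B.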

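An improved version that handles duplicates was posted as a comment: http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx#3155 (the code is reproduced here because it was originally posted in a comment, by the user "Perry")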

SELECT MIN(TableName) as TableName, ID, COL1, COL2, COL3 ... 
    FROM 
    (
    -- NDUPS is how many times each row occurs within its own table
    SELECT 'Table A' as TableName, COUNT(*) NDUPS, A.ID, A.COL1, A.COL2, A.COL3, ... 
    FROM Table1 A GROUP BY ID, COL1, COL2, COL3 ... 
    UNION ALL 
    SELECT 'Table B' as TableName, COUNT(*) NDUPS, B.ID, B.COL1, B.COL2, B.COL3, ... 
    FROM Table2 B 
    GROUP BY ID, COL1, COL2, COL3 ... 
    ) tmp 
    -- A group of size 1 means the row, with that duplicate count, exists in only one table
    GROUP BY NDUPS, ID, COL1, COL2, COL3 ... 
    HAVING COUNT(*) = 1 
    ORDER BY ID 
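
To see why the per-table duplicate count matters, here is a minimal sketch run through Python's built-in sqlite3 module instead of Hive. The two-column tables and their rows are made up for illustration. The row (1, 'x') appears twice in table1 and once in table2. Without NDUPS in the outer GROUP BY, its three copies would form a single group of size 3, and HAVING COUNT(*) = 1 would miss the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col1 TEXT);
    CREATE TABLE table2 (id INTEGER, col1 TEXT);
    -- (1, 'x') is duplicated in table1 only
    INSERT INTO table1 VALUES (1, 'x'), (1, 'x'), (2, 'y');
    INSERT INTO table2 VALUES (1, 'x'), (2, 'y');
""")

# Collapse each table to (row, ndups) first, then compare the collapsed rows.
dup_aware_sql = """
    SELECT MIN(table_name) AS source, ndups, id, col1
    FROM (
        SELECT 'Table A' AS table_name, COUNT(*) AS ndups, id, col1
        FROM table1 GROUP BY id, col1
        UNION ALL
        SELECT 'Table B' AS table_name, COUNT(*) AS ndups, id, col1
        FROM table2 GROUP BY id, col1
    ) AS tmp
    GROUP BY ndups, id, col1
    HAVING COUNT(*) = 1
    ORDER BY id, source
"""
mismatched = conn.execute(dup_aware_sql).fetchall()
for row in mismatched:
    print(row)
```

The row (2, 'y') occurs once in each table, so it collapses into a group of size 2 and is filtered out. Only the two differing duplicate counts for (1, 'x') are reported.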
+0

Would you mind posting a summary for each link? That way the information isn't lost if a link breaks. – fxm