2016-11-02 150 views
2

Comparing two sets of data

Apologies if this has already been answered somewhere. I have searched and could not figure it out.

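I need to find a way in PostgreSQL to compare data from one week to the next. All the data lives in the same table, which has a week number column. The data won't always overlap completely, but I need to compare the data within groups.

Say these are the data sets: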

Week 2
+-------+-----+-------+---------+--------+
| group | num | color | ID      | week # |
+-------+-----+-------+---------+--------+
| a     | 1   | red   | a1red   | 2      |
| a     | 2   | blue  | a2blue  | 2      |
| b     | 3   | blue  | b3blue  | 2      |
| c     | 7   | black | c7black | 2      |
| d     | 8   | black | d8black | 2      |
| d     | 9   | red   | d9red   | 2      |
| d     | 10  | gray  | d10gray | 2      |
+-------+-----+-------+---------+--------+

Week 3
+-------+-----+-------+----------+--------+
| group | num | color | ID       | week # |
+-------+-----+-------+----------+--------+
| a     | 1   | red   | a1red    | 3      |
| a     | 2   | green | a2green  | 3      |
| b     | 3   | blue  | b3blue   | 3      |
| b     | 5   | green | b5green  | 3      |
| c     | 7   | black | c7black  | 3      |
| e     | 11  | blue  | d11blue  | 3      |
| e     | 12  | other | d12other | 3      |
| e     | 14  | brown | d14brown | 3      |
+-------+-----+-------+----------+--------+

Each row has an ID made up of its group, number, and color values.

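I need the query to grab all groups from week 3 and then, for any week 3 group that also exists in week 2:

  1. Flag IDs that have changed within the group, as in group a
  2. Flag whether any IDs were added to or removed from the group, as in group b

One feature that would be nice, but is not required, is to compare week 3 against week 1 for groups that do not exist in week 2.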

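I have thought about splitting the two weeks apart and using INTERSECT/EXCEPT to get the results, but I can't quite wrap my head around how to make that work correctly. Any tips would be much appreciated.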

+1

Where is the week number hidden in that table? –

+0

It's just another column. I updated the tables to clarify that. – Jakewb89

Answers

0

For just two (known) weeks, you can do something like this:

select coalesce(w1.group_nr, w2.group_nr) as group_nr,
       coalesce(w1.num, w2.num) as num,
       -- overall status of the row across the two weeks
       case
         when w1.group_nr is null then 'missing in first week'
         when w2.group_nr is null then 'missing in second week'
         when (w1.color, w1.id) is distinct from (w2.color, w2.id) then 'data has changed'
         else 'no change'
       end as status,
       -- per-column flags, only set when the row exists in both weeks
       case
         when w1.group_nr is not null
          and w2.group_nr is not null
          and w1.color is distinct from w2.color then 'color is different'
       end as color_change,
       case
         when w1.group_nr is not null
          and w2.group_nr is not null
          and w1.id is distinct from w2.id then 'id is different'
       end as id_change
from (
    -- "first week" of the comparison
    select group_nr, num, color, id
    from data
    where week = 2
) as w1
    full outer join (
    -- "second week" of the comparison
    select group_nr, num, color, id
    from data
    where week = 3
) w2 on (w1.group_nr, w1.num) = (w2.group_nr, w2.num);
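
For anyone who wants to try the query above, here is a minimal setup sketch. The table name `data` and the column names are taken from the query; the column types are my assumption, and the rows are copied as-is from the question's sample tables.

```sql
-- Assumed schema: names follow the query above, types are a guess.
create table data (
    group_nr text,
    num      integer,
    color    text,
    id       text,
    week     integer
);

-- Sample rows copied from the question (week 2, then week 3).
insert into data (group_nr, num, color, id, week) values
    ('a',  1, 'red',   'a1red',    2),
    ('a',  2, 'blue',  'a2blue',   2),
    ('b',  3, 'blue',  'b3blue',   2),
    ('c',  7, 'black', 'c7black',  2),
    ('d',  8, 'black', 'd8black',  2),
    ('d',  9, 'red',   'd9red',    2),
    ('d', 10, 'gray',  'd10gray',  2),
    ('a',  1, 'red',   'a1red',    3),
    ('a',  2, 'green', 'a2green',  3),
    ('b',  3, 'blue',  'b3blue',   3),
    ('b',  5, 'green', 'b5green',  3),
    ('c',  7, 'black', 'c7black',  3),
    ('e', 11, 'blue',  'd11blue',  3),
    ('e', 12, 'other', 'd12other', 3),
    ('e', 14, 'brown', 'd14brown', 3);
```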

Getting the attributes that have changed is a bit clumsy. If you can live with a textual representation, you can use the hstore extension to show the differences:

-- requires the hstore extension: create extension if not exists hstore;
select coalesce(w1.group_nr, w2.group_nr) as group_nr,
       coalesce(w1.num, w2.num) as num,
       case
         when w1.group_nr is null then 'missing in first week'
         when w2.group_nr is null then 'missing in second week'
         when (w1.color, w1.id) is distinct from (w2.color, w2.id) then 'data has changed'
         else 'no change'
       end as status,
       -- key/value pairs of the second week that differ from the first week
       w2.attributes - w1.attributes as changed_attributes
from (
    -- turn each row into an hstore and drop the week key, which always differs
    select group_nr, num, color, id, hstore(data) - 'week'::text as attributes
    from data
    where week = 2
) as w1
    full outer join (
    select group_nr, num, color, id, hstore(data) - 'week'::text as attributes
    from data
    where week = 3
) w2 on (w1.group_nr, w1.num) = (w2.group_nr, w2.num);
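
The question asks for flags per group rather than per row. One possible way to roll the row-level comparison up to group level is sketched below. This is not part of the original answer: the CTE names (`old_week`, `new_week`, `cmp`) and the output column names are my own, and it reuses the same `data` table and join keys as the queries above.

```sql
-- Sketch: one row per group that exists in both week 2 and week 3.
with old_week as (
    select group_nr, num, id from data where week = 2
),
new_week as (
    select group_nr, num, id from data where week = 3
),
cmp as (
    select coalesce(n.group_nr, o.group_nr) as group_nr,
           o.group_nr is not null           as in_old,
           n.group_nr is not null           as in_new,
           o.id is distinct from n.id       as id_differs
    from old_week o
        full outer join new_week n
            on (o.group_nr, o.num) = (n.group_nr, n.num)
)
select group_nr,
       bool_or(in_old and in_new and id_differs) as ids_changed,
       bool_or(in_new and not in_old)            as rows_added,
       bool_or(in_old and not in_new)            as rows_removed
from cmp
group by group_nr
-- keep only groups that have at least one row in each week
having bool_or(in_old) and bool_or(in_new)
order by group_nr;
```

With the sample data from the question this should return groups a, b, and c only: a with ids_changed set, b with rows_added set, and c with no flags set. Groups d and e are skipped because each appears in only one of the two weeks.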
+0

Ah! Thank you so much. Once you laid out the basic thought process, everything clicked into place, and it is running great now. – Jakewb89