2016-12-06
1

SUM when a JOIN duplicates rows (SQL)

I have the following query result:

SELECT ... ON CIA_factbook_dataset.my_name = World_Bank_dataset.my_name ... 

+----------------+------+-------------+-----------------+---------+--------+
| my_name        | Year | CIA_name    | World_Bank_name | CIA_GDP | WB_GDP |
+----------------+------+-------------+-----------------+---------+--------+
| United Kingdom | 2010 | UK          | United Kingdom  |    2850 |   2800 |
| United Kingdom | 2010 | UK          | Channel Islands |    2850 |     11 |
| Cyprus         | 2010 | CYPRUS TURK | CYPRUS TURK     |      22 |     22 |
| Cyprus         | 2010 | CYPRUS TURK | CYPRUS GRK      |      22 |     33 |
| Cyprus         | 2010 | CYPRUS GRK  | CYPRUS TURK     |      33 |     22 |
| Cyprus         | 2010 | CYPRUS GRK  | CYPRUS GRK      |      33 |     33 |
+----------------+------+-------------+-----------------+---------+--------+

I need to compute the totals per country, but if I simply GROUP BY my_name, Year, the same numbers are summed several times.

The final result should be:

+----------------+------+---------+--------+
| my_name        | Year | CIA_GDP | WB_GDP |
+----------------+------+---------+--------+
| United Kingdom | 2010 |    2850 |   2811 |
| Cyprus         | 2010 |      55 |     55 |
+----------------+------+---------+--------+

Instead of:

+----------------+------+---------+--------+
| my_name        | Year | CIA_GDP | WB_GDP |
+----------------+------+---------+--------+
| United Kingdom | 2010 |    5700 |   2811 |
| Cyprus         | 2010 |     110 |    110 |
+----------------+------+---------+--------+
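
The double counting can be reproduced with a small sketch. It assumes the joined rows shown in the question are loaded into a table named `t`, held in an in-memory SQLite database through Python's `sqlite3` module; that setup is an assumption made for illustration, not part of the original query.

```python
import sqlite3

# Sample rows copied from the joined result shown in the question.
ROWS = [
    ("United Kingdom", 2010, "UK", "United Kingdom", 2850, 2800),
    ("United Kingdom", 2010, "UK", "Channel Islands", 2850, 11),
    ("Cyprus", 2010, "CYPRUS TURK", "CYPRUS TURK", 22, 22),
    ("Cyprus", 2010, "CYPRUS TURK", "CYPRUS GRK", 22, 33),
    ("Cyprus", 2010, "CYPRUS GRK", "CYPRUS TURK", 33, 22),
    ("Cyprus", 2010, "CYPRUS GRK", "CYPRUS GRK", 33, 33),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (my_name TEXT, Year INT, CIA_name TEXT,"
    " World_Bank_name TEXT, CIA_GDP INT, WB_GDP INT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Plain GROUP BY: every joined row contributes, so repeated values add up.
naive = conn.execute(
    """
    SELECT my_name, Year, SUM(CIA_GDP), SUM(WB_GDP)
    FROM t
    GROUP BY my_name, Year
    ORDER BY my_name
    """
).fetchall()

for row in naive:
    print(row)
```

The United Kingdom CIA figure is counted once per matching World Bank row, and each Cyprus figure is counted once per row on the other side of the join.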

How can I achieve this?
Is there a better way than using SUM(DISTINCT CIA_GDP), SUM(DISTINCT WB_GDP)?
(In theory, the GDP of Turkish Cyprus and Greek Cyprus could be the same.)
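
The concern about SUM(DISTINCT ...) can be illustrated with a minimal sketch. The rows below are hypothetical: they are the Cyprus rows from the question with both GDP values set to 22, loaded into an in-memory SQLite table named `t` (an assumed setup, not the original data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (my_name TEXT, Year INT, CIA_name TEXT,"
    " World_Bank_name TEXT, CIA_GDP INT, WB_GDP INT)"
)
# Hypothetical variant of the Cyprus rows: both regions have a GDP of 22.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Cyprus", 2010, "CYPRUS TURK", "CYPRUS TURK", 22, 22),
        ("Cyprus", 2010, "CYPRUS TURK", "CYPRUS GRK", 22, 22),
        ("Cyprus", 2010, "CYPRUS GRK", "CYPRUS TURK", 22, 22),
        ("Cyprus", 2010, "CYPRUS GRK", "CYPRUS GRK", 22, 22),
    ],
)

# DISTINCT deduplicates on the value, not on the region it belongs to.
distinct_sums = conn.execute(
    """
    SELECT my_name, Year, SUM(DISTINCT CIA_GDP), SUM(DISTINCT WB_GDP)
    FROM t
    GROUP BY my_name, Year
    """
).fetchall()
print(distinct_sums)
```

With these rows the two equal GDPs collapse into a single 22 per column, whereas the intended per-country total would be 44.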

+0

Wait, why is it 'MAX()' for 'United Kingdom (CIA_GDP)' and 'SUM()' for 'Cyprus (CIA_GDP)' in your desired output? – Blag

+0

Who is talking about MAX?! –

+0

OK, I get it, my bad, 2 min – Blag

Answers

1

For this I assume that my_name, Year is unique in both tables.

SQL Fiddle

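SELECT t1.my_name, t1.YEAR, SUM_CIA_GDP, SUM_WB_GDP
FROM (
    -- Within each WB_GDP group every CIA row appears once (as long as the
    -- WB_GDP values differ); DISTINCT collapses the identical group sums.
    SELECT DISTINCT my_name, YEAR, SUM(CIA_GDP) AS SUM_CIA_GDP
    FROM t
    GROUP BY my_name, YEAR, WB_GDP
    ) t1
JOIN (
    -- Same idea for WB_GDP, grouping by CIA_GDP.
    SELECT DISTINCT my_name, YEAR, SUM(WB_GDP) AS SUM_WB_GDP
    FROM t
    GROUP BY my_name, YEAR, CIA_GDP
    ) t2
    ON t1.my_name = t2.my_name
     AND t1.YEAR = t2.YEAR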

Results

| my_name        | YEAR | SUM_CIA_GDP | SUM_WB_GDP |
|----------------|------|-------------|------------|
| Cyprus         | 2010 |          55 |         55 |
| United Kingdom | 2010 |        2850 |       2811 |
+1

'SUM' and '('; & ERROR: Column 'my_name' in field list is ambiguous – Blag

+0

Excellent! Only the 'GROUP BY' should not include 'CIA_name/World_Bank_name' –

+2

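By the way, avoid joining on subqueries wherever you can; it is not efficient for the database engine, and it will start to slow down once you have more rows – Blag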
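
Because the query in this answer groups by the GDP values themselves, it shares the concern raised in the question that two regions may have the same GDP. A possible variant, sketched here under the assumption that CIA_name and World_Bank_name each identify one source row per (my_name, Year), deduplicates on those name columns instead. The table `t`, the SQLite setup, and the Cyprus rows with equal GDPs are hypothetical illustration data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (my_name TEXT, Year INT, CIA_name TEXT,"
    " World_Bank_name TEXT, CIA_GDP INT, WB_GDP INT)"
)
# UK rows as in the question; Cyprus rows changed so both regions have GDP 22.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("United Kingdom", 2010, "UK", "United Kingdom", 2850, 2800),
        ("United Kingdom", 2010, "UK", "Channel Islands", 2850, 11),
        ("Cyprus", 2010, "CYPRUS TURK", "CYPRUS TURK", 22, 22),
        ("Cyprus", 2010, "CYPRUS TURK", "CYPRUS GRK", 22, 22),
        ("Cyprus", 2010, "CYPRUS GRK", "CYPRUS TURK", 22, 22),
        ("Cyprus", 2010, "CYPRUS GRK", "CYPRUS GRK", 22, 22),
    ],
)

# Deduplicate each side on its name column first, then sum per country.
totals = conn.execute(
    """
    SELECT c.my_name, c.Year, c.CIA_GDP, w.WB_GDP
    FROM (
        SELECT my_name, Year, SUM(CIA_GDP) AS CIA_GDP
        FROM (SELECT DISTINCT my_name, Year, CIA_name, CIA_GDP FROM t) AS dc
        GROUP BY my_name, Year
    ) AS c
    JOIN (
        SELECT my_name, Year, SUM(WB_GDP) AS WB_GDP
        FROM (SELECT DISTINCT my_name, Year, World_Bank_name, WB_GDP FROM t) AS dw
        GROUP BY my_name, Year
    ) AS w
      ON w.my_name = c.my_name AND w.Year = c.Year
    ORDER BY c.my_name
    """
).fetchall()

for row in totals:
    print(row)
```

This variant still joins two subqueries, so the performance remark in the comment above applies to it as well.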

2

SQL Fiddle

MySQL 5.6 Schema Setup

CREATE TABLE t 
    (`my_name` varchar(14), `Year` int, `CIA_name` varchar(11), `World_Bank_name` varchar(15), `CIA_GDP` int, `WB_GDP` int) 
; 

INSERT INTO t 
    (`my_name`, `Year`, `CIA_name`, `World_Bank_name`, `CIA_GDP`, `WB_GDP`) 
VALUES 
    ('United Kingdom', 2010, 'UK', 'United Kingdom', 2850, 2800), 
    ('United Kingdom', 2010, 'UK', 'Channel Islands', 2850, 11), 
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS TURK', 22, 22), 
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS GRK', 22, 33), 
    ('Cyprus', 2010, 'CYPRUS GRK', 'CYPRUS TURK', 33, 22), 
    ('Cyprus', 2010, 'CYPRUS GRK', 'CYPRUS GRK', 33, 33) 
; 

Query 1

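SELECT my_name, Year, SUM(CIA_GDP), WB_GDP
FROM (
    -- Collapse rows that share a CIA_GDP value and total WB_GDP for each.
    SELECT my_name, Year, CIA_GDP, SUM(WB_GDP) WB_GDP
    FROM t
    GROUP BY my_name, Year, CIA_GDP
) t1
-- Each CIA_GDP now appears once per WB_GDP total, so it is summed once.
GROUP BY my_name, Year, WB_GDP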

Results

| my_name        | Year | SUM(CIA_GDP) | WB_GDP |
|----------------|------|--------------|--------|
| Cyprus         | 2010 |           55 |     55 |
| United Kingdom | 2010 |         2850 |   2811 |
+0

For Cyprus (in this example), I need 'CIA_GDP, WB_GDP' –

+0

@Dani-Br SUM? And how do you know that you have to SUM for Cyprus and not for the United Kingdom? Why? – Blag

+0

Nice! But SUM(WB_GDP) for Cyprus should be 55, not 110. –
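
The duplication discussed in this thread is produced by the join itself, so another option is to total each source per (my_name, Year) before joining. The sketch below is only an illustration: the table names come from the query in the question, but their column layouts and the rows inserted here are assumptions, and SQLite is used through Python's `sqlite3` module rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layouts for the two source tables named in the question.
conn.execute(
    "CREATE TABLE CIA_factbook_dataset"
    " (my_name TEXT, Year INT, CIA_name TEXT, CIA_GDP INT)"
)
conn.execute(
    "CREATE TABLE World_Bank_dataset"
    " (my_name TEXT, Year INT, World_Bank_name TEXT, WB_GDP INT)"
)
conn.executemany(
    "INSERT INTO CIA_factbook_dataset VALUES (?, ?, ?, ?)",
    [
        ("United Kingdom", 2010, "UK", 2850),
        ("Cyprus", 2010, "CYPRUS TURK", 22),
        ("Cyprus", 2010, "CYPRUS GRK", 33),
    ],
)
conn.executemany(
    "INSERT INTO World_Bank_dataset VALUES (?, ?, ?, ?)",
    [
        ("United Kingdom", 2010, "United Kingdom", 2800),
        ("United Kingdom", 2010, "Channel Islands", 11),
        ("Cyprus", 2010, "CYPRUS TURK", 22),
        ("Cyprus", 2010, "CYPRUS GRK", 33),
    ],
)

# Each side is reduced to one row per (my_name, Year) before the join,
# so the join can no longer multiply rows.
pre_aggregated = conn.execute(
    """
    SELECT c.my_name, c.Year, c.CIA_GDP, w.WB_GDP
    FROM (
        SELECT my_name, Year, SUM(CIA_GDP) AS CIA_GDP
        FROM CIA_factbook_dataset
        GROUP BY my_name, Year
    ) AS c
    JOIN (
        SELECT my_name, Year, SUM(WB_GDP) AS WB_GDP
        FROM World_Bank_dataset
        GROUP BY my_name, Year
    ) AS w
      ON w.my_name = c.my_name AND w.Year = c.Year
    ORDER BY c.my_name
    """
).fetchall()

for row in pre_aggregated:
    print(row)
```

Since values are summed per source row rather than per distinct value, regions with identical GDPs are not merged in this approach.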