2013-06-02 42 views
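Optimize multiple self-JOINs or redesign the database?

I am looking for advice on either optimizing multiple self-joins or on a better table/database design.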

One of the tables looks as follows (relevant columns only):

CREATE TABLE IF NOT EXISTS CountryData (
    countryDataID INT PRIMARY KEY AUTO_INCREMENT, 
    dataID INT NOT NULL REFERENCES DataSources (dataID), 
    dataCode VARCHAR(30) NULL, 
    countryID INT NOT NULL REFERENCES Countries (countryID), 
    year INT NOT NULL , 
    data DEC(20,4) NULL, 
    INDEX countryDataYear (dataID, countryID, year)); 

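The data column holds values for a few hundred indicators, 90 countries, and 30 years, for roughly 1 million rows in total. A standard query selects N indicators for a particular year and C countries, yielding a CxN table of at most 90 rows.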
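For orientation, the standard query described above can be sketched in long format, returning one row per (country, indicator) pair rather than one column per indicator. The indicator IDs and the year are taken from the examples further down; the country IDs 1, 2, 3 are placeholders invented for illustration.

```sql
-- Long-format sketch of the standard query: C x N rows, one per
-- (country, indicator) pair. Country IDs 1, 2, 3 are hypothetical.
SELECT countryID, dataID, data
FROM CountryData
WHERE year = 2017
  AND dataID IN (523, 524, 525)
  AND countryID IN (1, 2, 3)
ORDER BY countryID, dataID;
```

This form needs no self-joins at all; the self-joins only come in when each indicator has to appear as its own column.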

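With all values in one column, self-joins seemed like the way to go. So I have experimented with various suggestions to speed them up, including indexing and creating new (temporary) tables. At 9 self-joins, the query takes a little under 1 minute. Beyond that, it spins forever.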

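The new table on which the self-joins run has only about 1000 rows and is indexed on what seem to be the relevant variables. Creating it takes about 0.5 seconds: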

CREATE TABLE Growth 
    SELECT dataID, countryID, year, data 
    FROM CountryData 
    WHERE dataID > 522 AND year = 2017; 

CREATE INDEX growth_ix 
    ON Growth (dataID, countryID); 

The SELECT query then arranges the data into a result table with XX indicator columns, where unfortunately XX < 10:

SELECT 
    Countries.countryName AS Country, 
    em01.em, 
    em02.em, 
    em03.em,
    ...
    emXX.em
FROM  
    (SELECT 
     em1.data AS em, 
     em1.countryID 
    FROM Growth AS em1 
    WHERE 
    em1.dataID = 523) as em01 
    JOIN 
    (SELECT 
     em2.data AS em, 
     em2.countryID 
    FROM Growth AS em2 
    WHERE 
    em2.dataID = 524) as em02 
    USING (countryID) 
    JOIN 
    (SELECT 
     em3.data AS em, 
     em3.countryID 
    FROM Growth AS em3 
    WHERE 
    em3.dataID = 525) as em03 
    USING (countryID) 
    ... 
    JOIN 
    (SELECT 
     emX.data AS em, 
     emX.countryID 
    FROM Growth AS emX
    WHERE 
    emX.dataID = 527) as emXX 
    USING (countryID) 
    JOIN Countries 
    USING (countryID) 
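One possible contributor to the slowdown, assuming an older MySQL release (before 5.6) in which every derived table in the FROM clause is materialized into a temporary table without an index: the joins between those derived tables then cannot use growth_ix. A sketch of the same query with Growth joined directly and the dataID filter moved into each ON clause, shown for three indicators only; the aliases c and em01 to em03 are illustrative, and whether this helps depends on the server version and the actual plan.

```sql
-- Same result shape as the derived-table query above, but each join
-- reads Growth directly, so an index on (dataID, countryID) can be used.
SELECT c.countryName AS Country,
       em01.data AS em01,
       em02.data AS em02,
       em03.data AS em03
FROM Countries AS c
JOIN Growth AS em01
  ON em01.countryID = c.countryID AND em01.dataID = 523
JOIN Growth AS em02
  ON em02.countryID = c.countryID AND em02.dataID = 524
JOIN Growth AS em03
  ON em03.countryID = c.countryID AND em03.dataID = 525;
```

Prefixing either version with EXPLAIN shows whether the index is picked up for each join.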

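In fact I would like to retrieve a few more variables, and possibly join other tables as well. Now I am wondering whether there is a way to run this more efficiently, or whether I should take an altogether different approach, such as using wide tables with the indicators in separate columns to avoid self-joins.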
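To make the wide-table alternative concrete, here is a minimal sketch of what such a design could look like. The table name CountryDataWide and the column names em523 to em525 are invented for illustration and cover only three of the few hundred indicators.

```sql
-- Hypothetical wide layout: one row per (country, year),
-- one column per indicator.
CREATE TABLE IF NOT EXISTS CountryDataWide (
    countryID INT NOT NULL REFERENCES Countries (countryID),
    year INT NOT NULL,
    em523 DEC(20,4) NULL,
    em524 DEC(20,4) NULL,
    em525 DEC(20,4) NULL,
    PRIMARY KEY (countryID, year)
);

-- The standard query then needs only the join to Countries.
SELECT Countries.countryName AS Country, em523, em524, em525
FROM CountryDataWide
JOIN Countries USING (countryID)
WHERE year = 2017;
```

The trade-off is that every new indicator requires an ALTER TABLE, and a few hundred indicators means a few hundred columns, so the long format may still be preferable as the storage layout.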

This might belong on [dba.se].

Answer

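Is dataID unique for a given countryID and year, or can the same dataID appear multiple times with different values? If it is unique, you could try something like this: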

SELECT countryID, year 
    ,MAX(CASE WHEN dataID = 523 THEN data ELSE NULL END) AS em0 
    ,MAX(CASE WHEN dataID = 524 THEN data ELSE NULL END) AS em1 
    ,... 
FROM CountryData 
GROUP BY countryID, year
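A fuller sketch of the same conditional-aggregation idea, adapted to the question's example: it first checks the uniqueness assumption, then restricts the pivot to one year and three indicators and joins Countries for the name. The aliases cd, c, and em01 to em03 are illustrative, and the join condition assumes Countries has the countryID and countryName columns used in the question.

```sql
-- Uniqueness check: any row returned here means MAX() in the pivot
-- would silently pick one of several values.
SELECT dataID, countryID, year, COUNT(*) AS n
FROM CountryData
GROUP BY dataID, countryID, year
HAVING COUNT(*) > 1;

-- Pivot for a single year: one row per country, one column per indicator.
SELECT c.countryName AS Country,
       MAX(CASE WHEN cd.dataID = 523 THEN cd.data END) AS em01,
       MAX(CASE WHEN cd.dataID = 524 THEN cd.data END) AS em02,
       MAX(CASE WHEN cd.dataID = 525 THEN cd.data END) AS em03
FROM CountryData AS cd
JOIN Countries AS c ON c.countryID = cd.countryID
WHERE cd.year = 2017
  AND cd.dataID IN (523, 524, 525)
GROUP BY c.countryID, c.countryName;
```

One behavioral difference from the inner self-joins: a country that lacks one of the indicators still appears here, with NULL in that column, instead of being dropped from the result.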