2014-08-28 60 views
0

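Optimizing a join across many tables in MySQL

I have multiple tables that I am trying to join. I added indexes to the tables to improve speed, but the join still takes a long time. I suspect this is expected, but I would like to know whether there is a more efficient way to join this many tables. I also set net_read_timeout = 150 because I was getting lost-connection errors. My query looks like this: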

set net_read_timeout = 150; 
ALTER TABLE wspeed2 ADD INDEX (speed,roadtypeID); -- repeated for all the tables 

SELECT a.month,a.roadTypeID,a.speed,a.pid, a.or, b.pid, b.or, c.pid, c.or, d.pid, d.or, 
     e.pid, e.or, f.pid, f.or, g.pid, g.or, h.pid, h.or, i.pid, i.or, j.pid, j.or, 
     k.pid, k.or, l.pid, l.or, m.pid, m.or, n.pid, n.or, o.pid, o.or, p.pid, p.or, 
     q.pid, q.or, r.pid, r.or, s.pid, s.or, t.pid, t.or, u.pid, u.or, v.pid, v.or 
FROM wspeed2 a, wspeed3 b, wspeed20 c, wspeed24 d, wspeed25 e, wspeed26 f, wspeed27 g, wspeed63 h, wspeed65 i, wspeed68 j, 
    wspeed69 k, wspeed70 l, wspeed71 m, wspeed72 n, wspeed73 o, wspeed74 p, wspeed75 q, wspeed76 r, wspeed77 s, wspeed78 t, wspeed81 u, wspeed82 v 
WHERE a.speed = b.speed and b.speed = c.speed and c.speed = d.speed and d.speed = e.speed and e.speed = f.speed and f.speed = g.speed and g.speed = h.speed 
    and h.speed = i.speed and i.speed = j.speed and j.speed = k.speed and k.speed = l.speed and l.speed = m.speed and m.speed = n.speed and n.speed = o.speed 
    and o.speed = p.speed and p.speed = q.speed and q.speed = r.speed and r.speed = s.speed and s.speed = t.speed and t.speed = u.speed and u.speed = v.speed 
GROUP BY a.speed; 
+0

I come from a Microsoft background; in general, though, when I have more than 5 tables to join, I use temp tables or other staging tables. – 2014-08-28 18:39:46

+0

This is a major problem with SQL databases. There is no better way to do this that I know of. – ControlAltDel 2014-08-28 18:40:33

Answers

0

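Although the query itself looks simple, if a little strange, here it is with explicit joins. Note that because you have a = b, b = c, c = d and so on, it also follows that a = r, a = s, a = t and so on. So, instead of having each table depend on the alias before it, it MIGHT help the engine to join all the other speed tables directly to the root "a" alias, as shown below. That said, if one or more tables have no record for the corresponding speed in the "a" table, that speed will not appear in the result set. If you want ALL rows from "a" regardless of whether there is a match in the other tables, change the joins to LEFT JOIN.

Now, looking at your "a" table, your data is based on road type and speed per month. Is the speed column unique? I would think it is, but I am not positive. If any of the joined tables has more than one record for the same speed value, you will get a Cartesian result, and that can choke your query.

Also, you have a GROUP BY but no aggregate columns such as SUM(something), COUNT(), AVG(), MIN() or MAX(), so what is the point of the GROUP BY? At times you may just want the results ordered (preferably by something that has an index on the "a" table).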
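One way to answer the uniqueness question is to count the rows per speed value in each table. This is a minimal sketch using wspeed2 from the question (repeat it for the other tables); the alias rows_per_speed is only illustrative:

```sql
-- Lists every speed value that occurs more than once in wspeed2.
-- An empty result means speed is unique in this table.
SELECT speed, COUNT(*) AS rows_per_speed
FROM wspeed2
GROUP BY speed
HAVING COUNT(*) > 1
ORDER BY rows_per_speed DESC;
```

If this returns rows for several tables, a join on speed alone multiplies those counts together for each shared speed value.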

SELECT 
     a.month, a.roadTypeID, a.speed, 
     a.pid, a.or, b.pid, b.or, c.pid, c.or, d.pid, d.or, 
     e.pid, e.or, f.pid, f.or, g.pid, g.or, h.pid, h.or, 
     i.pid, i.or, j.pid, j.or, k.pid, k.or, l.pid, l.or, 
     m.pid, m.or, n.pid, n.or, o.pid, o.or, p.pid, p.or, 
     q.pid, q.or, r.pid, r.or, s.pid, s.or, t.pid, t.or, 
     u.pid, u.or, v.pid, v.or 
    FROM 
     wspeed2 a 
     JOIN wspeed3 b on a.speed = b.speed 
     JOIN wspeed20 c on a.speed = c.speed 
     JOIN wspeed24 d on a.speed = d.speed 
     JOIN wspeed25 e on a.speed = e.speed 
     JOIN wspeed26 f on a.speed = f.speed 
     JOIN wspeed27 g on a.speed = g.speed 
     JOIN wspeed63 h on a.speed = h.speed 
     JOIN wspeed65 i on a.speed = i.speed 
     JOIN wspeed68 j on a.speed = j.speed 
     JOIN wspeed69 k on a.speed = k.speed 
     JOIN wspeed70 l on a.speed = l.speed 
     JOIN wspeed71 m on a.speed = m.speed 
     JOIN wspeed72 n on a.speed = n.speed 
     JOIN wspeed73 o on a.speed = o.speed 
     JOIN wspeed74 p on a.speed = p.speed 
     JOIN wspeed75 q on a.speed = q.speed 
     JOIN wspeed76 r on a.speed = r.speed 
     JOIN wspeed77 s on a.speed = s.speed 
     JOIN wspeed78 t on a.speed = t.speed 
     JOIN wspeed81 u on a.speed = u.speed 
     JOIN wspeed82 v on a.speed = v.speed 

If that still does not help, adding the MySQL keyword STRAIGHT_JOIN might, as in:

SELECT STRAIGHT_JOIN [rest of query]
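As a rough illustration, here is the keyword applied to just the first two tables of the query. STRAIGHT_JOIN asks MySQL to read the tables in the order they are listed in the FROM clause instead of letting the optimizer reorder them, so whether it helps depends on whether that order is a good one for your data:

```sql
-- Shortened to two tables for readability; the full query would list all 22.
SELECT STRAIGHT_JOIN
     a.month, a.roadTypeID, a.speed,
     a.pid, a.or, b.pid, b.or
    FROM
     wspeed2 a
     JOIN wspeed3 b ON a.speed = b.speed;
```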

+0

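@Daniel, glad it seems to have worked for you. Would you mind letting me know how much the query's performance improved? It could benefit others who have real data and a similar number of joins. It would be good to know the before and after query times. – DRapp 2014-08-29 02:06:54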

+0

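Yes, no problem. After looking at the query structure and trying it out, I decided to also include all of the PK columns that match across the tables. In this case the PK consists of monthid, roadtype and speed. With the joins written this way, the query time went from 5 minutes to one second. – 2014-08-29 15:49:54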
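The comment above does not show the final query, so the following is only a sketch of what joining on the full key might look like. It assumes the key columns are named month, roadTypeID and speed (the names used in the question's SELECT list; the comment calls them monthid and roadtype, so the real names may differ) and that every table has all three:

```sql
-- Assumed column names: month, roadTypeID, speed. Shortened to three tables.
SELECT
     a.month, a.roadTypeID, a.speed,
     a.pid, a.or, b.pid, b.or, c.pid, c.or
    FROM
     wspeed2 a
     JOIN wspeed3 b
       ON a.month = b.month AND a.roadTypeID = b.roadTypeID AND a.speed = b.speed
     JOIN wspeed20 c
       ON a.month = c.month AND a.roadTypeID = c.roadTypeID AND a.speed = c.speed;
```

Joining on the complete key means each row of "a" matches at most one row in each other table, which removes the row multiplication that a join on speed alone can cause.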

0

Using inner and left/right joins will give you better performance. Try rewriting the query this way:

select ... from t1 
inner join t2 on t1.pk = t2.fk 
left join t3 on t1.pk = t3.fk 
+1

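I don't see why MySQL wouldn't optimize implicit and explicit joins to produce the same execution path. Why do you think explicit joins would give better performance? – 2014-08-28 19:08:02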

+0

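Suppose 3 tables each have 100 rows. With the "FROM t1, t2" approach, it will first create 1,000,000 rows and then filter them. But with "FROM t1 JOIN t2 ON ..", it will only create the matching rows. – 2014-08-28 19:16:28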

+1

@Biswajit, are you sure? That is a very basic optimization. I find it hard to believe MySQL doesn't know about it. – 2014-08-28 19:21:55
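One way to settle this on your own server is to compare the plans MySQL reports for the two spellings. This is a sketch using two of the tables from the question; run both statements and compare the join order, access type and row estimates in the output:

```sql
-- Implicit (comma) join with the predicate in WHERE.
EXPLAIN
SELECT a.speed, a.pid, b.pid
FROM wspeed2 a, wspeed3 b
WHERE a.speed = b.speed;

-- Explicit JOIN with the same predicate in ON.
EXPLAIN
SELECT a.speed, a.pid, b.pid
FROM wspeed2 a
JOIN wspeed3 b ON a.speed = b.speed;
```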

0

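If the speed column is not unique in each of these tables (and it probably is not, since you say you added an index with speed as the leading column)...

If there are multiple rows with the same value of speed in these tables, then your query may be creating a very large intermediate set.

Let's do some simple math. If each table has two rows with the same speed value, the JOIN between a and b creates 4 rows for that speed. When we add the join to c, with another two rows, that makes 8 rows in total. By the time we have joined all 22 tables, each with two rows, we are at 2^22, or over 4 million rows. All of those rows with the same value of speed then have to be processed by the GROUP BY operation to eliminate the duplicates.

(Of course, if any one of the tables has no row with that speed value, the query produces zero rows for that speed.)

Personally, I would ditch the old-school comma syntax for the join operation and use the JOIN keyword instead. I would move the join predicates from the WHERE clause into the appropriate ON clauses.

I would also make one of the tables the "driver" for all of the joins, and reference that same table in every join predicate. (We know that if a=b and b=c, then a=c. But I am not sure whether it makes any difference to the MySQL optimizer if we specify a=b and a=c in place of a=b and b=c.)

If each table has a relatively small number of distinct speed values but many rows with the same value, I would consider using inline views to get one row per speed value from each table. MySQL can use a suitable index to optimize the GROUP BY operation on each individual table. I would opt for covering indexes, for example: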
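To see how large the intermediate set gets on real data, one option is to multiply the per-table row counts for each speed value. This is a sketch limited to three of the tables; the aliases cnt and joined_rows are made up for illustration, and the pattern would need to be extended to all 22 tables for the full picture:

```sql
-- For each speed present in all three tables, the number of rows that a
-- join on speed alone produces is the product of the per-table counts.
SELECT a.speed, a.cnt * b.cnt * c.cnt AS joined_rows
FROM (SELECT speed, COUNT(*) AS cnt FROM wspeed2  GROUP BY speed) a
JOIN (SELECT speed, COUNT(*) AS cnt FROM wspeed3  GROUP BY speed) b ON b.speed = a.speed
JOIN (SELECT speed, COUNT(*) AS cnt FROM wspeed20 GROUP BY speed) c ON c.speed = a.speed
ORDER BY joined_rows DESC;
```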

ALTER TABLE wspeed20 ADD INDEX (speed, pid, `or`); 
ALTER TABLE wspeed24 ADD INDEX (speed, pid, `or`); 

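Unfortunately, derived tables (the results of the inline view queries) are not indexed, so the JOIN operations may be expensive (if each inline view query returns many rows).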

SELECT a.month,a.roadTypeID,a.speed,a.pid,a.or, b.pid, b.or, c.pid, c.or, d.pid, d.or, 
    e.pid, e.or, f.pid, f.or, g.pid, g.or, h.pid, h.or, i.pid, i.or, j.pid, j.or, 
    k.pid, k.or, l.pid, l.or, m.pid, m.or, n.pid, n.or, o.pid, o.or, p.pid, p.or, 
    q.pid, q.or, r.pid, r.or, s.pid, s.or, t.pid, t.or, u.pid, u.or, v.pid, v.or 

    FROM (SELECT month, roadTypeID, speed, pid, `or` FROM wspeed2 GROUP BY speed) a 
    JOIN (SELECT speed, pid, `or` FROM wspeed3 GROUP BY speed) b ON b.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed20 GROUP BY speed) c ON c.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed24 GROUP BY speed) d ON d.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed25 GROUP BY speed) e ON e.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed26 GROUP BY speed) f ON f.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed27 GROUP BY speed) g ON g.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed63 GROUP BY speed) h ON h.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed65 GROUP BY speed) i ON i.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed68 GROUP BY speed) j ON j.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed69 GROUP BY speed) k ON k.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed70 GROUP BY speed) l ON l.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed71 GROUP BY speed) m ON m.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed72 GROUP BY speed) n ON n.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed73 GROUP BY speed) o ON o.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed74 GROUP BY speed) p ON p.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed75 GROUP BY speed) q ON q.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed76 GROUP BY speed) r ON r.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed77 GROUP BY speed) s ON s.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed78 GROUP BY speed) t ON t.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed81 GROUP BY speed) u ON u.speed = a.speed 
    JOIN (SELECT speed, pid, `or` FROM wspeed82 GROUP BY speed) v ON v.speed = a.speed 

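This has the potential to reduce the number of rows that need to be joined (again, if there are lots of duplicate values of speed and only a small number of distinct values). But again, the JOIN operations between the derived tables will not have any indexes available. (At least, not in MySQL versions up to 5.6.)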
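If the missing indexes on the derived tables turn out to be the bottleneck, one possible workaround, in the spirit of the temp-table suggestion in the comments on the question, is to materialize each one-row-per-speed set into an indexed temporary table first. This is only a sketch: the names tmp_wspeed2 and tmp_wspeed3 are hypothetical, and it is shown for two tables only.

```sql
-- Hypothetical names: tmp_wspeed2 and tmp_wspeed3 are not in the original post.
-- As with the inline views above, selecting pid and `or` while grouping only
-- by speed relies on ONLY_FULL_GROUP_BY being disabled.
CREATE TEMPORARY TABLE tmp_wspeed2 (INDEX (speed))
  SELECT speed, pid, `or` FROM wspeed2 GROUP BY speed;
CREATE TEMPORARY TABLE tmp_wspeed3 (INDEX (speed))
  SELECT speed, pid, `or` FROM wspeed3 GROUP BY speed;

-- The join can now use the index on speed in each temporary table.
SELECT a.speed, a.pid, a.or, b.pid, b.or
FROM tmp_wspeed2 a
JOIN tmp_wspeed3 b ON b.speed = a.speed;
```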