What is the fastest way to join several tables on matching column values in MySQL?
I have 3 tables that look like this:
CREATE TABLE big_table_1 (
id INT(11),
col1 TINYINT(1),
col2 TINYINT(1),
col3 TINYINT(1),
PRIMARY KEY (`id`)
)
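and likewise for big_table_2 and big_table_3. The values of col1, col2, and col3 are 0, 1, or NULL.
I am looking for the ids whose col1 value equals 1 in every table. I join the tables as follows, using the simplest method I could think of: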
SELECT t1.id
FROM big_table_1 AS t1
INNER JOIN big_table_2 AS t2 ON t2.id = t1.id
INNER JOIN big_table_3 AS t3 ON t3.id = t1.id
WHERE t1.col1 = 1
AND t2.col1 = 1
AND t3.col1 = 1;
With about 10 million rows per table, the query takes about 40 seconds to execute on my machine:
407231 rows in set (37.19 sec)
EXPLAIN output:
+----+-------------+-------+--------+---------------+---------+---------+--------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+--------------+----------+-------------+
| 1 | SIMPLE | t3 | ALL | PRIMARY | NULL | NULL | NULL | 10999387 | Using where |
| 1 | SIMPLE | t1 | eq_ref | PRIMARY | PRIMARY | 4 | testDB.t3.id | 1 | Using where |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | testDB.t3.id | 1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+--------------+----------+-------------+
If I declare an index on col1, the result is slightly slower:
407231 rows in set (40.84 sec)
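For reference, an index on col1 could be declared along these lines. This is only a sketch: the index name idx_col1 is hypothetical, and the exact statement behind the timing above is not shown.

```sql
-- Hypothetical index name; one single-column index per table.
-- Index names only need to be unique within a table in MySQL.
CREATE INDEX idx_col1 ON big_table_1 (col1);
CREATE INDEX idx_col1 ON big_table_2 (col1);
CREATE INDEX idx_col1 ON big_table_3 (col1);
```

Since col1 only holds 0, 1, or NULL, such an index has very low selectivity, which is probably why it does not help here.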
I have also tried the following query:
SELECT t1.id
FROM (SELECT distinct ta1.id FROM big_table_1 ta1 WHERE ta1.col1=1) as t1
WHERE EXISTS (SELECT ta2.id FROM big_table_2 ta2 WHERE ta2.col1=1 AND ta2.id = t1.id)
AND EXISTS (SELECT ta3.id FROM big_table_3 ta3 WHERE ta3.col1=1 AND ta3.id = t1.id);
But it is slower:
407231 rows in set (44.01 sec) [with index on col1]
407231 rows in set (1 min 36.52 sec) [without index on col1]
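Is the simple approach above basically the fastest way to do this in MySQL? Would it be necessary to split the tables across multiple servers to get results faster?
Addendum: as requested, here is the EXPLAIN output for Andrew's code (I trimmed the tables down to only 1 million rows, and the index is on id and col1):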
+----+-------------+-------------+-------+---------------+---------+---------+------+---------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+---------+---------+------+---------+--------------------------------+
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 332814 | |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 333237 | Using where; Using join buffer |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 333505 | Using where; Using join buffer |
| 4 | DERIVED | big_table_3 | index | NULL | PRIMARY | 5 | NULL | 1000932 | Using where; Using index |
| 3 | DERIVED | big_table_2 | index | NULL | PRIMARY | 5 | NULL | 1000507 | Using where; Using index |
| 2 | DERIVED | big_table_1 | index | NULL | PRIMARY | 5 | NULL | 1000932 | Using where; Using index |
+----+-------------+-------------+-------+---------------+---------+---------+------+---------+--------------------------------+
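Andrew's code is not reproduced in this post. Purely as an illustration, the plan shape above (three DERIVED rows feeding three <derivedN> rows, joined with "Using where") is consistent with a query that filters each table in its own derived table before joining. The query below is an assumption, not his actual code, and the plan it produces will depend on the MySQL version.

```sql
-- Assumed shape only: each table is reduced to its col1 = 1 ids first,
-- then the three derived tables are matched on id.
SELECT t1.id
FROM (SELECT id FROM big_table_1 WHERE col1 = 1) AS t1,
     (SELECT id FROM big_table_2 WHERE col1 = 1) AS t2,
     (SELECT id FROM big_table_3 WHERE col1 = 1) AS t3
WHERE t2.id = t1.id
  AND t3.id = t1.id;
```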
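It turns out the covering index is 4 times faster, nice. It would be a problem if I had more columns and kept selecting different ones. Converting to joins is about as fast as using 'where'. –
Thx for letting us know – AsConfused
If you can live with slightly slower inserts or updates (on the colN columns), and this query is needed often enough to be worth the slowdown on "new data"... then a few covering indexes, each on a very narrow (in size) column combination that holds all the needed information in the index itself, will save the engine from having to visit the data pages. If needed, if the MySQL CBO has a brain fart, use 'force index' – AsConfused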
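A minimal sketch of what that suggestion could look like for the original query. The index name idx_col1_id and the column order (col1, id) are assumptions, not something stated in the comments; the idea is simply that the index contains both columns the query reads.

```sql
-- Hypothetical index name; (col1, id) covers everything the query touches.
CREATE INDEX idx_col1_id ON big_table_1 (col1, id);
CREATE INDEX idx_col1_id ON big_table_2 (col1, id);
CREATE INDEX idx_col1_id ON big_table_3 (col1, id);

-- FORCE INDEX is only needed if the optimizer picks a worse plan on its own.
SELECT t1.id
FROM big_table_1 AS t1 FORCE INDEX (idx_col1_id)
INNER JOIN big_table_2 AS t2 FORCE INDEX (idx_col1_id) ON t2.id = t1.id
INNER JOIN big_table_3 AS t3 FORCE INDEX (idx_col1_id) ON t3.id = t1.id
WHERE t1.col1 = 1
  AND t2.col1 = 1
  AND t3.col1 = 1;
```

Running EXPLAIN on the query and checking for "Using index" in the Extra column is one way to confirm the index is actually being used as a covering index.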