2011-09-09 31 views
1

Need a faster alternative to FIND_IN_SET when joining tables

I want to run a SELECT that returns every table1 id found in a comma-separated column of table2.

table1:
+----------+
| id       |
+----------+
| 12345678 |
| 12322222 |
+----------+

table2:
+----------------------------------------+
| manyids                                |
+----------------------------------------+
| 12345678,1111111,2222233,0000111,65321 |
| 2222233,12322222                       |
| 12322222                               |
+----------------------------------------+

This works fine on a smaller test table:

SELECT table1.id, 
COUNT(table1.id) AS occurences 
FROM table1 JOIN table2 ON FIND_IN_SET(table1.id, table2.manyids) > 0 
GROUP BY table1.id HAVING occurences > 0 
ORDER BY occurences DESC 

But the real table1 I want to run the SELECT against has more than 500,000 rows, and FIND_IN_SET is far too slow. Are there any alternatives?

+3

Normalize your data. – ajreal

+0

How many rows are there in table2? –

+1

@amoult, never store CSV in a database table. It is one of the worst antipatterns. – Johan

Answers

6

The only sensible alternative is to normalize the tables:

CREATE TABLE tag (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(40)
);

CREATE TABLE article (
  id      INT AUTO_INCREMENT PRIMARY KEY,
  title   VARCHAR(1000),
  content TEXT
);

-- Link table: one row per (article, tag) pair.
CREATE TABLE tag_link (
  article_id INT NOT NULL,
  tag_id     INT NOT NULL,
  PRIMARY KEY (article_id, tag_id),
  KEY (tag_id),
  FOREIGN KEY (article_id) REFERENCES article (id),
  FOREIGN KEY (tag_id) REFERENCES tag (id)
);

Because all of the key fields are indexed, queries are easy to write and very fast, like this:

SELECT t.name FROM article AS a 
INNER JOIN tag_link tl ON (tl.article_id = a.id) 
INNER JOIN tag t ON (t.id = tl.tag_id) 
WHERE a.id = '45785' 
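
Applied to the tables in the question, the same idea might look like the sketch below. The link table table2_ids, its row_id column (standing in for whatever key identifies a table2 row) and the index name are assumptions, not part of the original schema. The INT type also assumes the ids carry no significant leading zeros; a value such as 0000111 would need a character type to survive unchanged.

```sql
-- Hypothetical link table: one row per id that used to sit in table2.manyids.
-- row_id is an assumed key identifying the original table2 row.
CREATE TABLE table2_ids (
  row_id INT UNSIGNED NOT NULL,
  id     INT UNSIGNED NOT NULL,
  PRIMARY KEY (row_id, id),
  KEY idx_table2_ids_id (id)   -- lets the join below look rows up by id
);

-- Sample data equivalent to the three manyids rows in the question
-- (0000111 becomes 111 because the column is an INT in this sketch).
INSERT INTO table2_ids (row_id, id) VALUES
  (1, 12345678), (1, 1111111), (1, 2222233), (1, 111), (1, 65321),
  (2, 2222233), (2, 12322222),
  (3, 12322222);

-- Same counts as the FIND_IN_SET query, but as an indexed equality join.
SELECT t1.id, COUNT(*) AS occurences
FROM table1 AS t1
JOIN table2_ids AS l ON l.id = t1.id
GROUP BY t1.id
ORDER BY occurences DESC;
```

With the sample rows from the question this returns 12322222 with 2 occurrences and 12345678 with 1. The HAVING occurences > 0 clause is dropped because an inner join never produces a group with zero matches.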

Option 2: a bad idea, much worse than option 1
If you really cannot change the schema, create a fulltext index on the field manyids.
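
Creating the index could look like the sketch below. The index name ft_manyids is made up for illustration, and the sketch assumes table2 uses the MyISAM engine, the only engine with FULLTEXT support in MySQL 5.5. The second statement is just a sanity check against one literal id from the sample data.

```sql
-- Assumed index name; FULLTEXT in MySQL 5.5 requires a MyISAM table.
ALTER TABLE table2 ADD FULLTEXT INDEX ft_manyids (manyids);

-- Sanity check with a literal id: counts the rows whose list contains it.
SELECT COUNT(*) AS occurences
FROM table2
WHERE MATCH(manyids) AGAINST ('+12345678' IN BOOLEAN MODE);
```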

And change the query to:

SELECT table1.id, 
COUNT(table1.id) AS occurences 
FROM table1 
JOIN table2 ON MATCH(table2.manyids) 
       AGAINST (CONCAT("+'",table1.id,"'") IN BOOLEAN MODE) 
/*boolean mode is required*/ 
GROUP BY table1.id HAVING occurences > 0 
ORDER BY occurences DESC 

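An id that happens to be in the stopword list will not match. Note that the default stopword list contains no numbers. Also note that the MySQL manual requires the AGAINST search string to be constant during query evaluation, so the server may reject the per-row CONCAT(...) above with an "Incorrect arguments to AGAINST" error; if that happens, the match has to be run with a literal id instead.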

Links:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-stopwords.html
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html

Note that you will need to adjust the minimum and maximum word length settings for the fulltext index; see: http://dev.mysql.com/doc/refman/5.5/en/fulltext-fine-tuning.html
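
For reference, the settings in question can be inspected from SQL. This sketch assumes a MyISAM table on MySQL 5.5, where ft_min_word_len defaults to 4, so ids shorter than four characters would not be indexed. The variables themselves are changed in the server configuration (followed by a restart), not by these statements.

```sql
-- Show the current fulltext settings, including ft_min_word_len and ft_max_word_len.
SHOW VARIABLES LIKE 'ft_%';

-- After changing ft_min_word_len / ft_max_word_len and restarting the server,
-- rebuild the FULLTEXT index so the new limits take effect.
REPAIR TABLE table2 QUICK;
```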