2013-07-26 62 views

Optimizing a query to avoid "Using filesort"

I have a relatively complex query. Here is the fiddle: http://sqlfiddle.com/#!2/65c66/12/0

SELECT p.title AS title_1, 
     p2.title AS title_2, 
     COUNT(DISTINCT s.signature_id) AS num_signers, 
     group_concat(DISTINCT s.signature_id separator ' ') AS signers 
FROM wtp_data_petitions p 
JOIN wtp_data_petitions p2 ON (p.serial > p2.serial) 
JOIN wtp_data_signatures s 
GROUP BY s.signature_id 
HAVING sum(s.petition_id=p.id) 
AND sum(s.petition_id=p2.id); 

Here is the EXPLAIN output (the row counts shown are from my actual data set, not the sqlfiddle):

+----+-------------+-------+-------+---------------+--------------+---------+------+----------+---------------------------------+ 
| id | select_type | table | type | possible_keys | key   | key_len | ref | rows  | Extra       | 
+----+-------------+-------+-------+---------------+--------------+---------+------+----------+---------------------------------+ 
| 1 | SIMPLE  | p  | ALL | PRIMARY  | NULL   | NULL | NULL |  1727 | Using temporary; Using filesort | 
| 1 | SIMPLE  | p2 | ALL | PRIMARY  | NULL   | NULL | NULL |  1727 | Using where; Using join buffer | 
| 1 | SIMPLE  | s  | index | NULL   | signature_id | 105  | NULL | 12943894 | Using index; Using join buffer | 
+----+-------------+-------+-------+---------------+--------------+---------+------+----------+---------------------------------+ 

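As it stands, the query uses a filesort, and I don't have enough disk space for it to finish before it errors out. What optimizations can I make so that it runs faster or more efficiently?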

Thanks!

Answers

1

Yes. One thing you can do is move the join conditions into the ON clause:

SELECT p.title AS title_1, 
     p2.title AS title_2, 
     COUNT(DISTINCT s.signature_id) AS num_signers, 
     group_concat(DISTINCT s.signature_id separator ' ') AS signers 
FROM wtp_data_petitions p 
JOIN wtp_data_petitions p2 ON (p.serial > p2.serial) 
JOIN wtp_data_signatures s on s.petition_id=p.id or s.petition_id=p2.id 
GROUP BY s.signature_id; 

I also think the GROUP BY should be on p.title, p2.title:

SELECT p.title AS title_1, 
     p2.title AS title_2, 
     COUNT(DISTINCT s.signature_id) AS num_signers, 
     group_concat(DISTINCT s.signature_id separator ' ') AS signers 
FROM wtp_data_petitions p 
JOIN wtp_data_petitions p2 ON (p.serial > p2.serial) 
JOIN wtp_data_signatures s on s.petition_id=p.id or s.petition_id=p2.id 
GROUP BY p.title, p2.title; 

But why are you doing the second join? I'm not sure what the query is supposed to do.

Edit:

I think this is the basic query you want:

select s1.petition_id, s2.petition_id, count(*) as numsignatures, 
     group_concat(s1.signature_id) as signatures 
from wtp_data_signatures s1 join 
    wtp_data_signatures s2 
    on s1.signature_id = s2.signature_id and 
     s1.petition_id < s2.petition_id 
group by s1.petition_id, s2.petition_id; 

You can now extend it to include the petition information:

select p1.title as title_1, p2.title as title_2, 
     s1.petition_id, s2.petition_id, count(*) as numsignatures, 
     group_concat(s1.signature_id) as signatures 
from wtp_data_signatures s1 join 
    wtp_data_signatures s2 
    on s1.signature_id = s2.signature_id and 
     s1.petition_id < s2.petition_id join 
    wtp_data_petitions p1 
    on p1.id = s1.petition_id join 
    wtp_data_petitions p2 
    ON p2.id = s2.petition_id 
group by s1.petition_id, s2.petition_id; 
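
To see what the self-join on the signatures table produces, here is a minimal sketch that runs the same shape of query on a few invented rows. It is written in Python with SQLite standing in for MySQL, so the column types, the sample data, and the two-argument `GROUP_CONCAT(expr, ' ')` form are assumptions for illustration only; in MySQL the separator would be written `GROUP_CONCAT(expr SEPARATOR ' ')`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wtp_data_petitions (id TEXT PRIMARY KEY, serial INTEGER, title TEXT);
CREATE TABLE wtp_data_signatures (signature_id TEXT, petition_id TEXT);

-- Invented sample rows: s1 signed a and b, s2 signed a, b and c, s3 signed c.
INSERT INTO wtp_data_petitions VALUES
  ('a', 1, 'Petition A'), ('b', 2, 'Petition B'), ('c', 3, 'Petition C');
INSERT INTO wtp_data_signatures VALUES
  ('s1', 'a'), ('s1', 'b'),
  ('s2', 'a'), ('s2', 'b'), ('s2', 'c'),
  ('s3', 'c');
""")

# Pair each signature row with the same signer's rows on a "later" petition,
# then group by the petition pair. No cross join over all petition pairs.
query = """
SELECT p1.title AS title_1, p2.title AS title_2,
       COUNT(*) AS numsignatures,
       GROUP_CONCAT(s1.signature_id, ' ') AS signatures
FROM wtp_data_signatures s1
JOIN wtp_data_signatures s2
  ON s1.signature_id = s2.signature_id
 AND s1.petition_id < s2.petition_id
JOIN wtp_data_petitions p1 ON p1.id = s1.petition_id
JOIN wtp_data_petitions p2 ON p2.id = s2.petition_id
GROUP BY s1.petition_id, s2.petition_id
ORDER BY s1.petition_id, s2.petition_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only petition pairs that share at least one signer appear in the result, which is why the intermediate result stays far smaller than joining every pair of petitions against every signature.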

It should return every unique combination of two petition IDs, the number of signature_ids that signed both, and a space-separated list of those signature_ids.

0

Do you have an index on serial? The self-join on p.serial > p2.serial looks like the only reason wtp_data_petitions would need to be sorted. Try adding an index.
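
As a rough sketch of adding that index: the snippet below uses Python with SQLite as a stand-in for MySQL, and the table definition and the index name `idx_petitions_serial` are made up for illustration. The `CREATE INDEX ... ON wtp_data_petitions (serial)` statement itself has the same form in MySQL. Whether the optimizer actually uses the index for this join is something to confirm with EXPLAIN on the real data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wtp_data_petitions (id TEXT PRIMARY KEY, serial INTEGER, title TEXT)"
)

# Hypothetical index name; the statement syntax is the same in MySQL.
conn.execute("CREATE INDEX idx_petitions_serial ON wtp_data_petitions (serial)")

# List the indexes on the table to confirm the new one exists.
# Each row of PRAGMA index_list has the index name in its second column.
index_names = [
    row[1] for row in conn.execute("PRAGMA index_list('wtp_data_petitions')")
]
print(index_names)
```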