2011-04-14 40 views
0

How to optimize a MySQL query that searches for rows in a specific date range

Here is the query:

select timespans.id as timespan_id, count(*) as num 
from reports, timespans 
where timespans.after_date >= '2011-04-13 22:08:38' and 
     timespans.after_date <= reports.authored_at and 
     reports.authored_at < timespans.before_date 
group by timespans.id; 

Here are the table definitions:

 
CREATE TABLE `reports` (
    `id` int(11) NOT NULL auto_increment, 
    `source_id` int(11) default NULL, 
    `url` varchar(255) default NULL, 
    `lat` decimal(20,15) default NULL, 
    `lng` decimal(20,15) default NULL, 
    `content` text, 
    `notes` text, 
    `authored_at` datetime default NULL, 
    `created_at` datetime default NULL, 
    `updated_at` datetime default NULL, 
    `data` text, 
    `title` varchar(255) default NULL, 
    `author_id` int(11) default NULL, 
    `orig_id` varchar(255) default NULL, 
    PRIMARY KEY (`id`), 
    KEY `index_reports_on_title` (`title`), 
    KEY `index_content_on_reports` (`content`(128)) 
);

CREATE TABLE `timespans` (
    `id` int(11) NOT NULL auto_increment, 
    `after_date` datetime default NULL, 
    `before_date` datetime default NULL, 
    `after_offset` int(11) default NULL, 
    `before_offset` int(11) default NULL, 
    `is_common` tinyint(1) default NULL, 
    `created_at` datetime default NULL, 
    `updated_at` datetime default NULL, 
    `is_search_chunk` tinyint(1) default NULL, 
    `is_day` tinyint(1) default NULL, 
    PRIMARY KEY (`id`), 
    KEY `index_timespans_on_after_date` (`after_date`), 
    KEY `index_timespans_on_before_date` (`before_date`) 
);

Here is the EXPLAIN output:

 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+ 
| id | select_type | table  | type | possible_keys            | key       | key_len | ref | rows | Extra          | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+ 
| 1 | SIMPLE  | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9  | NULL |  84 | Using where; Using temporary; Using filesort | 
| 1 | SIMPLE  | reports | ALL | NULL               | NULL       | NULL | NULL | 183297 | Using where         | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+----------------------------------------------+ 

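Here is the EXPLAIN output after I create an index on authored_at. As you can see, the index is listed in possible_keys but is not actually chosen as the key (I think...):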

 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+ 
| id | select_type | table  | type | possible_keys            | key       | key_len | ref | rows | Extra           | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+ 
| 1 | SIMPLE  | timespans | range | index_timespans_on_after_date,index_timespans_on_before_date | index_timespans_on_after_date | 9  | NULL |  86 | Using where; Using temporary; Using filesort | 
| 1 | SIMPLE  | reports | ALL | index_reports_on_authored_at         | NULL       | NULL | NULL | 183317 | Range checked for each record (index map: 0x8) | 
+----+-------------+-----------+-------+--------------------------------------------------------------+-------------------------------+---------+------+--------+------------------------------------------------+ 
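
For reference, a sketch of how the second plan above can be reproduced. The index name is taken from the possible_keys column of that plan; the exact statement used to create it is an assumption.

```sql
-- Assumed form of the index creation; only the index name comes from the EXPLAIN output.
alter table reports add index index_reports_on_authored_at (authored_at);

-- Same query as above, prefixed with EXPLAIN to show the chosen plan.
explain
select timespans.id as timespan_id, count(*) as num
from reports, timespans
where timespans.after_date >= '2011-04-13 22:08:38' and
      timespans.after_date <= reports.authored_at and
      reports.authored_at < timespans.before_date
group by timespans.id;
```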

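There are about 142K rows in the reports table, and far fewer in the timespans table.

The query takes about 3 seconds.

The strange thing is that if I add an index on reports.authored_at, the query actually gets slower, taking about 20 seconds. I would have expected the opposite, since the index should make it easy to find the reports at either end of the range and discard the rest without examining every report.

Can anyone clarify? I'm stumped.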
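
One way to check whether the new index is what changes the plan, given only as a sketch: keep the index in place but ask MySQL to ignore it for this one query with an `IGNORE INDEX` hint, then compare the timing and the EXPLAIN output against the unhinted query. The index name is the one shown in possible_keys; no timings are claimed here.

```sql
-- Same query, with the authored_at index excluded from consideration for reports.
select timespans.id as timespan_id, count(*) as num
from reports ignore index (index_reports_on_authored_at), timespans
where timespans.after_date >= '2011-04-13 22:08:38' and
      timespans.after_date <= reports.authored_at and
      reports.authored_at < timespans.before_date
group by timespans.id;
```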

+3

Please post your EXPLAIN result and your table definitions, thanks – Neo 2011-04-14 04:48:37

+0

There really should be an index on `reports.authored_at`. What does EXPLAIN say once that column is indexed? – Wiseguy 2011-04-14 05:10:07

Answers

1

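Instead of two separate indexes on the timespans table, try combining them into a single multi-column index that covers both before_date and after_date. Then add an index on authored_at as well.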
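
A minimal sketch of that suggestion. The index names are hypothetical, and the column order (after_date first) is an assumption, chosen because the query's literal range condition is on after_date.

```sql
-- Replace the two single-column indexes with one composite index (hypothetical name).
alter table timespans
    drop index index_timespans_on_after_date,
    drop index index_timespans_on_before_date,
    add index index_timespans_on_after_and_before (after_date, before_date);

-- Index on the column the reports side is filtered by (hypothetical name).
alter table reports
    add index index_reports_on_authored_at (authored_at);
```

Re-running EXPLAIN afterwards would show whether the optimizer actually picks the new indexes.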

1

I would rewrite your query like this:

select t.id, count(*) as num from timespans t 
    join reports r where t.after_date >= '2011-04-13 22:08:38' 
    and r.authored_at >= '2011-04-13 22:08:38' 
    and r.authored_at < t.before_date 
group by t.id order by null; 

and change the table's indexes:

alter table reports add index authored_at_idx(authored_at); 
+0

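Amazing! Although r.authored_at should be compared with t.after_date rather than with the literal value. But this definitely fixes it. As far as I can see, the only real difference is the direction of the comparison: putting r.authored_at on the left-hand side makes it faster. I had no idea that made a difference! – user707270 2011-04-14 12:45:24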

+0

@user707270 You can think of it this way: MySQL is not smart enough to work out that r.authored_at >= t.after_date implies r.authored_at >= '2011-04-13 22:08:38' – Neo 2011-04-14 14:52:48
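
Putting the two comments together, one possible form of the rewritten query is sketched below. It restores the comparison against t.after_date and keeps the literal bound on r.authored_at as a redundant condition. That bound follows logically from the other two conditions, so it should not change the result; whether it changes the plan is something to confirm with EXPLAIN.

```sql
select t.id as timespan_id, count(*) as num
from timespans t
join reports r
    on r.authored_at >= t.after_date
   and r.authored_at <  t.before_date
where t.after_date  >= '2011-04-13 22:08:38'
  -- Redundant: implied by the join condition plus the line above,
  -- but stated explicitly so the optimizer sees a constant range on authored_at.
  and r.authored_at >= '2011-04-13 22:08:38'
group by t.id
order by null;
```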

0

You can use the database's partitioning feature on the after_date column. It will help you a lot.
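
A rough sketch of what that could look like, under several assumptions: the partition names and boundaries are made up for illustration, no existing row has a NULL after_date, and the MySQL version supports partitioning. MySQL requires every unique key, including the primary key, to contain all columns used in the partitioning expression, so the primary key has to be widened first.

```sql
-- after_date must be NOT NULL to be part of the primary key
-- (assumes no existing NULL values in that column).
alter table timespans
    modify after_date datetime not null,
    drop primary key,
    add primary key (id, after_date);

-- Hypothetical monthly range partitions on after_date.
alter table timespans
    partition by range (to_days(after_date)) (
        partition p2011_03 values less than (to_days('2011-04-01')),
        partition p2011_04 values less than (to_days('2011-05-01')),
        partition pmax     values less than maxvalue
    );
```

Since the timespans table is small compared with reports, it is worth checking with EXPLAIN PARTITIONS whether pruning actually helps before relying on this.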