2009-12-10 81 views

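MySQL structure help for joins (large tables)

I currently have 2 tables that are used in a select query with a simple join. The first table has about 6-9 million rows and is the one being joined against. The main table holds anywhere from 1 million to 300 million rows. However, I've noticed that once the main table I'm joining on grows past 10 million rows, the select query goes from instant to very slow (3+ seconds and growing).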

Here are my table structures and the query.

CREATE TABLE IF NOT EXISTS `links` (
    `link_id` int(10) unsigned NOT NULL, 
    `domain_id` mediumint(7) unsigned NOT NULL, 
    `parent_id` int(11) unsigned DEFAULT NULL, 
    `hash` int(10) unsigned NOT NULL, 
    `url` text NOT NULL, 
    `type` enum('html','pdf') DEFAULT NULL, 
    `processed` enum('N','Y') NOT NULL DEFAULT 'N', 
    UNIQUE KEY `hash` (`hash`), 
    KEY `idx_processed` (`processed`), 
    KEY `domain_id` (`domain_id`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT; 


CREATE TABLE IF NOT EXISTS `domains` (
    `domain_id` mediumint(7) unsigned NOT NULL AUTO_INCREMENT, 
    `name` varchar(170) NOT NULL, 
    `blocked` enum('N','Y') NOT NULL DEFAULT 'N', 
    `count` mediumint(6) NOT NULL DEFAULT '0', 
    `mcount` mediumint(3) NOT NULL, 
    PRIMARY KEY (`domain_id`), 
    KEY `name` (`name`), 
    KEY `blocked` (`blocked`), 
    KEY `mcount` (`mcount`), 
    KEY `count` (`count`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10834389 ; 

Query:

(SELECT link_id, url, hash FROM links, domains WHERE links.domain_id = domains.domain_id and mcount > 1 and processed='N' limit 200) 
UNION 
(SELECT link_id, url, hash FROM links where processed='N' and type='html' limit 200) 

EXPLAIN SELECT:

+------+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
| id   | select_type  | table      | type  | possible_keys           | key           | key_len | ref                       | rows    | Extra       |
+------+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+
| 1    | PRIMARY      | domains    | range | PRIMARY,mcount          | mcount        | 3       | NULL                      | 257673  | Using where |
| 1    | PRIMARY      | links      | ref   | idx_processed,domain_id | domain_id     | 3       | crawler.domains.domain_id | 1       | Using where |
| 2    | UNION        | links      | ref   | idx_processed           | idx_processed | 1       | const                     | 7090017 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL   | NULL                    | NULL          | NULL    | NULL                      | NULL    |             |
+------+--------------+------------+-------+-------------------------+---------------+---------+---------------------------+---------+-------------+

Right now I'm trying to partition links into 20 partitions, using domain_id as the key.
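For reference, a minimal sketch of what that partitioning attempt could look like. It assumes a MySQL version that still supports partitioned MyISAM tables (5.1 through 5.7). MySQL requires every unique key on a partitioned table to include the partitioning column, so the existing UNIQUE KEY on hash alone would block the change. Widening it to (hash, domain_id), as below, is only an illustrative assumption, and it weakens the uniqueness guarantee on hash.

```sql
-- Assumption: widen the unique key so it contains the partitioning column.
-- After this, hash is only unique per domain_id, not globally.
ALTER TABLE `links`
    DROP INDEX `hash`,
    ADD UNIQUE KEY `hash` (`hash`, `domain_id`);

-- Spread the rows over 20 partitions keyed on domain_id.
ALTER TABLE `links`
    PARTITION BY KEY (`domain_id`)
    PARTITIONS 20;
```

Both statements rebuild the table, so on a table with hundreds of millions of rows they are better tried on a copy first.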

Any other options would be greatly appreciated.


It's the first query that is slow, not the second. I should have mentioned that earlier. Sorry. – Josh


Could you also describe your indexes, and show which fields in the slow query come from which table? – Unreason

Answers

0

A single SELECT statement could replace your entire UNION statement:

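-- Note: this ANDs both filters, so it returns only rows matching both branches of the UNION.
SELECT l.link_id, l.url, l.hash
FROM links AS l
JOIN domains AS d ON l.domain_id = d.domain_id
WHERE d.mcount > 1          -- mcount is a column of domains
  AND l.processed = 'N'     -- processed and type are columns of links
  AND l.type = 'html';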

This may not be the answer you're looking for, but it should help you simplify the problem.
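To see how MySQL executes the simplified statement, one option is to run it through EXPLAIN. The sketch below also adds a composite index so the lookup into links can filter on processed inside the index rather than reading each row. The index name idx_domain_processed is made up for illustration, and whether it helps depends on how many links per domain are still unprocessed.

```sql
-- Inspect the execution plan of the single-statement version.
EXPLAIN
SELECT l.link_id, l.url, l.hash
FROM links AS l
JOIN domains AS d ON l.domain_id = d.domain_id
WHERE d.mcount > 1
  AND l.processed = 'N'
  AND l.type = 'html';

-- Hypothetical composite index (name is an assumption, not part of the original schema).
-- Building it on a table with hundreds of millions of rows can take a long time.
ALTER TABLE links ADD INDEX idx_domain_processed (domain_id, processed);
```

Re-running the EXPLAIN after adding the index shows whether the optimizer picks it for the links lookup.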

0

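When things suddenly get slow, you may want to compare the size of the indexes used in executing the query against the sizes of the various MySQL buffers.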
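A rough sketch of that comparison for the MyISAM tables in the question. The schema name crawler is taken from the EXPLAIN output above. For MyISAM the relevant buffer is the key cache, sized by key_buffer_size. Other engines use different settings.

```sql
-- Index and data size per table, in megabytes.
SELECT table_name,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_length / 1024 / 1024)  AS data_mb
FROM information_schema.TABLES
WHERE table_schema = 'crawler'
  AND table_name IN ('links', 'domains');

-- Size of the MyISAM key cache, in bytes.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Key_reads counts index blocks read from disk, Key_read_requests counts all index block requests.
SHOW GLOBAL STATUS LIKE 'Key_read%';
```

If index_mb for the indexes used by the join is well above key_buffer_size, and Key_reads keeps climbing relative to Key_read_requests while the query runs, the indexes probably no longer fit in memory.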