2017-04-13 65 views

MySQL GROUP BY uses filesort - query optimization

I have a table like this:

CREATE TABLE `purchase` (
    `fact_purchase_id` binary(16) NOT NULL, 
    `purchase_id` int(10) unsigned NOT NULL, 
    `purchase_id_primary` int(10) unsigned DEFAULT NULL, 
    `person_id` int(10) unsigned NOT NULL, 
    `person_id_owner` int(10) unsigned NOT NULL, 
    `service_id` int(10) unsigned NOT NULL, 
    `fact_count` int(10) unsigned NOT NULL DEFAULT '0', 
    `fact_type` tinyint(3) unsigned NOT NULL, 
    `date_fact` date NOT NULL, 
    `purchase_name` varchar(255) DEFAULT NULL, 
    `activation_price` decimal(7,2) unsigned NOT NULL DEFAULT '0.00', 
    `activation_price_total` decimal(7,2) unsigned NOT NULL DEFAULT '0.00', 
    `renew_price` decimal(7,2) unsigned DEFAULT '0.00', 
    `renew_price_total` decimal(7,2) unsigned NOT NULL DEFAULT '0.00', 
    `activation_cost` decimal(7,2) unsigned DEFAULT '0.00', 
    `activation_cost_total` decimal(7,2) unsigned NOT NULL DEFAULT '0.00', 
    `renew_cost` decimal(7,2) unsigned DEFAULT '0.00', 
    `renew_cost_total` decimal(7,2) unsigned NOT NULL DEFAULT '0.00', 
    `date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, 
    PRIMARY KEY (`fact_purchase_id`), 
    KEY `purchase_id_idx` (`purchase_id`), 
    KEY `person_id_idx` (`person_id`), 
    KEY `person_id_owner_idx` (`person_id_owner`), 
    KEY `service_id_idx` (`service_id`), 
    KEY `fact_type_idx` (`fact_type`), 
    KEY `renew_price_idx` (`renew_price`), 
    KEY `renew_cost_idx` (`renew_cost`), 
    KEY `renew_price_year_idx` (`renew_price_year`), 
    KEY `renew_cost_year_idx` (`renew_cost_year`), 
    KEY `date_created_idx` (`date_created`), 
    KEY `purchase_id_primary_idx` (`purchase_id_primary`), 
    KEY `fact_count` (`fact_count`), 
    KEY `renew_price_year_total_idx` (`renew_price_total`), 
    KEY `renew_cost_year_total_idx` (`renew_cost_total`), 
    KEY `date_fact` (`date_fact`) USING BTREE, 
    CONSTRAINT `purchase_person_fk` FOREIGN KEY (`person_id`) REFERENCES `person` (`person_id`) ON DELETE NO ACTION ON UPDATE NO ACTION, 
    CONSTRAINT `purchase_person_owner_fk` FOREIGN KEY (`person_id_owner`) REFERENCES `person` (`person_id`) ON DELETE NO ACTION ON UPDATE NO ACTION, 
    CONSTRAINT `purchase_service_fk` FOREIGN KEY (`service_id`) REFERENCES `service` (`service_id`) ON DELETE NO ACTION ON UPDATE NO ACTION 
) ENGINE=InnoDB DEFAULT CHARSET=utf8; 

I run this query:

SELECT 
    purchase.date_fact, 
    UNIX_TIMESTAMP(purchase.date_fact), 
    COUNT(DISTINCT purchase.purchase_id) AS Num 
FROM 
    purchase 
WHERE 
    purchase.date_fact >= '2017-01-01' 
    AND purchase.date_fact <= '2017-01-31' 
    AND purchase.fact_type = 3 
    AND purchase.purchase_id_primary IS NULL 
GROUP BY purchase.date_fact 

The table contains 5,629,670 records in total. Running EXPLAIN on the query gives these results:

  • rows = 2.814.835
  • possible_keys = fact_type_idx,purchase_id_primary_idx,date_fact
  • key = fact_type_idx
  • key_len = 1
  • ref = const
  • filtered = 25.00
  • Extra = Using index condition;Using where;Using filesort

The query takes 30-35 seconds to execute. That is too long to wait.

The problem is that the GROUP BY causes a filesort to be applied. Adding ORDER BY NULL to the query does not change anything.
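For reference, the ORDER BY NULL variant mentioned above would look roughly like the sketch below. It is the same query with only the final clause added. In MySQL 5.7, ORDER BY NULL only suppresses the implicit ordering of the grouped result, so it cannot help when the sort is needed to build the groups in the first place.

```sql
-- Same query as above, with ORDER BY NULL appended.
-- This asks MySQL not to order the grouped result, but it does not
-- remove a sort that the optimizer needs in order to form the groups.
EXPLAIN
SELECT
    purchase.date_fact,
    UNIX_TIMESTAMP(purchase.date_fact),
    COUNT(DISTINCT purchase.purchase_id) AS Num
FROM
    purchase
WHERE
    purchase.date_fact >= '2017-01-01'
    AND purchase.date_fact <= '2017-01-31'
    AND purchase.fact_type = 3
    AND purchase.purchase_id_primary IS NULL
GROUP BY purchase.date_fact
ORDER BY NULL;
```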

I could use a covering index, but I only need date_fact in this query: which fields should I put in it?

How can I avoid the filesort on the GROUP BY? How can I optimize the query to make it faster?

I use this table for statistical purposes (OLAP). Maybe there is a better DBMS for this purpose?

I am running MySQL Server 5.7.17.

Thanks

Answers


For this query:

SELECT p.date_fact, UNIX_TIMESTAMP(p.date_fact), 
     COUNT(DISTINCT p.purchase_id) AS Num 
FROM purchase p 
WHERE p.date_fact >= '2017-01-01' AND 
     p.date_fact <= '2017-01-31' AND 
     p.fact_type = 3 AND 
     p.purchase_id_primary IS NULL 
GROUP BY p.date_fact; 

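I would suggest a composite index on (fact_type, purchase_id_primary, date_fact, purchase_id). The first two keys have equality conditions in the WHERE clause (IS NULL is treated like an equality for index lookups). The third is a range condition, and the fourth allows the index to "cover" the query (all columns used by the query are in the index).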
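A minimal sketch of how that index could be created and checked. The index name purchase_type_primary_date_idx is made up for illustration; the column order is the one suggested above. The resulting plan depends on the optimizer and the data, so the EXPLAIN output should be inspected rather than assumed.

```sql
-- Hypothetical index name; the column order follows the suggestion above:
-- equality columns first, then the range column, then the counted column.
ALTER TABLE purchase
    ADD INDEX purchase_type_primary_date_idx
        (fact_type, purchase_id_primary, date_fact, purchase_id);

-- Re-run EXPLAIN to see whether the new index is chosen.
-- Things to look for: key = purchase_type_primary_date_idx,
-- "Using index" in Extra, and whether "Using filesort" is gone.
EXPLAIN
SELECT p.date_fact, UNIX_TIMESTAMP(p.date_fact),
       COUNT(DISTINCT p.purchase_id) AS Num
FROM purchase p
WHERE p.date_fact >= '2017-01-01' AND
      p.date_fact <= '2017-01-31' AND
      p.fact_type = 3 AND
      p.purchase_id_primary IS NULL
GROUP BY p.date_fact;
```

With the two equality columns fixed, the index entries are already ordered by date_fact and then purchase_id, which is what gives the optimizer a chance to group without a separate sort.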

I would also add: if you do not need COUNT(DISTINCT), do not use it. purchase_id may already be unique in purchase.
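A sketch of the query without DISTINCT, under the assumption (not confirmed by the question) that each purchase_id appears at most once per date among the rows that pass the filter. If that does not hold, COUNT(*) will count duplicates and return larger numbers than the original query.

```sql
-- Assumption: purchase_id is unique per date_fact within the filtered rows.
-- Verify this on real data before replacing COUNT(DISTINCT ...).
SELECT p.date_fact, UNIX_TIMESTAMP(p.date_fact),
       COUNT(*) AS Num
FROM purchase p
WHERE p.date_fact >= '2017-01-01' AND
      p.date_fact <= '2017-01-31' AND
      p.fact_type = 3 AND
      p.purchase_id_primary IS NULL
GROUP BY p.date_fact;
```

One way to check the assumption is to run both versions for the same date range and compare the Num column day by day.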