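I have built a web application as a tool for removing unnecessary data from a table of people. The application is mainly used to filter the data down to everyone who is validly entitled to vote. At first, when the main table still had only a few rows, this was not a problem, but once the table filled up with about 200K rows the query became very slow (6 seconds), and it will get worse because the table will grow to as many as 6 million rows. Did I get the table design wrong, or did I choose the wrong indexes when creating the table?
My table design is shown below. I am doing a join with 4 tables (the region tables, starting from province, then city, district, and town). The region tables are related to one another through their own IDs: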
CREATE TABLE `peoples` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`id_prov` smallint(2) NOT NULL,
`id_city` smallint(2) NOT NULL,
`id_district` smallint(2) NOT NULL,
`id_town` smallint(4) NOT NULL,
`tps` smallint(4) NOT NULL,
`urut_xls` varchar(20) NOT NULL,
`nik` varchar(20) NOT NULL,
`name` varchar(60) NOT NULL,
`place_of_birth` varchar(60) NOT NULL,
`birth_date` varchar(30) NOT NULL,
`age` tinyint(3) NOT NULL DEFAULT '0',
`sex` varchar(20) NOT NULL,
`marital_s` varchar(20) NOT NULL,
`address` varchar(160) NOT NULL,
`note` varchar(60) NOT NULL,
`m_name` tinyint(1) NOT NULL DEFAULT '0',
`m_birthdate` tinyint(1) NOT NULL DEFAULT '0',
`format_birthdate` tinyint(1) NOT NULL DEFAULT '0',
`m_sex` tinyint(1) NOT NULL DEFAULT '0',
`m_m_status` tinyint(1) NOT NULL DEFAULT '0',
`sex_double` tinyint(1) NOT NULL DEFAULT '0',
`id_import` bigint(10) NOT NULL,
`id_workspace` tinyint(4) unsigned NOT NULL DEFAULT '0',
`stat_valid` smallint(1) NOT NULL DEFAULT '0',
`add_manual` tinyint(1) unsigned NOT NULL DEFAULT '0',
`insert_by` varchar(12) NOT NULL,
`update_by` varchar(12) DEFAULT NULL,
`mark_as_duplicate` smallint(1) NOT NULL DEFAULT '0',
`mark_as_trash` smallint(1) NOT NULL DEFAULT '0',
`in_date_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `ind_import` (`id_import`),
KEY `ind_duplicate` (`mark_as_duplicate`),
KEY `id_workspace` (`id_workspace`),
KEY `tambah_manual` (`add_manual`),
KEY `il` (`stat_valid`,`mark_as_trash`,`in_date_time`),
KEY `region` (`id_prov`,`id_city`,`id_district`,`id_town`,`tps`),
KEY `name` (`name`),
KEY `place_of_birth` (`place_of_birth`),
KEY `ind_birth` (`birth_date`(10)),
KEY `ind_sex` (`sex`(2))
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
Town:
CREATE TABLE `town` (
`id` smallint(4) NOT NULL,
`id_district` smallint(2) NOT NULL,
`id_city` smallint(2) NOT NULL,
`id_prov` smallint(2) NOT NULL,
`name_town` varchar(60) NOT NULL,
`handprint` blob,
`pps_1` varchar(60) DEFAULT NULL,
`pps_2` varchar(60) DEFAULT NULL,
`pps_3` varchar(60) DEFAULT NULL,
`tpscount` smallint(2) DEFAULT NULL,
`pps_4` varchar(60) DEFAULT NULL,
`pps_5` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id_prov`,`id_city`,`id_district`,`id`),
KEY `name_town` (`name_town`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
And the query looks like this:
SELECT `E`.`id`, `E`.`id_prov`, `E`.`id_city`, `E`.`id_district`, `E`.`id_town`,
`B`.`name_prov`,`C`.`name_city`,`D`.`name_district`, `A`.`name_town`,
`E`.`tps`, `E`.`urut_xls`, `E`.`nik`,`E`.`name`,`E`.`place_of_birth`,
`E`.`birth_date`, `E`.age, `E`.`sex`, `E`.`marital_s`, `E`.`address`,
`E`.`note`
FROM peoples E
JOIN test_prov B ON E.id_prov = B.id
JOIN test_city C ON E.id_city = C.id
AND (C.id_prov=B.id)
JOIN test_district D ON E.id_district = D.id
AND ((D.id_city = C.id) AND (D.id_prov= B.id))
JOIN test_town A ON E.id_town = A.id
AND ((A.id_district = D.id)
AND (A.id_city = C.id)
AND (A.id_prov = B.id))
AND E.stat_valid=1
AND E.mark_as_trash=0
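mark_as_trash is a flag column that contains only 1 and 0 and exists only to record whether a row has been marked as deleted, while stat_valid holds the result of the filtering: a value of 1 means the record is valid for voting rights.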
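As a quick way to see how the two flags split the data, a query along these lines could be run. It is only a sketch that uses columns from the peoples table above; the alias total_rows is my own naming, and I have not included any output here.

```sql
-- Count rows for each combination of the two flag columns.
-- A combination that covers most of the table is unlikely to
-- benefit much from an index on these flags alone.
SELECT stat_valid,
       mark_as_trash,
       COUNT(*) AS total_rows   -- hypothetical alias, for illustration only
FROM peoples
GROUP BY stat_valid, mark_as_trash;
```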
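I tried looking at EXPLAIN, but no column is being used for an index lookup. I believe that is why the application is so slow at 200K rows. The query above shows only two conditions, but the application also lets users filter by name, place of birth, birth date, age range, and so on.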
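To illustrate the kind of extra filtering I mean, here is a rough sketch of such a search with EXPLAIN in front of it. The filter values ('A%' and the age range) are made up for the example and are not taken from the real application; the joins are left out to keep it short.

```sql
-- Hypothetical filter values; column names come from the peoples table above.
-- EXPLAIN reports which index (if any) MySQL picks for this search.
EXPLAIN
SELECT E.id, E.name, E.place_of_birth, E.birth_date, E.age
FROM peoples E
WHERE E.stat_valid = 1
  AND E.mark_as_trash = 0
  AND E.name LIKE 'A%'            -- prefix search, can use the `name` index
  AND E.age BETWEEN 17 AND 60;    -- age range filter, no index on `age` exists
```

The key and rows columns of the result are the ones to compare between different filter combinations.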
How can I make this perform better?
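How long does it take to fetch the 200K rows from the peoples table without any joins? Could you show the definitions of test_prov, test_district and test_city? Maybe use SQL Fiddle: http://sqlfiddle.com/. Could you also provide the EXPLAIN output? – lowleveldesign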
http://sqlfiddle.com/#!2/15e70/1/0 EXPLAIN output with 200K rows:
id,select_type,table,type,possible_keys,key,key_len,ref,rows,filtered,Extra
1,SIMPLE,B,ALL,PRIMARY,NULL,NULL,NULL,3,100.00,
1,SIMPLE,C,ref,PRIMARY,PRIMARY,2,test.B.id,1,100.00,
1,SIMPLE,D,ref,PRIMARY,PRIMARY,4,"test.B.id,test.C.id",1,100.00,
1,SIMPLE,A,ref,PRIMARY,PRIMARY,6,"test.B.id,test.C.id,test.D.id",1,100.00,
1,SIMPLE,E,ref,"il,region",region,8,"test.B.id,test.C.id,test.A.id_district,test.A.id",18834,100.00,"Using where"
– achy