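Efficient MySQL query to find entries in A not matching B
I have two tables (products and suppliers) and want to find out which items are no longer listed in the suppliers table.
Table uc_products holds the products. Table uc_supplier_csv holds the supplier stock. uc_products.model joins against uc_supplier_csv.sku.
I am seeing very long-running queries when trying to identify the stock in the products table that is not referenced in the suppliers table. I only want to extract the nid of the matching rows; the sid IS NULL condition is just there so I can identify which items have no supplier.
For the first query below, the database server (4GB RAM / 2x 2.4GHz Intel) takes an hour to return a result (507 rows). I did not wait for the second query to finish.
How can I make this query more optimal? Could it be due to the mismatched character sets?
I was thinking that the following would be the most efficient SQL to use: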
SELECT nid, sid
FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
ON p.model = c.sku
WHERE sid IS NULL ;
For this query, I get the following EXPLAIN result:
mysql> EXPLAIN SELECT nid, sid FROM uc_products p LEFT OUTER JOIN uc_supplier_csv c ON p.model = c.sku WHERE sid IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
| 1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 6526 | |
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 126639 | Using where; Not exists |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------------------+
2 rows in set (0.00 sec)
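I would have thought that the keys idx_sku and idx_model would be usable here, but they are not used. Is that because the tables' default character sets do not match? One is UTF-8 and the other is latin1.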
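One way to check the character-set theory, sketched under assumptions: the statements below first show the character set and collation of the two join columns, then convert the supplier table to utf8 so that both sides of the join can be compared without an implicit conversion. When MySQL has to convert an indexed column to another character set before comparing it, it generally cannot use that column's index for lookups. The ALTER step assumes every sku value converts cleanly from latin1 to utf8 and that rebuilding the table is acceptable; take a backup first.

```sql
-- Inspect the character set / collation of the two join columns.
SHOW FULL COLUMNS FROM uc_products LIKE 'model';
SHOW FULL COLUMNS FROM uc_supplier_csv LIKE 'sku';

-- Assumption: converting the supplier table is acceptable.
-- This rewrites the table and rebuilds its indexes.
ALTER TABLE uc_supplier_csv
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Re-check the plan after the conversion.
EXPLAIN
SELECT p.nid, c.sid
FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
  ON p.model = c.sku
WHERE c.sid IS NULL;
```

If the mismatch is the cause, the row for c in the new plan would be expected to list idx_sku under key rather than NULL; if it still shows a full scan, the character sets are probably not the only factor.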
I have also considered this form:
SELECT nid
FROM uc_products
WHERE model
NOT IN (
SELECT DISTINCT sku FROM uc_supplier_csv
) ;
EXPLAIN shows the following result for that query:
mysql> explain select nid from uc_products where model not in (select sku from uc_supplier_csv) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | uc_products | ALL | NULL | NULL | NULL | NULL | 6520 | Using where |
| 2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
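A side note on the NOT IN form, independent of speed: sku is declared as nullable (default NULL), and in SQL a NOT IN test against a list that contains a NULL never evaluates to TRUE, so a single NULL sku would make the query return no rows at all. Whether uc_supplier_csv actually contains NULL skus is not known here; the sketch below counts them and shows a variant that filters them out.

```sql
-- How many supplier rows have no sku at all?
SELECT COUNT(*) FROM uc_supplier_csv WHERE sku IS NULL;

-- NOT IN variant that ignores NULL skus, so a stray NULL
-- cannot silently empty the result set.
SELECT nid
FROM uc_products
WHERE model NOT IN (
    SELECT sku
    FROM uc_supplier_csv
    WHERE sku IS NOT NULL
);
```

The LEFT OUTER JOIN and NOT EXISTS forms do not have this pitfall, because a NULL sku simply never equals any model value.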
And, just so I don't leave anything out, here are some more exciting details: the table sizes and statistics, and the table structures :)
mysql> show table status where Name in ('uc_supplier_csv', 'uc_products') ;
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| uc_products | MyISAM | 10 | Dynamic | 6520 | 89 | 585796 | 281474976710655 | 232448 | 912 | NULL | 2009-04-24 11:03:15 | 2009-10-12 14:23:43 | 2009-04-24 11:03:16 | utf8_general_ci | NULL | | |
| uc_supplier_csv | MyISAM | 10 | Dynamic | 126639 | 26 | 3399704 | 281474976710655 | 5864448 | 0 | NULL | 2009-10-12 14:28:25 | 2009-10-12 14:28:25 | 2009-10-12 14:28:27 | latin1_swedish_ci | NULL | | |
+-----------------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
and
CREATE TABLE `uc_products` (
`vid` mediumint(9) NOT NULL default '0',
`nid` mediumint(9) NOT NULL default '0',
`model` varchar(255) NOT NULL default '',
`list_price` decimal(10,2) NOT NULL default '0.00',
`cost` decimal(10,2) NOT NULL default '0.00',
`sell_price` decimal(10,2) NOT NULL default '0.00',
`weight` float NOT NULL default '0',
`weight_units` varchar(255) NOT NULL default 'lb',
`length` float unsigned NOT NULL default '0',
`width` float unsigned NOT NULL default '0',
`height` float unsigned NOT NULL default '0',
`length_units` varchar(255) NOT NULL default 'in',
`pkg_qty` smallint(5) unsigned NOT NULL default '1',
`default_qty` smallint(5) unsigned NOT NULL default '1',
`unique_hash` varchar(32) NOT NULL,
`ordering` tinyint(2) NOT NULL default '0',
`shippable` tinyint(2) NOT NULL default '1',
PRIMARY KEY (`vid`),
KEY `idx_model` (`model`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
CREATE TABLE `uc_supplier_csv` (
`sid` int(10) unsigned NOT NULL default '0',
`sku` varchar(255) default NULL,
`stock` int(10) unsigned NOT NULL default '0',
`list_price` decimal(8,2) default '0.00',
KEY `idx_sku` (`sku`),
KEY `idx_stock` (`stock`),
KEY `idx_sku_stock` (`sku`,`stock`),
KEY `idx_sid` (`sid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
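Since the two definitions above use different default character sets (utf8 for uc_products, latin1 for uc_supplier_csv), one possible way to test their effect without changing either table is to convert the product-side value to latin1 inside the join condition, so the comparison happens in the character set of the indexed sku column. This is a sketch, not something tested against these tables, and it assumes every model value can be represented in latin1; characters that cannot are replaced during conversion and could produce wrong matches.

```sql
-- Compare in latin1 so that idx_sku on uc_supplier_csv.sku
-- can be considered for the lookup.
SELECT p.nid
FROM uc_products p
LEFT OUTER JOIN uc_supplier_csv c
  ON c.sku = CONVERT(p.model USING latin1)
WHERE c.sid IS NULL;
```

Running EXPLAIN on this form and comparing it with the plans shown in the question would indicate whether the index on sku is picked up once both sides share a character set.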
EDIT: Adding the query plans for a couple of the queries suggested by Martin below:
mysql> explain SELECT nid FROM uc_products p WHERE NOT EXISTS (SELECT 1 FROM uc_supplier_csv c WHERE p.model = c.sku) ;
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | p | ALL | NULL | NULL | NULL | NULL | 6526 | Using where |
| 2 | DEPENDENT SUBQUERY | c | index | NULL | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-------+-------+---------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
mysql> explain SELECT nid FROM uc_products WHERE model NOT IN (SELECT sku FROM uc_supplier_csv) ;
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
| 1 | PRIMARY | uc_products | ALL | NULL | NULL | NULL | NULL | 6526 | Using where |
| 2 | DEPENDENT SUBQUERY | uc_supplier_csv | index | idx_sku,idx_sku_stock | idx_sku | 258 | NULL | 126639 | Using where; Using index |
+----+--------------------+-----------------+-------+-----------------------+---------+---------+------+--------+--------------------------+
2 rows in set (0.00 sec)
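Your use of HAVING in the first query is incorrect - since there is no GROUP BY, it should be a simple WHERE. Not sure why MySQL doesn't give you an error message, but I guess that is what messed up the query plan! – 2009-10-12 04:28:58
Thanks Alex - updated. – 2009-10-12 09:01:13
I tested the four query forms on this page yesterday on my laptop (MBP 2.4GHz / 4GB / OS X / MAMP MySQL). The LEFT OUTER JOIN form above took 3526s to execute. The subquery form above took 1021s. Martin's suggestion below took 637s. James's was slightly faster than Martin's, but returned a different result from the other three forms. – 2009-10-12 20:00:49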