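MySQL: suggest objects (optimizing a multi-join query)
Goal: suggest objects based on a user's choices.
Data: a table recording how each user would order a subset of the objects, from worst to best; for example: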
1 2 3 4 5 6
John: A B G J S O
Mary: A C G L
Joan: B C L J K
Stan: G J C L
There are roughly 20 times as many users as objects, and each user's lineup contains 50-200 objects.
Table:
CREATE TABLE IF NOT EXISTS `pref` (
`usr` int(10) unsigned NOT NULL,
`obj` int(10) unsigned NOT NULL,
`ord` int(10) unsigned NOT NULL,
UNIQUE KEY `u_o` (`usr`,`obj`),
KEY `u` (`usr`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
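Basic idea: iterate over the user's objects, starting from the second worst, building pairs (A > B); look for these pairs in other users' lineups and list the items that rank better than A according to those users.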
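To make the idea concrete, here is a minimal in-memory sketch in Python using the sample lineups above. The function name `suggest` and the vote counting are illustrative assumptions, not part of the actual setup; it only mirrors the pair-building logic described here and is not meant to scale.

```python
from collections import Counter

# Sample lineups from the question, ordered from worst to best.
LINEUPS = {
    "John": ["A", "B", "G", "J", "S", "O"],
    "Mary": ["A", "C", "G", "L"],
    "Joan": ["B", "C", "L", "J", "K"],
    "Stan": ["G", "J", "C", "L"],
}


def suggest(lineups, user):
    """Count votes for objects that other users rank above A,
    considering only users who agree that A is better than B."""
    # Position of every object in every lineup (higher index = better).
    rank = {u: {obj: i for i, obj in enumerate(objs)} for u, objs in lineups.items()}
    mine = lineups[user]
    votes = Counter()
    # Start from the second-worst object so at least one worse object B exists.
    for pos_a in range(1, len(mine)):
        a = mine[pos_a]
        for b in mine[:pos_a]:
            for other, r in rank.items():
                if other == user:
                    continue  # only look at other users' lineups
                if a in r and b in r and r[b] < r[a]:
                    # Every object this user ranks above A gets one vote.
                    votes.update(lineups[other][r[a] + 1:])
    return votes


print(suggest(LINEUPS, "John").most_common())
```

For John this gives L two votes (from Mary via G > A and from Stan via J > G) and K and C one vote each.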
Query:
SELECT e.obj, COUNT(e.obj) AS rate
FROM pref a, pref b, pref c, pref d, pref e
WHERE a.usr = '222' # step 1: select a pair of objects A, B, where A is better than B according to user X
AND a.obj = '111'
AND b.usr = a.usr
AND b.ord < a.ord
AND c.obj = a.obj # step 2: find users thinking that object A is better than B
AND d.obj = b.obj
AND d.ord < c.ord
AND d.usr = c.usr
AND e.ord > c.ord # step 3: find objects better than A according to these users
AND e.usr = c.usr
GROUP BY e.obj
ORDER BY rate DESC;
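For anyone who wants to try the query on the sample lineups, the following sketch loads them into an in-memory SQLite database through Python's `sqlite3` module. This is only an assumption-laden stand-in for the real MySQL table: names and letters are stored as text instead of integer ids, the comments use `--` instead of `#`, the literals are replaced by `?` parameters, and `run_query` is a hypothetical helper.

```python
import sqlite3

# Sample lineups from the question, ordered from worst to best.
LINEUPS = {
    "John": ["A", "B", "G", "J", "S", "O"],
    "Mary": ["A", "C", "G", "L"],
    "Joan": ["B", "C", "L", "J", "K"],
    "Stan": ["G", "J", "C", "L"],
}

# Same joins and conditions as the query above, with parameters for usr and obj.
QUERY = """
SELECT e.obj, COUNT(e.obj) AS rate
FROM pref a, pref b, pref c, pref d, pref e
WHERE a.usr = ?        -- step 1: pairs (A, B) where the user ranks A above B
  AND a.obj = ?
  AND b.usr = a.usr
  AND b.ord < a.ord
  AND c.obj = a.obj    -- step 2: users who also rank A above B
  AND d.obj = b.obj
  AND d.ord < c.ord
  AND d.usr = c.usr
  AND e.ord > c.ord    -- step 3: objects those users rank above A
  AND e.usr = c.usr
GROUP BY e.obj
ORDER BY rate DESC
"""


def run_query(user, obj):
    """Load the sample data and return {object: rate} for one (user, A) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pref (usr TEXT NOT NULL, obj TEXT NOT NULL, "
        "ord INTEGER NOT NULL, UNIQUE (usr, obj))"
    )
    # ord is the 1-based position in the lineup (1 = worst).
    rows = [(u, o, i) for u, objs in LINEUPS.items() for i, o in enumerate(objs, start=1)]
    conn.executemany("INSERT INTO pref VALUES (?, ?, ?)", rows)
    result = dict(conn.execute(QUERY, (user, obj)).fetchall())
    conn.close()
    return result


print(run_query("John", "J"))
```

With John as the user and J as object A, the rates come out as S = 3, O = 3, K = 1, C = 1, L = 1. The S and O rows come from John's own lineup, because the query as written has no condition excluding the current user from `c`.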
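Aliases:
a
object A ('111') for the current user ('222')
b
object B, worse than A according to the current user (has a lower 'ord' value than A)
c
object A in another user's lineup
d
object B in another user's lineup
e
an object better than A in another user's lineup
Execution plan (ouo and uo are the indexes suggested by Quassnoi):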
+----+-------------+-------+------+---------------+------+---------+---------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+---------------------+------+----------------------------------------------+
| 1 | SIMPLE | a | ref | ouo,uo | ouo | 8 | const,const | 1 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | b | ref | ouo,uo | uo | 4 | const | 86 | Using where |
| 1 | SIMPLE | d | ref | ouo,uo | ouo | 4 | db.b.obj | 587 | Using index |
| 1 | SIMPLE | c | ref | ouo,uo | ouo | 8 | const,db.d.usr | 1 | Using where; Using index |
| 1 | SIMPLE | e | ref | uo | uo | 4 | db.d.usr | 80 | Using where |
+----+-------------+-------+------+---------------+------+---------+---------------------+------+----------------------------------------------+
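The query seems to work fine as long as the data set is not too big; any ideas on how to streamline it to support larger data sets?
How many objects will there be per user, on average? – Quassnoi 2009-10-13 15:14:00
Each user has roughly 50-200 objects. – Mike 2009-10-13 15:56:03
Do you really need this ranking? If you can do without it, the query can easily be improved. Also, could you post the execution plan for the query? – Quassnoi 2009-10-13 19:42:06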