2012-10-21 60 views

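Compare football picks with other players over several weeks

I have a football pool site. Each week, my friends pick the winner of each game. I want to compare each player's picks against every other player's and list the percentage of similarity. I found this page, which helped me calculate the similarity for a specific week: Compare group of tags to find similarity/score with PHP/MySQL. Kudos to Ivar Bonsaksen, his solution works great!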

What I want to do now is show each player's cumulative similarity over the past several weeks.

I have three tables to query: profiles (spprofiles), games (sp6games), and picks (sp6picks). Another table, teams (sp6teams), is used to look up team names, but it is not relevant here.

Profiles (spprofiles)
+-----------+-------------+
| profileID | profilename |
+-----------+-------------+
| 52        | My Team A   |
| 53        | Some Team B |
+-----------+-------------+

Games (sp6games)
+--------+--------+---------+------+
| gameID | weekID | visitor | home |
+--------+--------+---------+------+
| 1      | 2      | 9       | 21   |
| 2      | 2      | 14      | 6    |
| 17     | 3      | 6       | 9    |
| 18     | 3      | 30      | 21   |
+--------+--------+---------+------+

Picks (sp6picks)
+-----------+--------+------+
| profileID | gameID | pick |
+-----------+--------+------+
| 52        | 1      | 21   |
| 52        | 2      | 6    |
| 52        | 17     | 12   |
| 52        | 18     | 21   |
| 53        | 1      | 9    |
| 53        | 2      | 6    |
| 53        | 17     | 9    |
| 53        | 18     | 21   |
+-----------+--------+------+

The query for the current week looks like this:

$weekID = 3; //the current weekID 
$profile = 52; //the current ProfileID 

SELECT 
    targetProfiles.profileID AS targetID, 
    sourceProfiles.profileID AS sourceID, 
    COUNT(targetProfiles.profileID) 
    /
    (((SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = sourceProfiles.profileID AND weekID = $weekID) 
     + 
    (SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = targetProfiles.profileID AND weekID = $weekID))/2) 
    AS similarity 
FROM 
    spProfiles AS sourceProfiles 
    LEFT JOIN 
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID = $weekID) AS sourcePicks 
    ON (sourcePicks.profileID = sourceProfiles.profileID) 
    INNER JOIN 
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID = $weekID) AS targetPicks 
    ON (sourcePicks.pick = targetPicks.pick AND sourcePicks.profileID != targetPicks.profileID) 
    LEFT JOIN 
    spProfiles AS targetProfiles 
    ON (targetPicks.profileID = targetProfiles.profileID) 
WHERE sourceProfiles.profileID = $profile 
GROUP BY targetID 

If I run this query separately for each of a few weeks, I get the following results:

$weekID = 2;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53       | 52       | 0.5000     |
+----------+----------+------------+

$weekID = 3;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53       | 52       | 0.5000     |
+----------+----------+------------+
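
The per-week figures can be checked by hand. The sketch below is in Python rather than SQL and uses only the sample rows from the tables above; `weekly_similarity`, `GAME_WEEK`, and `PICKS` are hypothetical names introduced for illustration. It counts the games where both profiles picked the same team and divides by the average number of picks. That should agree with the query as long as a team appears in only one game per week.

```python
# Sample rows from the tables above: gameID -> weekID, and (profileID, gameID) -> pick.
GAME_WEEK = {1: 2, 2: 2, 17: 3, 18: 3}
PICKS = {
    (52, 1): 21, (52, 2): 6, (52, 17): 12, (52, 18): 21,
    (53, 1): 9, (53, 2): 6, (53, 17): 9, (53, 18): 21,
}


def weekly_similarity(source, target, week):
    """Shared picks divided by the average number of picks, for one week."""
    src = {g: p for (prof, g), p in PICKS.items()
           if prof == source and GAME_WEEK[g] == week}
    tgt = {g: p for (prof, g), p in PICKS.items()
           if prof == target and GAME_WEEK[g] == week}
    # A match requires the same pick for the same game.
    matches = sum(1 for g, p in src.items() if tgt.get(g) == p)
    return matches / ((len(src) + len(tgt)) / 2)


print(weekly_similarity(52, 53, 2))  # 0.5
print(weekly_similarity(52, 53, 3))  # 0.5
```

In both weeks the two profiles agree on one game out of two, which gives 0.5 each time.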

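The cumulative query I have so far looks like this (though I have tried other variations). Basically, I just changed the WHERE clauses to include previous weeks, weekID <= $weekID, and added the games table to the main FROM clause: LEFT JOIN sp6games ON (targetPicks.gameID = sp6games.gameID)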

$weekID = 3; //the current weekID 
$profile = 52; //the current ProfileID 

SELECT 
    targetProfiles.profileID AS targetID, 
    sourceProfiles.profileID AS sourceID, 
    COUNT(targetProfiles.profileID) 
    /
    (((SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = sourceProfiles.profileID AND weekID <= $weekID) 
     + 
    (SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = targetProfiles.profileID AND weekID <= $weekID))/2) 
    AS similarity 
FROM 
    spProfiles AS sourceProfiles 
    LEFT JOIN 
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID <= $weekID) AS sourcePicks 
    ON (sourcePicks.profileID = sourceProfiles.profileID) 
    INNER JOIN 
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID <= $weekID) AS targetPicks 
    ON (sourcePicks.pick = targetPicks.pick AND sourcePicks.profileID != targetPicks.profileID) 
    LEFT JOIN 
    spProfiles AS targetProfiles 
    ON (targetPicks.profileID = targetProfiles.profileID) 
    LEFT JOIN sp6games ON (targetPicks.gameID = sp6games.gameID) 
WHERE sourceProfiles.profileID = $profile 
GROUP BY targetID, weekID 

The combined result should be 0.5000, but instead I get:

$weekID = 3;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53       | 52       | 0.7500     |
+----------+----------+------------+

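The problem is that COUNT(targetProfiles.profileID) does not accumulate correctly across weeks, so the value comes out wrong. The query also does not seem very efficient for larger data sets.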
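
One possible reading of the 0.7500, offered as a sketch rather than a confirmed diagnosis: the join condition compares `pick` only. Once several weeks are included, a pick of team 21 in game 1 can pair with a pick of team 21 in game 18. The Python below uses the sample rows from the tables above and simply totals the matches, ignoring the extra `weekID` grouping in the posted query. The names `PICKS_52`, `PICKS_53`, `by_pick`, and `by_game` are hypothetical.

```python
# (gameID, pick) rows for each profile, weeks 2 and 3, from the sample tables.
PICKS_52 = [(1, 21), (2, 6), (17, 12), (18, 21)]
PICKS_53 = [(1, 9), (2, 6), (17, 9), (18, 21)]

# Pair rows on pick only, like ON (sourcePicks.pick = targetPicks.pick).
by_pick = [(s, t) for s in PICKS_52 for t in PICKS_53 if s[1] == t[1]]

# Additionally require the same game, so only picks for one game can match.
by_game = [(s, t) for s, t in by_pick if s[0] == t[0]]

average_picks = (len(PICKS_52) + len(PICKS_53)) / 2

print(len(by_pick) / average_picks)  # 0.75
print(len(by_game) / average_picks)  # 0.5
```

The extra pair comes from team 21, which profile 52 picked in games 1 and 18 and profile 53 picked in game 18.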

Thanks for taking the time to read this, and for any help you can offer.

Answers

SELECT t.profileID     AS target, 
     SUM(s.pick=t.pick)/COUNT(*) AS similarity 
FROM  sp6picks s 
    JOIN sp6picks t USING (gameID) 
    JOIN sp6games g USING (gameID) 
WHERE g.weekID <= 3 
    AND s.profileID != t.profileID 
    AND s.profileID = 52 
GROUP BY t.profileID 

See it on sqlfiddle.
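
For readers without a MySQL server at hand, the query can be tried against the sample rows using Python's built-in `sqlite3` module. This is a sketch under assumptions: SQLite stands in for MySQL, the column types are guessed, and `cumulative_similarity` is a hypothetical helper. The query differs from the answer's in two ways: it multiplies by `1.0`, because SQLite would otherwise perform integer division, and it uses bound parameters in place of the literal week and profile.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp6games (gameID INTEGER, weekID INTEGER, visitor INTEGER, home INTEGER);
    CREATE TABLE sp6picks (profileID INTEGER, gameID INTEGER, pick INTEGER);
    INSERT INTO sp6games VALUES (1, 2, 9, 21), (2, 2, 14, 6), (17, 3, 6, 9), (18, 3, 30, 21);
    INSERT INTO sp6picks VALUES
        (52, 1, 21), (52, 2, 6), (52, 17, 12), (52, 18, 21),
        (53, 1, 9), (53, 2, 6), (53, 17, 9), (53, 18, 21);
""")


def cumulative_similarity(conn, profile_id, week_id):
    """Similarity of profile_id to every other profile, up to and including week_id."""
    # Same joins and filters as the answer's query; "* 1.0" is a SQLite-only
    # adjustment to avoid integer division.
    return conn.execute(
        """
        SELECT t.profileID AS target,
               SUM(s.pick = t.pick) * 1.0 / COUNT(*) AS similarity
        FROM sp6picks s
             JOIN sp6picks t USING (gameID)
             JOIN sp6games g USING (gameID)
        WHERE g.weekID <= ?
          AND s.profileID != t.profileID
          AND s.profileID = ?
        GROUP BY t.profileID
        """,
        (week_id, profile_id),
    ).fetchall()


print(cumulative_similarity(conn, 52, 3))  # [(53, 0.5)]
```

Because the self-join is on gameID, each pick is compared only with the other player's pick for the same game, so the cumulative result for the sample data is 0.5.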

Wow, thank you. It looks so simple. It really stings when the solution takes less time than the question! – Itlan

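How would I modify this to get results for every player against every other player for a single week? Removing 'AND s.profileID = 52' only returns one result per person. – Itlan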

@Itlan: You need to group by both players: 'SELECT s.profileID AS source, t.profileID AS target, SUM(s.pick = t.pick) / COUNT(*) AS similarity FROM sp6picks s JOIN sp6picks t USING (gameID) JOIN sp6games g USING (gameID) WHERE g.weekID = 3 AND s.profileID != t.profileID GROUP BY s.profileID, t.profileID'. Note that if you don't want or need the same result to appear in both directions (e.g. for 2 vs 1 and 1 vs 2), you can change the inequality filter from 's.profileID != t.profileID' to 's.profileID < t.profileID'. – eggyal
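
The effect of grouping by both players can be seen on the sample rows with a short Python/`sqlite3` sketch. As assumptions, SQLite stands in for MySQL (hence the `* 1.0` to avoid integer division), the `ORDER BY` is added only to make the output order predictable, and `<` is used as the one-direction filter. Any strict ordering on profileID would serve. `PAIRWISE`, `both_directions`, and `one_direction` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp6games (gameID INTEGER, weekID INTEGER, visitor INTEGER, home INTEGER);
    CREATE TABLE sp6picks (profileID INTEGER, gameID INTEGER, pick INTEGER);
    INSERT INTO sp6games VALUES (1, 2, 9, 21), (2, 2, 14, 6), (17, 3, 6, 9), (18, 3, 30, 21);
    INSERT INTO sp6picks VALUES
        (52, 1, 21), (52, 2, 6), (52, 17, 12), (52, 18, 21),
        (53, 1, 9), (53, 2, 6), (53, 17, 9), (53, 18, 21);
""")

# {op} is filled with a fixed comparison operator below, never with user input.
PAIRWISE = """
    SELECT s.profileID AS source,
           t.profileID AS target,
           SUM(s.pick = t.pick) * 1.0 / COUNT(*) AS similarity
    FROM sp6picks s
         JOIN sp6picks t USING (gameID)
         JOIN sp6games g USING (gameID)
    WHERE g.weekID = ?
      AND s.profileID {op} t.profileID
    GROUP BY s.profileID, t.profileID
    ORDER BY s.profileID, t.profileID
"""

both_directions = conn.execute(PAIRWISE.format(op="!="), (3,)).fetchall()
one_direction = conn.execute(PAIRWISE.format(op="<"), (3,)).fetchall()

print(both_directions)  # [(52, 53, 0.5), (53, 52, 0.5)]
print(one_direction)    # [(52, 53, 0.5)]
```

With `!=` every pair is reported twice, once per direction; with the ordering comparison each pair appears once.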