2010-06-17

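How do I find the top N batters per year?

I'm playing around with the Lahman Baseball Database in a MySQL instance. I want to find the players who hit the most home runs (HR) in each year. The Batting table has the following schema (relevant parts only):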

+----------+----------------------+------+-----+---------+-------+
| Field    | Type                 | Null | Key | Default | Extra |
+----------+----------------------+------+-----+---------+-------+
| playerID | varchar(9)           | NO   | PRI |         |       |
| yearID   | smallint(4) unsigned | NO   | PRI | 0       |       |
| HR       | smallint(3) unsigned | YES  |     | NULL    |       |
+----------+----------------------+------+-----+---------+-------+

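For each year, every player has one entry (between several hundred and 12K per year, going back to 1871). Getting the top N hitters for a single year is easy: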

SELECT playerID,yearID,HR 
FROM Batting 
WHERE yearID=2009 
ORDER BY HR DESC LIMIT 3; 
+-----------+--------+------+ 
| playerID | yearID | HR | 
+-----------+--------+------+ 
| pujolal01 | 2009 | 47 | 
| fieldpr01 | 2009 | 46 | 
| howarry01 | 2009 | 45 | 
+-----------+--------+------+ 

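But what I'm interested in is finding the top 3 for every year. I've found solutions like this, which describe how to select the top N rows per category, and I've tried applying them to my problem, only to end up with a query that never returns: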

SELECT 
    b.yearID, b.playerID, b.HR 
FROM 
    Batting AS b 
LEFT JOIN 
    Batting b2 
    ON 
    (b.yearID=b2.yearID AND b.HR <= b2.HR) 
GROUP BY b.yearID HAVING COUNT(*) <= 3; 

Where am I going wrong?

+3

You need to sort by PED use first (descending). – MusiGenesis 2010-06-17 16:50:42

Answers

3

Something like this should work:

SELECT b.playerID, b.yearID, b.HR 
FROM Batting b 
WHERE HR >= (
    SELECT b2.HR 
    FROM Batting b2 
    WHERE b2.yearID=b.yearID 
    ORDER BY b2.HR DESC 
    LIMIT 2, 1 
) 
ORDER BY b.yearID DESC, b.HR DESC; 

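Explanation: select every row whose home run count is >= the third-highest count for that year. This does not break ties, so if more than one batter has the same number of home runs, they will all show up.

The results are ordered from the most recent year, and by rank within each year.

Note: LIMIT takes a 0-based offset, so 2, 1 means skip the first two rows and grab one row, i.e. the third row.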
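To make the offset form concrete, here is a minimal sketch against the same Batting table, using 2009 as in the question. Both statements should return the same value, since MySQL accepts either spelling of "skip two rows, return one":

```sql
-- Third-highest HR value for 2009: skip 2 rows, then return 1.
SELECT HR
FROM Batting
WHERE yearID = 2009
ORDER BY HR DESC
LIMIT 2, 1;          -- LIMIT offset, row_count

-- Same result with the arguments in the opposite order.
SELECT HR
FROM Batting
WHERE yearID = 2009
ORDER BY HR DESC
LIMIT 1 OFFSET 2;    -- LIMIT row_count OFFSET offset
```

If several players are tied, which of the tied rows lands in third position is unspecified, but the HR value returned is the same either way, and that value is all the outer query compares against.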

+0

+1 for the LIMIT explanation. – 2010-06-17 17:10:57

+0

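The LIMIT parameters are actually the other way around: offset first, then row count. Also, you have an error in the subquery: 'b1' should just be 'b'. Other than that, this is correct. It took 4:18 to find the results *just* since 2005 (MacBook Pro, OS X 10.6.3, Core 2 2.5 GHz, enough RAM to hold all the data in memory), so some optimization may be in order. – 2010-06-17 17:29:53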

+0

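Thanks for pointing out the LIMIT details. I've edited my answer. Is the query cache actually set large enough to hold everything in memory? The reason it is so slow is that it executes a fairly expensive subquery for every row in Batting. A possible optimization is to add an index on yearID and another on HR. A further optimization would be to build a temporary table that holds the third-highest HR for each year, and compare against that. – 2010-06-17 17:43:36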
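The two optimizations suggested in the comment above could look roughly like this. It is a sketch, not a measured fix: the index name `idx_batting_year_hr`, the table name `ThirdHR`, and the choice of one composite index (rather than the two separate indexes the comment mentions) are assumptions made for illustration.

```sql
-- Assumed composite index: lets the per-year lookup read HR values
-- for one yearID without scanning the whole table.
CREATE INDEX idx_batting_year_hr ON Batting (yearID, HR);

-- One row per year holding that year's third-highest HR value.
-- The subquery yields NULL for a year with fewer than three rows.
CREATE TEMPORARY TABLE ThirdHR AS
SELECT y.yearID,
       (SELECT b2.HR
        FROM Batting b2
        WHERE b2.yearID = y.yearID
        ORDER BY b2.HR DESC
        LIMIT 2, 1) AS HR
FROM (SELECT DISTINCT yearID FROM Batting) AS y;

-- Compare each batting row against the precomputed threshold.
-- Years with a NULL threshold drop out of the join.
SELECT b.playerID, b.yearID, b.HR
FROM Batting b
JOIN ThirdHR t
  ON t.yearID = b.yearID
 AND b.HR >= t.HR
ORDER BY b.yearID DESC, b.HR DESC;
```

The subquery now runs once per year instead of once per row in Batting, which is the cost the comment identifies. Like the original answer, this keeps ties.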

0

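Wow, random. I happened to be running the same kind of query (for salaries) against the Lahman baseball database, using an article on emulating Oracle analytic functions. This version of the query is fast, but not as intuitive.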

select * 
from (

select 
    b.yearID as year, 
    b.teamID as team, 
    m.nameFirst as first, 
    m.nameLast as last, 
    find_in_set(b.HR, x.teamRank) as rank, 
    b.HR as HR 


from 
    Batting b 
    inner join Master m on m.playerID = b.playerID 
    inner join (select yearID, group_concat(distinct HR order by HR desc) as teamRank from Batting group by yearID) x on x.yearID = b.yearID 

) x 

where 
    rank <= 10 and rank > 0 

order by  
    year desc, rank 

Or the top 5 HR totals per team for 2010...

select * 
from (

select 
    b.yearID as year, 
    b.teamID as team, 
    m.nameFirst as first, 
    m.nameLast as last, 
    b.HR as HR, 
    find_in_set(b.HR, x.teamRank) as rank 

from 
    Batting b 
    inner join Master m on m.playerID = b.playerID 
    inner join (select teamID, group_concat(distinct HR order by HR desc) as teamRank from Batting where yearID = 2010 group by teamID) x on x.teamID = b.teamID 
where 
    b.yearID = 2010 
) x 

where 
    rank <= 5 and rank > 0 

order by  
    team, rank 

limit 12 

which shows these results...

year  team  first   last       HR  rank
2010  ARI   Mark    Reynolds   32  1
2010  ARI   Chris   Young      27  2
2010  ARI   Kelly   Johnson    26  3
2010  ARI   Adam    LaRoche    25  4
2010  ARI   Justin  Upton      17  5
2010  ATL   Brian   McCann     21  1
2010  ATL   Jason   Heyward    18  2
2010  ATL   Troy    Glaus      16  3
2010  ATL   Martin  Prado      15  4
2010  ATL   Eric    Hinske     11  5
2010  BAL   Luke    Scott      27  1
2010  BAL   Ty      Wigginton  22  2
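
The ranking trick in both queries rests on two MySQL functions: `GROUP_CONCAT` builds a comma-separated list of the distinct HR values in descending order, and `FIND_IN_SET` returns the 1-based position of a value in such a list, or 0 if it is absent. A minimal illustration, using the literal values from the 2009 output in the question:

```sql
-- The list '47,46,45' stands in for what GROUP_CONCAT would build.
SELECT FIND_IN_SET('47', '47,46,45') AS first_place,   -- 1
       FIND_IN_SET('45', '47,46,45') AS third_place,   -- 3
       FIND_IN_SET('10', '47,46,45') AS not_listed;    -- 0
```

Because the list holds distinct values, tied players share a rank and the next distinct value gets the next number. Rows with a NULL HR get a NULL rank and so fail the `rank > 0` filter. Two caveats: `GROUP_CONCAT` output is truncated at `group_concat_max_len` (1024 by default), and `rank` is a reserved word as of MySQL 8.0, where the alias would need backticks or a different name.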