2014-02-15

Selecting monthly data counts from two tables in MySQL

I have two tables, lastfm_scrobbles and lastfm_annotations. Sample data:

mysql> select * from lastfm_scrobbles limit 5; 
+---------+---------+-----------+---------------------+ 
| user_id | item_id | artist_id | scrobble_time  | 
+---------+---------+-----------+---------------------+ 
| 1469 | 45651 |   1 | 2010-06-30 13:57:42 | 
| 1469 | 45651 |   1 | 2011-03-28 15:43:37 | 
| 6872 | 45653 |   1 | 2013-08-03 15:07:44 | 
| 7044 | 1370 |   1 | 2007-03-26 17:07:26 | 
| 7044 | 1370 |   1 | 2007-08-24 18:41:35 | 
+---------+---------+-----------+---------------------+ 

mysql> select * from lastfm_annotations limit 5; 
+---------+---------+-----------+--------+------------+ 
| user_id | item_id | artist_id | tag_id | tag_month | 
+---------+---------+-----------+--------+------------+ 
|  121 | 1330412 | 1330412 | 475 | 2006-12-01 | 
|  121 | 1330412 | 1330412 | 517 | 2006-12-01 | 
|  121 | 1330412 | 1330412 | 7280 | 2006-12-01 | 
|  121 | 1330412 | 1330412 | 21384 | 2006-12-01 | 
|  121 | 1330412 | 1330412 | 27872 | 2006-12-01 | 
+---------+---------+-----------+--------+------------+ 

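In addition, I have a user information table (lastfm_users). Its details are not important; what is relevant is that the query:

select user_id from lastfm_users where scrobbles_recorded = 1;

returns the users I care about for the purposes of this question.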

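OK, with that preamble: I need a query that gives me, for each of those users, the total number of entries they have in the scrobbles table and in the annotations table for each month. In other words, the result should look like this: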

user_id y  m  scrobble_count anno_count 
123  2006 3  100    50 
456  2008 11  321    10 
... and so on 

Does that make sense? I believe the query I want is some combination of the following:

select user_id, year(tag_month) as y, month(tag_month) as m, count(*) as anno_count 
    from lastfm_annotations where user_id in (select user_id from 
     lastfm_users where scrobbles_recorded=1) 
    group by user_id, year(tag_month), month(tag_month); 


select user_id, year(scrobble_time) as y, month(scrobble_time) as m, count(*) as scrobble_count 
    from lastfm_scrobbles where user_id in (select user_id from 
     lastfm_users where scrobbles_recorded=1) 
    group by user_id, year(scrobble_time), month(scrobble_time); 

But I'm not sure of the right way to build a combined query that gives me the result I want. Suggestions?

Answers

You can try

select user_id, y, m,
       coalesce(sum(case when source = 1 then total end), 0) anno_count,
       coalesce(sum(case when source = 2 then total end), 0) scrobble_count
from
(
    -- source 1: per-user monthly annotation counts
    select 1 source, a.user_id, year(tag_month) y, month(tag_month) m, count(*) total
    from lastfm_annotations a join lastfm_users u
      on a.user_id = u.user_id
    where u.scrobbles_recorded = 1
    group by a.user_id, year(tag_month), month(tag_month)
    union all
    -- source 2: per-user monthly scrobble counts
    select 2 source, s.user_id, year(scrobble_time), month(scrobble_time), count(*)
    from lastfm_scrobbles s join lastfm_users u
      on s.user_id = u.user_id
    where u.scrobbles_recorded = 1
    group by s.user_id, year(scrobble_time), month(scrobble_time)
) q
group by user_id, y, m;

or simply

select user_id, y, m, 
     sum(case when source = 1 then 1 else 0 end) anno_count, 
     sum(case when source = 2 then 1 else 0 end) scrobble_count 
    from 
(
    select 1 source, a.user_id, year(tag_month) y, month(tag_month) m 
    from lastfm_annotations a join lastfm_users u 
     on a.user_id = u.user_id 
    where u.scrobbles_recorded = 1 
    union all 
    select 2 source, s.user_id, year(scrobble_time), month(scrobble_time) 
    from lastfm_scrobbles s join lastfm_users u 
     on s.user_id = u.user_id 
    where u.scrobbles_recorded = 1 
) q 
group by user_id, y, m; 

Here is an SQLFiddle demo.

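Thanks! I'm running them now to verify that they work on my data, and I'll accept once that's done. In the meantime, could you elaborate on the two approaches? Which one is preferred, and why? – moustachio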

Four hours in, the query (the second version) is still running... I'm wondering whether doing this iteratively might be faster at this point. – moustachio

The most likely reason for this is under-indexing. Run 'EXPLAIN ' – peterm
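
To make the under-indexing suggestion concrete, here is a minimal sketch of what such a check might look like. It assumes that no indexes exist yet on the columns used for joining and grouping; the index names are made up for illustration, and whether MySQL actually picks these indexes depends on the data and the server version.

```sql
-- Hypothetical index names; the columns come from the tables in the question.
-- Assumes equivalent indexes do not already exist.
create index idx_users_recorded on lastfm_users (scrobbles_recorded, user_id);
create index idx_annotations_user_month on lastfm_annotations (user_id, tag_month);
create index idx_scrobbles_user_time on lastfm_scrobbles (user_id, scrobble_time);

-- Inspect the plan for one branch of the union before running the full query.
explain
select s.user_id, year(scrobble_time) y, month(scrobble_time) m, count(*) total
from lastfm_scrobbles s join lastfm_users u
  on s.user_id = u.user_id
where u.scrobbles_recorded = 1
group by s.user_id, year(scrobble_time), month(scrobble_time);
```

In the EXPLAIN output, a `type` of ALL together with a large `rows` estimate would suggest a full table scan, and the `key` column shows which index, if any, was chosen for each table.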