2015-06-02 182 views
3

Optimizing a SQL query with joins

I have a table named errors with the following structure:

errors

| id | UserID | CrashDump | ErrorCode | Timestamp           |
| 1  | user1  | Crash 1   | 100       | 2015-04-08 21:00:00 |
| 2  | user2  | Crash 2   | 102       | 2015-04-10 22:00:00 |
| 3  | user3  | Crash 4   | 105       | 2015-05-08 12:00:00 |
| 4  | user4  | Crash 4   | 105       | 2015-06-02 21:22:00 |
| 5  | user4  | Crash 4   | 105       | 2015-06-03 04:16:00 |

I would like to get the following result set:

Expected result set

| CrashDump | Error Count | Affected Users |
| Crash 4   | 3           | 2              |
| Crash 2   | 1           | 1              |
| Crash 1   | 1           | 1              |

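For each crash dump, the result set holds the number of errors as Error Count and the number of affected users (the distinct users who received that error) as Affected Users.

I have been able to get the desired result with the query below, but it has proven to be very resource-intensive, and MySQL crashes on huge data sets. Could you guide me on how to optimize my current query, or point me to a better way of implementing its logic? Any help would be greatly appreciated.

Current query: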

select B.CrashDump as CrashDump, B.B_UID as AffectedUsers, C.C_UID as ErrorCount 
from 
(
    Select count(A.UserID) as B_UID, A.CrashDump, (A.timestamp) as timestmp, 
    (A.errorcode) as errorCde, (A.ID) as uniqueId 
    from 
    ( 
     select UserID , CrashDump, timestamp,errorcode,id 
     from errors 
     where Timestamp >='2015-04-08 21:00:00' and Timestamp <='2015-06-10 08:18:15' 
     group by userID,CrashDump 
    ) as A 
    group by A.CrashDump 
) as B 

left outer join 
(
    select CrashDump , count(UserID) as C_UID 
    from errors 
    where Timestamp >='2015-04-08 21:00:00' and Timestamp <='2015-06-10 08:18:15' 
    group by CrashDump 
) as C 

On B.CrashDump = C.CrashDump 

order by ErrorCount desc limit 0,10 

Your problem is a classic one that is solved using `GROUP BY` and the [`GROUP BY` aggregate functions](http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html). This [answer](http://stackoverflow.com/a/30591063/4265352) shows you the solution. – axiac

Answers

1

This is the working solution:

Select A.CrashDump, sum(A.ErrorCount) as ErrorC, count(A.AffectedUsers) 
From 
(
SELECT 
    CrashDump, 
    COUNT(ErrorCode) AS ErrorCount, 
    COUNT(DISTINCT UserID) AS AffectedUsers, UserID 
FROM 
    errors 
WHERE 
    Timestamp >='2015-05-13 10:00:00' and Timestamp <='2015-05-14 03:07:00' 

GROUP BY 
    CrashDump, userID 
) AS A 
group by A.CrashDump 

order by ErrorC desc limit 0,10 

Thank you all for helping me achieve the desired result.

2

Can't you just do this?

SELECT 
    CrashDump, 
    COUNT(ErrorCode) AS ErrorCount, 
    COUNT(DISTINCT UserID) AS AffectedUsers 
FROM 
    Errors 
WHERE 
    Timestamp >='2015-04-08 21:00:00' and Timestamp <='2015-06-10 08:18:15' 
GROUP BY 
    CrashDump 

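I had already tried the query you suggested before posting. The problem with that implementation is that the query returns inconsistent data. Please check my final solution, which I arrived at after combining the queries. – Mubarak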

3

Try

SELECT CrashDump, COUNT(ErrorCode) AS ErrorCount, COUNT(DISTINCT UserID) AS AffectedUser 
FROM errors 
WHERE Timestamp >='2015-04-08 21:00:00' AND Timestamp <='2015-06-10 08:18:15' 
GROUP BY CrashDump 

Doing COUNT(DISTINCT UserID) counts each user only once, even if that user hit the same crash dump several times. – jarlh


Yes, you are right – tning
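
To make the difference between COUNT and COUNT(DISTINCT ...) concrete, here is a minimal sketch that loads the five sample rows from the question and runs the single-pass GROUP BY query from this answer. It assumes an in-memory SQLite database (via Python's `sqlite3`) as a stand-in for MySQL; the extra `UserRows` column and the tie-breaking `ORDER BY` are illustrative additions, not part of the answer above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE errors ("
    " id INTEGER PRIMARY KEY, UserID TEXT, CrashDump TEXT,"
    " ErrorCode INTEGER, Timestamp TEXT)"
)

# The five sample rows from the question.
conn.executemany(
    "INSERT INTO errors VALUES (?, ?, ?, ?, ?)",
    [
        (1, "user1", "Crash 1", 100, "2015-04-08 21:00:00"),
        (2, "user2", "Crash 2", 102, "2015-04-10 22:00:00"),
        (3, "user3", "Crash 4", 105, "2015-05-08 12:00:00"),
        (4, "user4", "Crash 4", 105, "2015-06-02 21:22:00"),
        (5, "user4", "Crash 4", 105, "2015-06-03 04:16:00"),
    ],
)

query = """
SELECT CrashDump,
       COUNT(ErrorCode)       AS ErrorCount,     -- one per error row
       COUNT(UserID)          AS UserRows,       -- repeats of a user included
       COUNT(DISTINCT UserID) AS AffectedUsers   -- each user once per crash dump
FROM errors
WHERE Timestamp >= ? AND Timestamp <= ?
GROUP BY CrashDump
ORDER BY ErrorCount DESC, CrashDump DESC
"""

rows = conn.execute(
    query, ("2015-04-08 21:00:00", "2015-06-10 08:18:15")
).fetchall()

for row in rows:
    print(row)
```

For Crash 4, `UserRows` is 3 because user4 appears twice, while `AffectedUsers` is 2. On this sample the output matches the expected result set from the question; how the query behaves on a large MySQL table is not something this sketch measures.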

1
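SELECT CrashDump, SUM(e) AS "Error Count", COUNT(*) AS "Affected Users" 
FROM(
-- one row per (crashdump, userid) pair, carrying that user's error count
SELECT crashdump, userid, count(errorcode) as e 
FROM errors 
WHERE Timestamp BETWEEN '2015-04-08 21:00:00' and '2015-06-10 08:18:15' 
GROUP BY crashdump, userid) a 
-- COUNT(*) counts the distinct users per crash dump; a MAX over the
-- per-user counts would return the largest per-user error count instead
GROUP BY crashdump 
ORDER BY crashdump DESC 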

Output

crashdump Error Count Affected Users 
Crash 4  3   2 
Crash 2  1   1 
Crash 1  1   1 

SQL Fiddle: http://sqlfiddle.com/#!9/13eab/1/0


Thank you, I used your query to derive the final solution. Cheers! – Mubarak