2011-09-21

LEFT JOIN -> aggregate function problem

I have four different tables in my database:

thread:

  • thread_id
  • thread_content
  • timestamp

thread_rating:

  • thread_rating_id
  • thread_id
  • liked
  • disliked

thread_report:

  • thread_report_id
  • thread_id

thread_impression:

  • thread_impression_id
  • thread_id

I want to join these tables with this SQL query:

SELECT t.thread_id, 
t.thread_content, 
SUM(tra.liked) AS liked, 
SUM(tra.disliked) AS disliked, 
t.timestamp, 
((100*(tra.liked + SUM(tra.liked)))/(tra.liked + SUM(tra.liked) + (tra.disliked + SUM(tra.disliked)))) AS liked_percent, 
((100*(COUNT(DISTINCT tre.thread_report_id))/((COUNT(DISTINCT ti.thread_impression_id))))) AS reported_percent 
FROM thread AS t 
LEFT JOIN thread_rating AS tra ON t.thread_id = tra.thread_id 
LEFT JOIN thread_report AS tre ON tra.thread_id = tre.thread_id 
LEFT JOIN thread_impression AS ti ON tre.thread_id = ti.thread_id 
GROUP BY t.thread_id 
ORDER BY liked_percent 

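The query should return every thread_id together with the summed liked and disliked values, the liked percentage, the timestamp (when the thread was inserted into the database), and the reported percentage (number of reports relative to the number of times the thread was shown to users)...

Nearly all of the results are correct. The only incorrect ones are liked and disliked.

If I add a COUNT(*) to the front of the query, I can see that the correct results have a count of 1, while the wrong ones sometimes have a count of up to 60. It looks like a cross join problem...

I think this is a problem with the grouping, or with the way I wrote the joins.

I have seen solutions that use subselects, but I don't think that is a good solution for this problem...

What am I doing wrong here?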

+0

Please state what the query should return. If possible, include a simple example. That will help others answer your question better. – Martijn

Answers

2

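The tra table has multiple records per thread_id. Once thread_report and thread_impression are joined in as well, each of those records is repeated for every matching report and impression row, so SUM counts it more than once.
Do the summing in a subquery instead, grouped by the join field.
That way only one row per thread_id from tra2 is joined, and the duplicate rows are avoided.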
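To see the double counting in isolation, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The rows are invented toy data and the joins go straight to t.thread_id for simplicity; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for the real server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thread (thread_id INTEGER PRIMARY KEY, thread_content TEXT);
CREATE TABLE thread_rating (thread_rating_id INTEGER PRIMARY KEY,
                            thread_id INTEGER, liked INTEGER, disliked INTEGER);
CREATE TABLE thread_report (thread_report_id INTEGER PRIMARY KEY, thread_id INTEGER);
CREATE TABLE thread_impression (thread_impression_id INTEGER PRIMARY KEY, thread_id INTEGER);

-- Toy data: one thread with 2 ratings, 2 reports and 3 impressions.
INSERT INTO thread VALUES (1, 'hello');
INSERT INTO thread_rating (thread_id, liked, disliked) VALUES (1, 1, 0), (1, 0, 1);
INSERT INTO thread_report (thread_id) VALUES (1), (1);
INSERT INTO thread_impression (thread_id) VALUES (1), (1), (1);
""")

# Joining all three child tables yields 2 * 2 * 3 = 12 rows for the thread,
# so every rating row is summed 6 times.
row_count, liked, disliked = conn.execute("""
SELECT COUNT(*), SUM(tra.liked), SUM(tra.disliked)
FROM thread AS t
LEFT JOIN thread_rating AS tra ON t.thread_id = tra.thread_id
LEFT JOIN thread_report AS tre ON t.thread_id = tre.thread_id
LEFT JOIN thread_impression AS ti ON t.thread_id = ti.thread_id
GROUP BY t.thread_id
""").fetchone()

print(row_count, liked, disliked)  # 12 6 6, although the true sums are 1 and 1
```

The COUNT(*) of 12 plays the same role as the counts of up to 60 mentioned in the question: it is the number of joined rows per thread, and each rating is summed once per combination of report and impression.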

SELECT t.thread_id,
    t.thread_content,
    tra2.liked,
    tra2.disliked,
    t.timestamp,
    tra2.liked_percent,
    ((100*(COUNT(DISTINCT tre.thread_report_id))/((COUNT(DISTINCT ti.thread_impression_id))))) AS reported_percent
FROM thread AS t
LEFT JOIN (
    -- Pre-aggregate the ratings so there is exactly one row per thread_id.
    -- The liked_percent formula is kept from the question; its bare tra.liked
    -- relies on a permissive GROUP BY mode (e.g. MySQL without ONLY_FULL_GROUP_BY).
    SELECT
     tra.thread_id
     , SUM(tra.liked) AS liked
     , SUM(tra.disliked) AS disliked
     , ((100*(tra.liked + SUM(tra.liked)))/(tra.liked + SUM(tra.liked) + (tra.disliked + SUM(tra.disliked)))) AS liked_percent
    FROM thread_rating AS tra
    GROUP BY tra.thread_id
) AS tra2 ON t.thread_id = tra2.thread_id
-- tra is not visible outside the subquery, so the remaining joins use t.thread_id
LEFT JOIN thread_report AS tre ON t.thread_id = tre.thread_id
LEFT JOIN thread_impression AS ti ON t.thread_id = ti.thread_id
GROUP BY t.thread_id
ORDER BY liked_percent DESC
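
As a sanity check on the pre-aggregation idea, the sketch below runs a trimmed-down version of such a query against toy data in SQLite, through Python's sqlite3 module. The data is made up, the percentage columns are left out to keep it short, and SQLite is only an assumption standing in for the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thread (thread_id INTEGER PRIMARY KEY);
CREATE TABLE thread_rating (thread_id INTEGER, liked INTEGER, disliked INTEGER);
CREATE TABLE thread_report (thread_report_id INTEGER PRIMARY KEY, thread_id INTEGER);
CREATE TABLE thread_impression (thread_impression_id INTEGER PRIMARY KEY, thread_id INTEGER);

-- Toy data: thread 1 has 3 ratings, 2 reports, 4 impressions;
-- thread 2 has a single impression and nothing else.
INSERT INTO thread VALUES (1), (2);
INSERT INTO thread_rating VALUES (1, 1, 0), (1, 1, 0), (1, 0, 1);
INSERT INTO thread_report (thread_id) VALUES (1), (1);
INSERT INTO thread_impression (thread_id) VALUES (1), (1), (1), (1), (2);
""")

# Ratings are summed inside the derived table, so the later joins can repeat
# the tra2 row but can no longer inflate the sums.
rows = conn.execute("""
SELECT t.thread_id,
       tra2.liked,
       tra2.disliked,
       COUNT(DISTINCT tre.thread_report_id) AS reports,
       COUNT(DISTINCT ti.thread_impression_id) AS impressions
FROM thread AS t
LEFT JOIN (
    SELECT thread_id, SUM(liked) AS liked, SUM(disliked) AS disliked
    FROM thread_rating
    GROUP BY thread_id
) AS tra2 ON t.thread_id = tra2.thread_id
LEFT JOIN thread_report AS tre ON t.thread_id = tre.thread_id
LEFT JOIN thread_impression AS ti ON t.thread_id = ti.thread_id
GROUP BY t.thread_id
ORDER BY t.thread_id
""").fetchall()

for row in rows:
    print(row)
# (1, 2, 1, 2, 4)
# (2, None, None, 0, 1)
```

Thread 2 shows why LEFT JOIN matters here: a thread with no ratings still appears, with NULL (None in Python) for liked and disliked. Wrapping those columns in COALESCE(..., 0) would be one way to turn them into zeros if that is preferred.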
+0

Thank you very much! It works! – pmuens

+0

Then you should accept Johan's answer. – ddevienne