2015-07-13 35 views
1

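Why is this vote/post ratio always 1?

I'm working with SEDE to create a graph of the vote-to-post ratio. After getting rid of all the actual errors, I'm facing a new problem: for some reason, the ratio is always 1. This is the current SQL: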

SELECT CAST(p.CreationDate AS DATE) AS [CreationDate], 
     COUNT(CAST(v.CreationDate AS DATE))/COUNT(CAST(p.CreationDate AS DATE)) 
     AS [Ratio] 
FROM Posts p 
INNER JOIN Votes v ON v.PostId = p.Id 
WHERE v.VoteTypeId = ##VoteType:int?2## AND 
     p.PostTypeId = 1 OR p.PostTypeId = 2 
GROUP BY CAST(p.CreationDate AS DATE) 
ORDER BY Ratio 

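The query itself can be found here.

It was suggested in chat that this may be because joining the tables produces all possible combinations, so the number of votes and the number of posts are always the same (hence n/n = 1). Is this correct, and if so, what should I do about it?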

+2

Because your counts are grouped on p.CreationDate –

+0

@JoeTaras ...so what should I group on? – ArtOfCode

+0

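What do you get when you select the two sets, each filtered independently? For a few sample PostId values, this should make it very clear why your `JOIN` isn't working as expected. –

Answers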

2

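Since every row produced by the inner join has both a post side and a vote side, COUNT(CAST(v.CreationDate AS DATE)) and COUNT(CAST(p.CreationDate AS DATE)) will return exactly the same number, namely the number of rows in the group.*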
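
To see this on a small case, here is a minimal sketch using hypothetical toy data (the `ToyPosts` and `ToyVotes` names and their rows are invented for illustration and are not part of SEDE). Two posts created on the same day receive three votes in total, and both COUNT expressions come out equal to the number of joined rows:

```sql
-- Hypothetical toy data: 2 posts created on the same day, 3 votes in total.
WITH ToyPosts AS (
    SELECT * FROM (VALUES
        (1, CAST('2015-07-13' AS DATE)),
        (2, CAST('2015-07-13' AS DATE))
    ) AS t(Id, CreationDate)
),
ToyVotes AS (
    SELECT * FROM (VALUES
        (10, 1, CAST('2015-07-13' AS DATE)),
        (11, 1, CAST('2015-07-14' AS DATE)),
        (12, 2, CAST('2015-07-14' AS DATE))
    ) AS t(Id, PostId, CreationDate)
)
SELECT p.CreationDate,
       COUNT(v.CreationDate) AS VoteCount,  -- counts joined rows: 3
       COUNT(p.CreationDate) AS PostCount,  -- counts the same joined rows: 3
       COUNT(*)              AS JoinedRows  -- 3
FROM ToyPosts p
INNER JOIN ToyVotes v ON v.PostId = p.Id
GROUP BY p.CreationDate;
```

The join yields three rows (post 1 with votes 10 and 11, post 2 with vote 12), so any COUNT over a non-null column in that group returns 3, and the ratio is 3/3 = 1.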

If you want to count how many new votes you have per new post on a given date, use COUNT(DISTINCT):

SELECT CAST(p.CreationDate AS DATE) AS [CreationDate], 
     -- 1.0 * forces decimal division; INT / INT would truncate in T-SQL
     1.0 * COUNT(DISTINCT v.Id) / COUNT(DISTINCT p.Id) AS [Ratio] 
FROM Posts p 
INNER JOIN Votes v ON v.PostId = p.Id 
WHERE v.VoteTypeId = ##VoteType:int?2## AND 
     -- IN applies the vote-type filter to both post types (AND binds tighter than OR)
     p.PostTypeId IN (1, 2) 
GROUP BY CAST(p.CreationDate AS DATE) 
ORDER BY Ratio 
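
As a quick sanity check of the COUNT(DISTINCT) approach, the sketch below runs it against hypothetical toy data (`ToyPosts`, `ToyVotes`, and their rows are invented for illustration, not SEDE tables). It also shows one thing to watch for, assuming SQL Server semantics: dividing two integer counts truncates, so the sketch multiplies by 1.0 to get a fractional ratio.

```sql
-- Hypothetical toy data: 2 posts created on the same day, 3 votes in total.
WITH ToyPosts AS (
    SELECT * FROM (VALUES
        (1, CAST('2015-07-13' AS DATE)),
        (2, CAST('2015-07-13' AS DATE))
    ) AS t(Id, CreationDate)
),
ToyVotes AS (
    SELECT * FROM (VALUES
        (10, 1),
        (11, 1),
        (12, 2)
    ) AS t(Id, PostId)
)
SELECT p.CreationDate,
       COUNT(DISTINCT v.Id) AS Votes,                                -- 3 distinct votes
       COUNT(DISTINCT p.Id) AS Posts,                                -- 2 distinct posts
       COUNT(DISTINCT v.Id) / COUNT(DISTINCT p.Id) AS IntegerRatio,  -- 3 / 2 truncates to 1
       1.0 * COUNT(DISTINCT v.Id) / COUNT(DISTINCT p.Id) AS Ratio    -- 1.5
FROM ToyPosts p
INNER JOIN ToyVotes v ON v.PostId = p.Id
GROUP BY p.CreationDate;
```

Because the posts are counted by distinct Id, post 1 is counted once even though it appears in two joined rows.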

*Assuming CreationDate cannot be NULL.