2013-08-04 28 views
-2

SQL query for chi-square test

I am trying to compute a chi-square test on the following data set, which is stored in a table. I thought this query would calculate it:

SELECT sessionnumber, sessioncount, timespent, 
(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected 
FROM (SELECT sessionnumber, SUM(cast(cnt as bigint)) as cnt 
FROM d3 
GROUP BY sessionnumber) dim1 CROSS JOIN 
(SELECT sessioncount, SUM(cast(cnt as bigint)) as cnt 
FROM d3 
GROUP BY sessioncount) dim2 CROSS JOIN 
(SELECT timespent, SUM(cast(cnt as bigint)) as cnt 
FROM d3 
GROUP BY timespent) dim3 CROSS JOIN 
(SELECT SUM(cast(cnt as bigint)) as cnt FROM d3) dimall 

The sample data is:

sessionnumber sessioncount timespent  cnt 
1     17    28   45 
2     22    8   30 
3     1    1   2 
4     1    1   2 
5     8    111   119 
6     8    65   73 
7     11    5   16 
8     1    1   2 
9     62    64   126 
10     6    42   48 

But it gives me the wrong values for the chi-square test. The output it produces is:

sessionnumber sessioncount timespent expected 
1     23    1   0 
2     23    1   0 
3     23    1   0 
4     23    1   0 
5     23    1   0 
6     23    1   0 
7     23    1   0 
8     23    1   0 
9     23    1   0 
10     23    1   0 

I have tried my best and searched a lot about this problem. Please help me solve it. Thanks in advance!

Answers

2

This is integer math. Cast dimall.cnt to decimal or numeric, or change the divisor to the following:

/((dimall.cnt * 1.00) * (dimall.cnt * 1.00))

Another example to show exactly what is happening:

select 3/2 -- output = 1, integer math, result is an integer 

select 3/2.00 -- output = 1.50 
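
To see why every row in the question comes out as 0, it may help to plug in the sample numbers. From the sample data, the grand total of cnt is 463, and the three marginal totals for the first row (sessionnumber 1, sessioncount 17, timespent 28) are each 45. The sketch below runs the arithmetic through Python's built-in sqlite3 module. SQLite is only a stand-in chosen for this illustration, on the assumption that the engine in the question truncates integer division the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# All operands are integers, so the division truncates: 91125 / 214369 -> 0.
int_result = conn.execute(
    "SELECT (45 * 45 * 45) / (463 * 463)"
).fetchone()[0]

# Multiplying by 1.0 turns the divisor into a real number, so the
# fractional part of the quotient is kept.
real_result = conn.execute(
    "SELECT (45 * 45 * 45) / ((463 * 1.0) * (463 * 1.0))"
).fetchone()[0]

print(int_result)   # 0
print(real_result)  # about 0.425
conn.close()
```

Note that the whole divisor is wrapped in one pair of parentheses. Without them, the expression would divide by the first factor and then multiply by the second.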

So how should I fix it? –


I gave you the answer: cast dimall.cnt to decimal or numeric, or multiply each column by 1.00, as shown in my answer – SQLMenace


OK, but it still gives me the wrong output. How can I check that the output is 100% correct? –

2

Since you are already doing a cast in your calculation, you could also cast to float instead of bigint:

SELECT sessionnumber, sessioncount, timespent, 
(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected 
FROM (SELECT sessionnumber, SUM(cast(cnt as float)) as cnt 
FROM d3 
GROUP BY sessionnumber) dim1 CROSS JOIN 
(SELECT sessioncount, SUM(cast(cnt as float)) as cnt 
FROM d3 
GROUP BY sessioncount) dim2 CROSS JOIN 
(SELECT timespent, SUM(cast(cnt as float)) as cnt 
FROM d3 
GROUP BY timespent) dim3 CROSS JOIN 
(SELECT SUM(cast(cnt as float)) as cnt FROM d3) dimall; 

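float has about 15 to 16 significant digits of precision, so it should be enough to count any reasonable number of objects in the known universe.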
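
On the question of how to check the output, one possible sanity check (a sketch, not a proof of correctness) is that the expected counts over all cells of the cross join should add up to the grand total of cnt, because the sum of (n_i * n_j * n_k) / N^2 over every i, j, k is N^3 / N^2 = N. The example below loads the sample data into an in-memory SQLite database through Python's sqlite3 module and runs the same query shape. SQLite and its REAL type are stand-ins chosen for this illustration, not the engine from the question.

```python
import sqlite3

# Sample rows from the question: (sessionnumber, sessioncount, timespent, cnt)
rows = [
    (1, 17, 28, 45), (2, 22, 8, 30), (3, 1, 1, 2), (4, 1, 1, 2),
    (5, 8, 111, 119), (6, 8, 65, 73), (7, 11, 5, 16), (8, 1, 1, 2),
    (9, 62, 64, 126), (10, 6, 42, 48),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE d3 (sessionnumber INT, sessioncount INT, timespent INT, cnt INT)"
)
conn.executemany("INSERT INTO d3 VALUES (?, ?, ?, ?)", rows)

# Same structure as the query above, with REAL standing in for float.
query = """
SELECT dim1.sessionnumber, dim2.sessioncount, dim3.timespent,
       (dim1.cnt * dim2.cnt * dim3.cnt) / (dimall.cnt * dimall.cnt) AS expected
FROM (SELECT sessionnumber, SUM(CAST(cnt AS REAL)) AS cnt
      FROM d3 GROUP BY sessionnumber) dim1
CROSS JOIN
     (SELECT sessioncount, SUM(CAST(cnt AS REAL)) AS cnt
      FROM d3 GROUP BY sessioncount) dim2
CROSS JOIN
     (SELECT timespent, SUM(CAST(cnt AS REAL)) AS cnt
      FROM d3 GROUP BY timespent) dim3
CROSS JOIN
     (SELECT SUM(CAST(cnt AS REAL)) AS cnt FROM d3) dimall
"""
cells = conn.execute(query).fetchall()
conn.close()

grand_total = sum(r[3] for r in rows)      # total of cnt in the sample data
expected_total = sum(c[3] for c in cells)  # total of all expected counts

# The two totals should agree up to floating-point rounding.
print(len(cells), grand_total, round(expected_total, 6))
```

The cross join returns one row per combination of distinct sessionnumber, sessioncount and timespent values rather than one row per input row, so the result has more rows than the source table.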