Hive: multiple counts (with and without DISTINCT) produce bad output
I have this Hive query:
Select id,count(distinct CASE WHEN unix_timestamp(m_date) BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' as date),60) as date)) AND unix_timestamp(cast('2017-02-01' as date)) THEN m_date ELSE 0 END)
,count(CASE WHEN unix_timestamp(m_date) BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' as date),60) as date)) AND unix_timestamp(cast('2017-02-01' as date)) THEN m_date ELSE 0 END)
From DB.TABLE2 GROUP BY id limit 10;
which gives me something like:
111007001007633 1 1
111007001029793 1 1
111007001000521 1 11
111007001000794 1 1
111007001000273 3 13
111007001001032 1 1
111007001025874 1 4
111007001001792 1 7
111007001029181 1 1
111007001000141 16 96
But when I add more counts:
Select id,count(distinct CASE WHEN unix_timestamp(m_date) BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' as date),60) as date)) AND unix_timestamp(cast('2017-02-01' as date)) THEN m_date ELSE 0 END)
,count(CASE WHEN unix_timestamp(m_date) BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' as date),60) as date)) AND unix_timestamp(cast('2017-02-01' as date)) THEN m_date ELSE 0 END)
,count(distinct CASE WHEN unix_timestamp(m_date) BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' as date),15) as date)) AND unix_timestamp(cast('2017-02-01' as date)) THEN m_date ELSE 0 END)
,count(CASE WHEN unix_timestamp(m_date) BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' as date),15) as date)) AND unix_timestamp(cast('2017-02-01' as date)) THEN m_date ELSE 0 END)
From DB.TABLE2 GROUP BY id limit 10;
it returns something like this:
111007001010439 0 0 1 0
111007001026963 0 0 1 0
111007001028001 0 0 1 0
111007001032987 0 0 1 0
111007001048710 0 0 1 0
111007001052415 0 0 1 0
111007002008374 0 0 1 0
111007003000644 0 0 1 0
111007003002210 0 0 1 0
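I'm working on a Hadoop cluster, and I don't know whether this is caused by MapReduce.
Thanks
[Edit]
As I replied to @pashaz's comment, the first issue comes from the ELSE 0 in the two otherwise identical expressions: the 0 is itself a value that gets counted, so a result of 1 appears (with and without DISTINCT) where 0 was expected.
The second issue is the difference between the results of the two DISTINCT counts and the two non-DISTINCT counts. If you check the timestamps, you will see that the first window includes the second: the first two counts cover occurrences between '2017-02-01' and 60 days before it, and the second two cover '2017-02-01' and 15 days before it.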
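To illustrate the ELSE 0 point: COUNT and COUNT(DISTINCT ...) skip NULLs but count any other value, including 0. A minimal sketch, assuming the intent is to count only the m_date values that fall inside each window, drops the ELSE branch so that rows outside the window yield NULL. The table and columns are those of the query above; the column aliases are made up for readability, and this rewrite has not been run against the cluster in question.

```sql
-- Sketch only: same table and columns as the original query.
-- Without ELSE, CASE returns NULL for rows outside the window,
-- and COUNT / COUNT(DISTINCT ...) ignore NULLs.
SELECT id,
       COUNT(DISTINCT CASE WHEN unix_timestamp(m_date)
                 BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 60) AS date))
                     AND unix_timestamp(cast('2017-02-01' AS date))
            THEN m_date END) AS distinct_60d,  -- alias is illustrative
       COUNT(CASE WHEN unix_timestamp(m_date)
                 BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 60) AS date))
                     AND unix_timestamp(cast('2017-02-01' AS date))
            THEN m_date END) AS total_60d,     -- alias is illustrative
       COUNT(DISTINCT CASE WHEN unix_timestamp(m_date)
                 BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 15) AS date))
                     AND unix_timestamp(cast('2017-02-01' AS date))
            THEN m_date END) AS distinct_15d,  -- alias is illustrative
       COUNT(CASE WHEN unix_timestamp(m_date)
                 BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 15) AS date))
                     AND unix_timestamp(cast('2017-02-01' AS date))
            THEN m_date END) AS total_15d      -- alias is illustrative
FROM DB.TABLE2
GROUP BY id
LIMIT 10;
```

With this form, an id that has no m_date inside a window should show 0 for that window instead of 1, and because the 15-day window lies inside the 60-day window, each 15-day count should not exceed its 60-day counterpart.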
[Update]
If I add a WHERE clause, it works:
WHERE id="111007001007633" OR id="271011604404359" OR id="122213250512607" OR id="111007001033217"
111007001033217 0 0 0 0 0 0
122213250512607 1 3 8 14 0 0
271011604404359 12 21 26 42 5 9
111007001007633 14 19 24 34 5 5
The LIMIT clause seems to be the problem.
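One way to test the LIMIT suspicion is to move it out of the aggregating query. Note that the two LIMIT 10 outputs above list different ids, so they cannot be compared row by row: without an ORDER BY, LIMIT returns an arbitrary set of groups. The sketch below keeps the original count expressions unchanged (including ELSE 0), aggregates in a subquery, and applies ORDER BY and LIMIT outside it. The alias t and the column aliases are made up for this example, and whether this changes the result on this cluster is untested.

```sql
-- Diagnostic sketch: only the placement of LIMIT differs from the original query.
SELECT t.*
FROM (
  SELECT id,
         COUNT(DISTINCT CASE WHEN unix_timestamp(m_date)
                   BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 60) AS date))
                       AND unix_timestamp(cast('2017-02-01' AS date))
              THEN m_date ELSE 0 END) AS distinct_60d,
         COUNT(CASE WHEN unix_timestamp(m_date)
                   BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 60) AS date))
                       AND unix_timestamp(cast('2017-02-01' AS date))
              THEN m_date ELSE 0 END) AS total_60d,
         COUNT(DISTINCT CASE WHEN unix_timestamp(m_date)
                   BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 15) AS date))
                       AND unix_timestamp(cast('2017-02-01' AS date))
              THEN m_date ELSE 0 END) AS distinct_15d,
         COUNT(CASE WHEN unix_timestamp(m_date)
                   BETWEEN unix_timestamp(cast(date_sub(cast('2017-02-01' AS date), 15) AS date))
                       AND unix_timestamp(cast('2017-02-01' AS date))
              THEN m_date ELSE 0 END) AS total_15d
  FROM DB.TABLE2
  GROUP BY id
) t
ORDER BY t.id   -- makes the 10 returned rows deterministic
LIMIT 10;
```

If the rows returned here match what the WHERE-filtered query gives for the same ids, that would point at how LIMIT interacts with the aggregation rather than at the count expressions themselves.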
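Does the second query return (0, 0, 1, 0) for every single row? What happens if you run the second query on one of the rows that returned "valid" results for the first query, such as 111007001000141? – Andrew
@Andrew I don't know. I'll check and give you the results ASAP. –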