2015-04-24 57 views
1

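Count rows for each unique combination of columns in SQL

I would like to return a set of unique records from a table, based on two columns, along with the most recent posting time and the total number of times that combination of the two columns appeared before (in time) the record in the output.

So what I am trying to get is something along these lines: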

select col1, col2, max_posted, count from T 
join (
select col1, col2, max(posted) as posted from T where groupid = 'XXX' 
group by col1, col2) h 
on (T.col1 = h.col1 and 
    T.col2 = h.col2 and 
    T.max_posted = h.tposted) 
where T.groupid = 'XXX' 

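The count needs to be the number of times each combination of col1 and col2 occurred BEFORE the max_posted of each record in the output. (I hope I explained that correctly :)

Edit: After trying the suggestion below as: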

select dx.*, 
    count(*) over (partition by dx.cicd9, dx.cdesc order by dx.tposted) as cnt 
from dx 
join (
select cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX' 
group by cicd9, cdesc) h 
on (dx.cicd9 = h.cicd9 and 
    dx.cdesc = h.cdesc and 
    dx.tposted = h.tposted) 
where groupid = 'XXX'; 

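The count always returns 1. Also, how would you count only the records that occur BEFORE tposted?

This also failed, but I hope you can see where I am heading: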

WITH H AS (
    SELECT cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX' 
    group by cicd9, cdesc), 
    J AS (
    SELECT count(*) as cnt 
    FROM dx, h 
    WHERE dx.cicd9 = h.cicd9 
     and dx.cdesc = h.cdesc 
     and dx.tposted <= h.tposted 
     and dx.groupid = 'XXX' 
) 
SELECT H.*,J.cnt 
FROM H,J 

Can anyone help?

+2

Sample data and the desired result would help to clarify the question. –

Answers

1

How about:

SELECT DISTINCT ON (cicd9, cdesc) cicd9, cdesc, 
    max(posted) OVER w AS last_post, 
    count(*) OVER w AS num_posts 
FROM dx 
WHERE groupid = 'XXX' 
WINDOW w AS (
    PARTITION BY cicd9, cdesc 
    RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING 
); 

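Lacking the PG version, table definition, data and desired output, this is just shooting from the hip, but the principle should work: partition the rows where groupid = 'XXX' on the two columns, then find the maximum value of the posted column and the total number of rows in the window frame (hence the RANGE... clause in the window definition).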

+0

Lots to learn here. If this can count all the rows, it might work. I need to study it when I am not brain dead. Thanks. (PG version 9.3) –

+0

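How would I change the query to produce only one row per cicd9, cdesc group? (It currently repeats the same output row for every row found in dx.) Thanks (these are all new concepts to me). –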

+0

@AlanWayne: Added a 'DISTINCT ON' clause, see the updated answer. – Patrick

0

Do you just want a cumulative count?

select t.*, 
     count(*) over (partition by col1, col2 order by posted) as cnt 
from table t 
where groupid = 'xxx'; 
+0

Yes. A cumulative count of only the records that occur before posted. Please see my edit above. Thanks. –

+0

I might add that I am looking for the total number of rows (including duplicates) that match the output row. –

+0

Your question is still ambiguous. You should add sample data and the desired result. –

0

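This is the best I could come up with; better suggestions are welcome!

Either of the following produces the result I need, with the understanding that the count will always be at least 1 (because of the join):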

SELECT dx.cicd9, dx.cdesc, max(dx.tposted), count(*) 
from dx 
join (
SELECT cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX' 
    group by cicd9, cdesc) h 
on 
    (dx.cicd9 = h.cicd9 and dx.cdesc = h.cdesc and dx.tposted <= h.tposted 
    and dx.groupid = 'XXX') 
group by dx.cicd9, dx.cdesc 
order by dx.cdesc; 

WITH H AS (
    SELECT cicd9, cdesc, max(tposted) as tposted from dx where groupid = 'XXX' 
    group by cicd9, cdesc) 
SELECT dx.cicd9, dx.cdesc, max(dx.tposted), count(*) 
from dx, H 
where dx.cicd9 = h.cicd9 and dx.cdesc = h.cdesc and dx.tposted <= h.tposted 
    and dx.groupid = 'XXX' 
group by dx.cicd9, dx.cdesc 
order by cdesc; 
+0

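Does it return what you need or not? If not, how does it fail? If you cannot clarify the task in words, you should add meaningful sample values and the desired result to the question. –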

+0

@ErwinBrandstetter Yes (almost). It would be better if the count could be 0, since what I want is the number of rows with earlier dates. Otherwise, it works well. –
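One possible way to get a count that can be 0, as the comment above asks for, is to start from the per-group maximum and LEFT JOIN only the strictly earlier rows. The sketch below is not from the thread: it runs the SQL through Python's sqlite3 module as a stand-in for PostgreSQL, and the sample rows, the TEXT column types and the function name `last_post_and_earlier_count` are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows (groupid, cicd9, cdesc, tposted); the thread shows no real data.
ROWS = [
    ("XXX", "401.9", "Hypertension", "2015-01-05"),
    ("XXX", "401.9", "Hypertension", "2015-02-10"),
    ("XXX", "401.9", "Hypertension", "2015-03-15"),
    ("XXX", "250.00", "Diabetes", "2015-02-01"),
    ("YYY", "250.00", "Diabetes", "2015-04-01"),
]


def last_post_and_earlier_count(rows, groupid):
    """Return (cicd9, cdesc, last tposted, number of strictly earlier rows)."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only the columns used in the thread, all stored as text.
    conn.execute(
        "CREATE TABLE dx (groupid TEXT, cicd9 TEXT, cdesc TEXT, tposted TEXT)"
    )
    conn.executemany("INSERT INTO dx VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH h AS (
            SELECT cicd9, cdesc, max(tposted) AS tposted
            FROM dx
            WHERE groupid = ?
            GROUP BY cicd9, cdesc
        )
        SELECT h.cicd9, h.cdesc, h.tposted,
               count(dx.tposted) AS earlier  -- counts only matched rows, so it can be 0
        FROM h
        LEFT JOIN dx
               ON dx.cicd9 = h.cicd9
              AND dx.cdesc = h.cdesc
              AND dx.groupid = ?
              AND dx.tposted < h.tposted     -- strictly before the latest post
        GROUP BY h.cicd9, h.cdesc, h.tposted
        ORDER BY h.cdesc
        """,
        (groupid, groupid),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_post_and_earlier_count(ROWS, "XXX"):
        print(row)
```

Note that rows sharing the maximum tposted are not counted here; if the timestamps within a combination are all distinct, subtracting 1 from count(*) in the queries above should give the same number.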

0

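This is confusing:

The count needs to be the number of times each combination of col1 and col2 occurred BEFORE the max_posted of each record in the output.

Since, by definition, every record is "before" (or at the same time as) the latest post, this basically means the total count per combination (ignoring the presumed off-by-one error in the sentence).

So this boils down to a simple GROUP BY: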

SELECT cicd9, cdesc 
    , max(posted) AS last_posted 
    , count(*) AS ct 
FROM dx 
WHERE groupid = 'XXX' 
GROUP BY 1, 2 
ORDER BY 1, 2; 

This does exactly the same as the currently accepted answer. It is just faster and simpler.
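To see what the GROUP BY returns, here is a small sketch on made-up data. It uses Python's sqlite3 module as a stand-in for PostgreSQL, and the sample rows, the TEXT column types, the column name tposted (taken from the question rather than this answer) and the function name `last_post_and_total` are assumptions for illustration only.

```python
import sqlite3

# Hypothetical sample rows (groupid, cicd9, cdesc, tposted); the thread shows no real data.
ROWS = [
    ("XXX", "401.9", "Hypertension", "2015-01-05"),
    ("XXX", "401.9", "Hypertension", "2015-02-10"),
    ("XXX", "401.9", "Hypertension", "2015-03-15"),
    ("XXX", "250.00", "Diabetes", "2015-02-01"),
    ("YYY", "250.00", "Diabetes", "2015-04-01"),
]


def last_post_and_total(rows, groupid):
    """Return one row per (cicd9, cdesc): latest tposted and total row count."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only the columns used in the thread, all stored as text.
    conn.execute(
        "CREATE TABLE dx (groupid TEXT, cicd9 TEXT, cdesc TEXT, tposted TEXT)"
    )
    conn.executemany("INSERT INTO dx VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT cicd9, cdesc,
               max(tposted) AS last_posted,
               count(*)     AS ct
        FROM dx
        WHERE groupid = ?
        GROUP BY cicd9, cdesc
        ORDER BY cicd9, cdesc
        """,
        (groupid,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_post_and_total(ROWS, "XXX"):
        print(row)
```

Each combination appears once, and ct includes the latest row itself, which is why the count is always at least 1.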