2016-11-21 94 views
1

Count grouped columns and group them by another column

I have a small problem. The data:

 
2016-11-09 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 
2016-11-09 866D70EC-93FD-4C30-BC54-C7B954F255BE 
2016-11-09 6C090D6B-9842-4CB0-9E10-F9B941C8D3A1 
2016-11-09 FB1DD63E-F098-4191-B8F4-BEA4F9776B54 
2016-11-09 FB1DD63E-F098-4191-B8F4-BEA4F9776B54 
2016-11-10 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 
2016-11-10 NULL 
2016-11-10 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 
2016-11-11 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 
2016-11-11 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 

From this I want to count users and group the counts by date. The result should look like this:

 
Date | Unique | Returning | New 
..09 | 4  | 1   | 3 
..10 | 2  | 1   | 1 
..11 | 1  | 1   | 0 

How can I do that? I have this query:

select 
    cast(EventTime as date) as 'Date', 
    count(distinct UserId) + count(distinct case when UserId is null then 1 end) as 'Unique users', 
    0 as 'Returning users', 
    0 as 'New users' 
from 
    TelemetryData 
where 
    DiscountId = '5F8851DD-DF77-46DC-885E-46ECA93F021C' and EventName = 'DiscountClick' 
group by 
    cast(EventTime as date)

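Unique users = distinct users, with NULL counted as well!

Returning users = user IDs that clicked more than once: isnull(sum(case when UserId(here should be count) > 1 then 1 else 0 end), 1)

New users = users who clicked only once! isnull(sum(case when UserId(count also) = 1 then 1 else 0 end), 1)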

@EDIT: OK, both of your results work perfectly. But now I need to integrate this with another query:

SELECT
    '5F8851DD-DF77-46DC-885E-46ECA93F021C',
    cast([dbo].[TelemetryData].[EventTime] as date) as 'Date',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountLike' then 1 else 0 end) as 'Likes',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountDislike' then 1 else 0 end) as 'Dis likes',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountSharing' then 1 else 0 end) as 'Shares',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountView' then 1 else 0 end) as 'Views',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountClick' then 1 else 0 end) as 'Clicks',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountCode' then 1 else 0 end) as 'Downloaded codes',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountSave' then 1 else 0 end) as 'Saves',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountClickWWW' then 1 else 0 end) as 'Page redirections',
    round(
        cast(sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountClick' then 1 else 0 end) as float)
        / cast(
            case
                when sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountView' then 1 else 0 end) = 0 then 1
                else sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountView' then 1 else 0 end)
            end as float)
        * 100, 2) as 'Average CTR',
    0 as 'Unique users',
    0 as 'New users',
    0 as 'Returning users',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountCommentPositive' then 1 else 0 end) as 'Positive comments',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountCommentNegative' then 1 else 0 end) as 'Negative comments'
from
    [dbo].[TelemetryData]
where
    [dbo].[TelemetryData].[DiscountId] = '5F8851DD-DF77-46DC-885E-46ECA93F021C'
    and ([dbo].[TelemetryData].[EventName] = 'DiscountView'
        or [dbo].[TelemetryData].[EventName] = 'DiscountClick'
        or [dbo].[TelemetryData].[EventName] = 'DiscountDislike'
        or [dbo].[TelemetryData].[EventName] = 'DiscountCode'
        or [dbo].[TelemetryData].[EventName] = 'DiscountLike'
        or [dbo].[TelemetryData].[EventName] = 'DiscountSharing'
        or [dbo].[TelemetryData].[EventName] = 'DiscountClickWWW'
        or [dbo].[TelemetryData].[EventName] = 'DiscountSave'
        or [dbo].[TelemetryData].[EventName] = 'DiscountCommentPositive'
        or [dbo].[TelemetryData].[EventName] = 'DiscountCommentNegative')
group by
    cast([dbo].[TelemetryData].[EventTime] as date)
order by
    cast([dbo].[TelemetryData].[EventTime] as date) asc

Now this is going to be hard...

+0

Your sample data has two columns (which ones?), but your query references at least 3 columns. – jarlh

+0

Which DBMS are you using? –

+0

The data has more columns, but I only need to work with these two: the date and the user ID. – Nerf

Answers

1

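You want aggregated user information in your results. An obvious, simple solution is to group by date and user first, so that you get the click count per user and date, and only afterwards group by date.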

select 
    eventdate, 
    count(*) as unique_users, 
    count(case when cnt > 1 then 1 end) as returning_users, 
    count(case when cnt = 1 then 1 end) as new_users 
from 
(
    select cast(eventtime as date) as eventdate, userid, count(*) as cnt 
    from telemetrydata 
    where ... 
    group by cast(eventtime as date), userid 
) date_user 
group by eventdate; 
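
To fold these counters into a wider per-day report such as the one in the question's edit, one possible approach is to compute the event counters and the user counters in two derived tables and join them on the date. The following is only a sketch under assumptions: the aliases ev and usr are made up for the illustration, only two of the event counters are shown, and the user counters are assumed to be based on 'DiscountClick' events only, as in the question's first query.

```sql
-- Sketch only. Table, column and DiscountId values follow the question;
-- the aliases ev, usr and date_user are illustrative.
select
    ev.EventDate                   as [Date],
    ev.Views,
    ev.Clicks,
    isnull(usr.unique_users, 0)    as [Unique users],
    isnull(usr.new_users, 0)       as [New users],
    isnull(usr.returning_users, 0) as [Returning users]
from
(
    -- one row per day with the plain event counters
    select
        cast(EventTime as date) as EventDate,
        sum(case when EventName = 'DiscountView'  then 1 else 0 end) as Views,
        sum(case when EventName = 'DiscountClick' then 1 else 0 end) as Clicks
    from dbo.TelemetryData
    where DiscountId = '5F8851DD-DF77-46DC-885E-46ECA93F021C'
      and EventName in ('DiscountView', 'DiscountClick')
    group by cast(EventTime as date)
) ev
left join
(
    -- one row per day with the user counters,
    -- built from one row per day and user (NULL users form one group)
    select
        EventDate,
        count(*)                            as unique_users,
        count(case when cnt > 1 then 1 end) as returning_users,
        count(case when cnt = 1 then 1 end) as new_users
    from
    (
        select cast(EventTime as date) as EventDate, UserId, count(*) as cnt
        from dbo.TelemetryData
        where DiscountId = '5F8851DD-DF77-46DC-885E-46ECA93F021C'
          and EventName = 'DiscountClick'
        group by cast(EventTime as date), UserId
    ) date_user
    group by EventDate
) usr on usr.EventDate = ev.EventDate
order by ev.EventDate;
```

The left join keeps days that have events but no clicks, and isnull turns the missing user counters for those days into 0. The remaining counters from the larger query could be added to the ev derived table in the same way.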
+0

Doesn't work. Msg 156, Level 15, State 1, Line 44 Incorrect syntax near the keyword 'unique'. Msg 102, Level 15, State 1, Line 53 Incorrect syntax near 'date_user'. – Nerf

+0

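OK. This teaches us: don't use reserved words for columns, tables and aliases :-) As this is SQL Server, use '[unique]'. (In a more standard-compliant DBMS you would use '"unique"' instead.) Or use another name, such as 'unique_users'. –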

+0

Right. Works great. Thanks. – Nerf

0

Maybe I don't understand your question, but looking at your data it seems you need

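select
    date
    , count(*) as [unique]                                 -- all rows for the date
    , (count(*) - count(distinct user_id)) as returning    -- rows beyond each user's first one
    , count(distinct user_id) as new                       -- distinct users for the date
from TelemetryData   -- table name taken from the question; date and user_id stand for its date and user columns
where user_id is not null
group by date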
0

Try the query below

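select Date, uniques, returning, uniques - returning as new
from (
    select Date,
           sum(case when row_num = 1 then 1 else 0 end) uniques,   -- first click of each user on the day
           sum(case when row_num = 2 then 1 else 0 end) returning  -- users with a second click on the day
    from (
        select cast(EventTime as date) as Date,
               -- number each user's clicks within a calendar day (partition by the date, not the full timestamp)
               ROW_NUMBER() over (partition by cast(EventTime as date), userid order by EventTime) row_num
        from TelemetryData) cte1
    group by Date) cte2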

Hope this helps

+0

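Error: Msg 164, Level 15, State 1, Line 51 Each GROUP BY expression must contain at least one column that is not an outer reference. Msg 8155, Level 16, State 2, Line 51 No column name was specified for column 1 of 'cte2'. – Nerf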

+0

Updated the query to fix your errors. – Viki888

+0

Nah, the results are wrong. – Nerf

0

Try this, using a common table expression:

Setup

CREATE TABLE #TelemetryData 
(
    EventTime Date, 
    UserId UNIQUEIDENTIFIER NULL 
    ) 


INSERT INTO #TelemetryData 
VALUES 
('2016-11-09', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'), 
('2016-11-09', '866D70EC-93FD-4C30-BC54-C7B954F255BE'), 
('2016-11-09', '6C090D6B-9842-4CB0-9E10-F9B941C8D3A1'), 
('2016-11-09', 'FB1DD63E-F098-4191-B8F4-BEA4F9776B54'), 
('2016-11-09', 'FB1DD63E-F098-4191-B8F4-BEA4F9776B54'), 
('2016-11-10', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'), 
('2016-11-10', NULL), 
('2016-11-10', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'), 
('2016-11-11', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'), 
('2016-11-11', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E') 

Query

;WITH CTE
AS
(
    -- one row per date and user, with that user's click count for the date
    SELECT EventTime,
           UserId,
           COUNT(*) cnt,
           ROW_NUMBER() OVER (PARTITION BY EventTime ORDER BY EventTime) RN
    FROM #TelemetryData
    GROUP BY EventTime, UserId
)

SELECT EventTime,
       MAX(RN) AS [Unique],                                      -- number of (date, user) groups
       SUM(CASE WHEN cnt > 1 THEN 1 ELSE 0 END) AS [Returning],  -- clicked more than once on the date
       SUM(CASE WHEN cnt = 1 THEN 1 ELSE 0 END) AS [New]         -- clicked exactly once on the date
FROM CTE
GROUP BY EventTime

Results

EventTime  Unique Returning New
2016-11-09 4      1         3
2016-11-10 2      1         1
2016-11-11 1      1         0
0

The following query should work:

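select EventTime,
    max(DistinctRank) [Unique],
    -- users with more than one row on the date
    sum(case when CountOfDistinct > 1 then 1 else 0 end) Returning,
    max(DistinctRank) - sum(case when CountOfDistinct > 1 then 1 else 0 end) New
from
    (select distinct EventTime,
     UserId,
     -- dense_rank leaves no gaps after tied UserIds, so its maximum is the
     -- number of distinct UserIds for the date (NULL counts as one)
     dense_rank() over (partition by EventTime order by UserId) DistinctRank,
     count(1) over (partition by EventTime, UserId) CountOfDistinct
    from TelemetryData) sub
group by EventTime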

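The subquery (run it on its own and see for yourself) returns the distinct combinations of EventTime and UserId, together with the rank of each distinct UserId within the given date and the number of rows for each combination of EventTime and UserId: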

EventDate    UserId        DistinctRank CountOfDistinct 
2016-11-09 00:00:00.000 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 1    1 
2016-11-09 00:00:00.000 6C090D6B-9842-4CB0-9E10-F9B941C8D3A1 2    1 
2016-11-09 00:00:00.000 866D70EC-93FD-4C30-BC54-C7B954F255BE 3    1 
2016-11-09 00:00:00.000 FB1DD63E-F098-4191-B8F4-BEA4F9776B54 4    2 
2016-11-10 00:00:00.000 NULL         1    1 
2016-11-10 00:00:00.000 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 2    2 
2016-11-11 00:00:00.000 0536B088-D3DE-4C0E-903F-C2463D0AAB7E 1    2 
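
To look at these intermediate rows for a single date in isolation, a sketch along the following lines could be used. The temp table #TelemetrySample and the varchar(36) type for UserId are assumptions made for the illustration, not the real schema, and dense_rank() is used so that tied UserIds are numbered without gaps.

```sql
-- Hypothetical scratch table holding only the 2016-11-10 rows from the question;
-- UserId is kept as varchar here (an assumption, the real column may be uniqueidentifier)
create table #TelemetrySample
(
    EventTime datetime    not null,
    UserId    varchar(36) null
);

insert into #TelemetrySample (EventTime, UserId)
values
    ('20161110', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'),
    ('20161110', null),
    ('20161110', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E');

-- The inner query on its own: the window functions are evaluated per input row,
-- then distinct collapses the duplicate rows of the same user
select distinct
    EventTime,
    UserId,
    dense_rank() over (partition by EventTime order by UserId) as DistinctRank,
    count(1)     over (partition by EventTime, UserId)         as CountOfDistinct
from #TelemetrySample;

drop table #TelemetrySample;
```

This should correspond to the two 2016-11-10 rows listed above: one row for the NULL user and one row for the user who clicked twice.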

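The outer query then takes the maximum DistinctRank for each EventDate, which is the number of unique UserIds on that date, and derives Returning from CountOfDistinct, which identifies the UserIds that repeat on the given EventDate. The New column is simply the difference between Unique and Returning. The result is: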

Event Date    Unique Returning New 
2016-11-09 00:00:00.000 4  1   3 
2016-11-10 00:00:00.000 2  1   1 
2016-11-11 00:00:00.000 1  1   0