I couldn't fully test this one, but maybe it will work for you:
WITH data AS(
SELECT STRUCT('1' as user_id) user_dim, ARRAY< STRUCT<date string, name string, timestamp_micros INT64> > [('20170610', 'EVENT1', 1497088800000000), ('20170610', 'LOGIN_CALL', 1498088800000000), ('20170610', 'LOGIN_CALL_OK', 1498888800000000), ('20170610', 'EVENT2', 159788800000000), ('20170610', 'LOGIN_CALL', 1599088800000000), ('20170610', 'LOGIN_CALL_OK', 1608888800000000)] event_dim union all
SELECT STRUCT('2' as user_id) user_dim, ARRAY< STRUCT<date string, name string, timestamp_micros INT64> > [('20170610', 'EVENT1', 1497688500400000), ('20170610', 'LOGIN_CALL', 1497788800000000)] event_dim UNION ALL
SELECT STRUCT('3' as user_id) user_dim, ARRAY< STRUCT<date string, name string, timestamp_micros INT64> > [('20170610', 'EVENT1', 1487688500400000), ('20170610', 'LOGIN_CALL', 1487788845000000), ('20170610', 'LOGIN_CALL_OK', 1498888807700000)] event_dim
)
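SELECT
  AVG(time_diff) avg_time_diff
FROM (
  SELECT
    -- A difference is produced only when a LOGIN_CALL is immediately followed
    -- (per user, ordered by time) by a LOGIN_CALL_OK; every other row yields NULL,
    -- which AVG ignores.
    CASE
      WHEN e.name = 'LOGIN_CALL'
        AND LEAD(e.name, 1) OVER (PARTITION BY user_dim.user_id ORDER BY e.timestamp_micros ASC) = 'LOGIN_CALL_OK'
      THEN TIMESTAMP_DIFF(
        TIMESTAMP_MICROS(LEAD(e.timestamp_micros, 1) OVER (PARTITION BY user_dim.user_id ORDER BY e.timestamp_micros ASC)),
        TIMESTAMP_MICROS(e.timestamp_micros),
        DAY)
    END time_diff
  FROM data,
    UNNEST(event_dim) e
  -- Keep only the two events of interest before the window functions run
  WHERE e.name IN ('LOGIN_CALL', 'LOGIN_CALL_OK')
)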
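I simulated 3 users with the same schema that you have in the Firebase schema.
Basically, I first applied the UNNEST operation so that there is one row for each value of event_dim.name. Then I applied a filter to keep only the events you are interested in, that is, 'LOGIN_CALL' and 'LOGIN_CALL_OK'.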
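To make the UNNEST-and-filter step easier to inspect on its own, here is a minimal sketch that only flattens and filters, with no window functions yet. The one-user `data` CTE is cut down from the sample above purely for illustration; the output column list is my own choice and not part of the original query.

```sql
-- Sketch only: a one-user sample reusing values from the answer's test data.
WITH data AS (
  SELECT
    STRUCT('2' AS user_id) AS user_dim,
    ARRAY<STRUCT<date STRING, name STRING, timestamp_micros INT64>>[
      ('20170610', 'EVENT1', 1497688500400000),
      ('20170610', 'LOGIN_CALL', 1497788800000000)
    ] AS event_dim
)
SELECT
  user_dim.user_id,     -- repeated on every flattened row
  e.name,
  e.timestamp_micros
FROM data,
  UNNEST(event_dim) AS e  -- one output row per element of event_dim
WHERE e.name IN ('LOGIN_CALL', 'LOGIN_CALL_OK')  -- EVENT1 is dropped here
ORDER BY user_dim.user_id, e.timestamp_micros
```

Running something like this against the real table first is a cheap way to confirm the event names match exactly before adding the LEAD logic.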
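As noted in the comments, you need some way to identify these rows, as otherwise you won't know which event succeeded which. That is why the partitioning of the analytic functions takes user_dim.user_id as input as well.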
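To see what the partitioned LEAD actually returns for each row, a sketch along these lines exposes the next event's name and timestamp as plain columns. The named window `w` and the `next_name` / `next_timestamp_micros` aliases are introduced here for illustration only; the sample row set is user 3 from the data above.

```sql
-- Sketch only: shows the LEAD values side by side with the current row.
WITH data AS (
  SELECT
    STRUCT('3' AS user_id) AS user_dim,
    ARRAY<STRUCT<date STRING, name STRING, timestamp_micros INT64>>[
      ('20170610', 'EVENT1', 1487688500400000),
      ('20170610', 'LOGIN_CALL', 1487788845000000),
      ('20170610', 'LOGIN_CALL_OK', 1498888807700000)
    ] AS event_dim
)
SELECT
  user_dim.user_id,
  e.name,
  e.timestamp_micros,
  LEAD(e.name) OVER w AS next_name,                          -- NULL on a user's last row
  LEAD(e.timestamp_micros) OVER w AS next_timestamp_micros
FROM data,
  UNNEST(event_dim) AS e
WHERE e.name IN ('LOGIN_CALL', 'LOGIN_CALL_OK')
-- Partitioning by user keeps one user's LOGIN_CALL from pairing with another user's LOGIN_CALL_OK
WINDOW w AS (PARTITION BY user_dim.user_id ORDER BY e.timestamp_micros ASC)
```

Because the WHERE clause is applied before the window functions, LEAD only ever looks at the filtered login events, not at unrelated events in between.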
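After that, it's just TIMESTAMP operations to get the difference when appropriate: when the leading event is 'LOGIN_CALL_OK' and the current one is 'LOGIN_CALL', take the difference, which is what the CASE expression expresses.
In the TIMESTAMP_DIFF function you can choose which date part to measure in, such as SECOND, MINUTE, DAY and so on.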
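As a small illustration of the date part choice, the sketch below computes the same difference in two units. The two literal timestamps are user 3's LOGIN_CALL and LOGIN_CALL_OK values from the sample data; the column aliases are my own. For login latency, SECOND (or MILLISECOND) is probably more useful than DAY, though that depends on what you want to report.

```sql
-- Sketch only: same pair of timestamps, two different granularities.
SELECT
  TIMESTAMP_DIFF(
    TIMESTAMP_MICROS(1498888807700000),  -- LOGIN_CALL_OK
    TIMESTAMP_MICROS(1487788845000000),  -- LOGIN_CALL
    DAY) AS diff_days,
  TIMESTAMP_DIFF(
    TIMESTAMP_MICROS(1498888807700000),
    TIMESTAMP_MICROS(1487788845000000),
    SECOND) AS diff_seconds
```

TIMESTAMP_DIFF returns a whole number of the chosen units, so a coarse part like DAY will report 0 for any pair of events less than a day apart.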
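You need to have some kind of ID to connect LOGIN_CALL and LOGIN_CALL_OK in order to take a meaningful difference between them.
Not sure I follow? In the BigQuery DB these are all packed into a single row... the event name would be the ID, I believe?
Sorry, I think I've got a bit further now - there is an ID in the table, I just didn't include it above.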