2017-08-01 31 views
2

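I need help formulating a cohort/retention query over exported data

I want to build a query that shows which visitors performed ActionX on their first visit (within a time frame), and then how many days later they came back and performed ActionX again.

I (eventually) need output that looks like this...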

[image: screen]

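The table I'm working with is a Google Analytics export to BigQuery.

Could someone help me with this, or has anyone written a similar query that I could adapt?

Thanks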

+0

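Hello and welcome to Stack Overflow. Please take a moment to go through the [Welcome tour](https://stackoverflow.com/tour) to learn your way around here (and earn your first badge), read how to create a [mcve] example, and check [ask], so that you increase your chances of getting feedback and useful answers. – garfbradaz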

Answers

2

Just to give you a simple idea / direction

Below is for BigQuery Standard SQL

#standardSQL 
SELECT 
    Date_of_action_first_taken, 
    ROUND(100 * later_1_day/Visits) AS later_1_day, 
    ROUND(100 * later_2_days/Visits) AS later_2_days, 
    ROUND(100 * later_3_days/Visits) AS later_3_days 
FROM `OutputFromQuery` 

You can test it with the dummy data below, taken from your question:

#standardSQL 
WITH `OutputFromQuery` AS (
    SELECT '01.07.17' AS Date_of_action_first_taken, 1000 AS Visits, 800 AS later_1_day, 400 AS later_2_days, 300 AS later_3_days UNION ALL 
    SELECT '02.07.17', 1000, 860, 780, 860 UNION ALL 
    SELECT '29.07.17', 1000, 780, 120, 0 UNION ALL 
    SELECT '30.07.17', 1000, 710, 0, 0 
) 
SELECT 
    Date_of_action_first_taken, 
    ROUND(100 * later_1_day/Visits) AS later_1_day, 
    ROUND(100 * later_2_days/Visits) AS later_2_days, 
    ROUND(100 * later_3_days/Visits) AS later_3_days 
FROM `OutputFromQuery` 

The `OutputFromQuery` data is as follows:

Date_of_action_first_taken  Visits  later_1_day  later_2_days  later_3_days
01.07.17                    1000    800          400           300
02.07.17                    1000    860          780           860
29.07.17                    1000    780          120           0
30.07.17                    1000    710          0             0

and the final output is:

Date_of_action_first_taken  later_1_day  later_2_days  later_3_days
01.07.17                    80.0         40.0          30.0
02.07.17                    86.0         78.0          86.0
29.07.17                    78.0         12.0          0.0
30.07.17                    71.0         0.0           0.0
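
The query above starts from counts that are already aggregated per first-action date. As a minimal sketch of how such counts could be produced, assume a hypothetical table `action_days` with one row per visitor per day on which the action happened. The names `action_days`, `visitor_id`, `action_date` and `with_first` are made up for illustration, and `Visits` is taken here to mean the number of distinct visitors in the cohort, which may not match the definition in the question.

```sql
#standardSQL
-- Hypothetical input: one row per visitor per day on which the action happened.
WITH action_days AS (
    SELECT 'v1' AS visitor_id, DATE '2017-07-01' AS action_date UNION ALL
    SELECT 'v1', DATE '2017-07-02' UNION ALL
    SELECT 'v1', DATE '2017-07-04' UNION ALL
    SELECT 'v2', DATE '2017-07-01' UNION ALL
    SELECT 'v2', DATE '2017-07-03' UNION ALL
    SELECT 'v3', DATE '2017-07-02'
),
with_first AS (
    -- Attach each visitor's first action date (the cohort date) to every row.
    SELECT
        visitor_id,
        action_date,
        MIN(action_date) OVER (PARTITION BY visitor_id) AS first_date
    FROM action_days
)
SELECT
    first_date AS Date_of_action_first_taken,
    -- Assumption: "Visits" = distinct visitors whose first action was on this date.
    COUNT(DISTINCT visitor_id) AS Visits,
    COUNT(DISTINCT IF(DATE_DIFF(action_date, first_date, DAY) = 1, visitor_id, NULL)) AS later_1_day,
    COUNT(DISTINCT IF(DATE_DIFF(action_date, first_date, DAY) = 2, visitor_id, NULL)) AS later_2_days,
    COUNT(DISTINCT IF(DATE_DIFF(action_date, first_date, DAY) = 3, visitor_id, NULL)) AS later_3_days
FROM with_first
GROUP BY first_date
ORDER BY first_date
```

With this dummy data the 2017-07-01 cohort contains two visitors, one of whom repeats the action one day later. The result has the same column layout as `OutputFromQuery`, except that the date is a DATE rather than a formatted string.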
+0

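Thanks Mikhail! This helped give me a flavour. I've posted my query (or what I have so far) above, if you could check it out and let me know what you think. Thank you for your response! – Shaz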

+0

Sure. Consider voting up the answer if it helped you. I'll check your version as soon as I'm at a PC –

1

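So I think I may have cracked it... From this output, I then need to manipulate it (pivot table) so that it looks like the desired output.

Can anyone check this for me and let me know what you think?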

WITH
cohort_items AS (
    -- Cohort day = first day on which each visitor took the action
    SELECT
        MIN(TIMESTAMP_TRUNC(TIMESTAMP_MICROS(visitStartTime * 1000000 + h.time * 1000), DAY)) AS cohort_day,
        fullVisitorID
    FROM
        `Table123` AS U,
        UNNEST(hits) AS h
    WHERE _TABLE_SUFFIX BETWEEN "20170701" AND "20170731"
        AND 'ACTION TAKEN' -- placeholder for the actual action filter
    GROUP BY 2
),

user_activites AS (
    -- Days elapsed between each action and the visitor's cohort day
    SELECT
        A.fullVisitorID,
        DATE_DIFF(DATE(TIMESTAMP_TRUNC(TIMESTAMP_MICROS(visitStartTime * 1000000 + h.time * 1000), DAY)), DATE(C.cohort_day), DAY) AS day_number
    FROM `Table123` A
    LEFT JOIN cohort_items C ON A.fullVisitorID = C.fullVisitorID,
    UNNEST(hits) AS h
    WHERE
        A._TABLE_SUFFIX BETWEEN "20170701" AND "20170731"
        AND 'ACTION TAKEN' -- placeholder for the actual action filter
    GROUP BY 1, 2
),

cohort_size AS (
    -- Number of visitors in each cohort
    SELECT
        cohort_day,
        COUNT(1) AS number_of_users
    FROM
        cohort_items
    GROUP BY 1
    ORDER BY 1
),

retention_table AS (
    -- Number of visitors active on each day offset, per cohort
    SELECT
        C.cohort_day,
        A.day_number,
        COUNT(1) AS number_of_users
    FROM
        user_activites A
    LEFT JOIN cohort_items C ON A.fullVisitorID = C.fullVisitorID
    GROUP BY 1, 2
)

SELECT
    B.cohort_day,
    S.number_of_users AS total_users,
    B.day_number,
    B.number_of_users / S.number_of_users AS percentage
FROM retention_table B
LEFT JOIN cohort_size S ON B.cohort_day = S.cohort_day
WHERE B.cohort_day IS NOT NULL
ORDER BY 1, 3

Thanks in advance!
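
The pivot step mentioned above can be done in SQL with conditional aggregation. The following is only a sketch on dummy rows shaped like the final SELECT of the query (`cohort_day`, `total_users`, `day_number`, `percentage`). The CTE name `retention_long` is hypothetical, `cohort_day` is written as a DATE here for brevity, and the figures are borrowed from the dummy data in the other answer.

```sql
#standardSQL
-- Hypothetical stand-in for the long-format output of the retention query.
WITH retention_long AS (
    SELECT DATE '2017-07-01' AS cohort_day, 1000 AS total_users, 0 AS day_number, 1.0 AS percentage UNION ALL
    SELECT DATE '2017-07-01', 1000, 1, 0.8 UNION ALL
    SELECT DATE '2017-07-01', 1000, 2, 0.4 UNION ALL
    SELECT DATE '2017-07-01', 1000, 3, 0.3 UNION ALL
    SELECT DATE '2017-07-02', 1000, 0, 1.0 UNION ALL
    SELECT DATE '2017-07-02', 1000, 1, 0.86
)
SELECT
    cohort_day,
    ANY_VALUE(total_users) AS total_users,
    -- One column per day offset; offsets with no row for a cohort come out as NULL.
    ROUND(100 * MAX(IF(day_number = 1, percentage, NULL))) AS later_1_day,
    ROUND(100 * MAX(IF(day_number = 2, percentage, NULL))) AS later_2_days,
    ROUND(100 * MAX(IF(day_number = 3, percentage, NULL))) AS later_3_days
FROM retention_long
GROUP BY cohort_day
ORDER BY cohort_day
```

Each day offset needs its own output column, so the list of offsets has to be known when the query is written. Wrapping each expression in `IFNULL(..., 0)` would show 0 instead of NULL for offsets with no returning visitors.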

1

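If you use some of the techniques available in BigQuery, you can solve this kind of problem in a way that is effective in both cost and performance. For example: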

SELECT
    init_date,
    ARRAY((SELECT AS STRUCT days, freq, ROUND(freq * 100/MAX(freq) OVER(), 2) FROM UNNEST(data) ORDER BY days)) data
FROM (
    SELECT
        init_date,
        ARRAY_AGG(STRUCT(days, freq)) data
    FROM (
        SELECT
            init_date,
            data AS days,
            COUNT(data) freq
        FROM (
            SELECT
                init_date,
                ARRAY(SELECT DATE_DIFF(PARSE_DATE("%Y%m%d", dts), PARSE_DATE("%Y%m%d", init_date), DAY) AS dt FROM UNNEST(dts) dts) data
            FROM (
                SELECT
                    MIN(date) init_date,
                    ARRAY_AGG(DISTINCT date) dts
                FROM `Table123`
                WHERE TRUE
                    AND EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventinfo.eventCategory = 'recommendation') -- This is your 'ACTION TAKEN' filter
                    AND _TABLE_SUFFIX BETWEEN "20170724" AND "20170731"
                GROUP BY fullvisitorid
            )
        ),
        UNNEST(data) data
        GROUP BY init_date, days
    )
    GROUP BY init_date
)

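I tested this query against our G.A. data and selected the customers who interacted with our recommendation system (as you can see in the WHERE EXISTS... filter). Example of the results (absolute frequency values omitted for privacy reasons):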

[image: example of the results]

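As you can see, on the 28th, for example, 8% of customers came back one day later and interacted with the system again.

I recommend that you play around with this query and see if it works for you. It's simpler, cheaper, faster, and hopefully easier to maintain.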
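
To make the innermost step of that query easier to follow, here is a small sketch of the day-offset calculation on its own, run over one made-up visitor. The CTE `visitor_dates`, its values, and the alias `day_offset` are hypothetical. The date strings use the `YYYYMMDD` format of the `date` column in the Google Analytics export.

```sql
#standardSQL
-- Hypothetical visitor: first action on 2017-07-28, repeated on the 29th and the 31st.
WITH visitor_dates AS (
    SELECT
        'v1' AS fullvisitorid,
        '20170728' AS init_date,
        ['20170728', '20170729', '20170731'] AS dts
)
SELECT
    fullvisitorid,
    init_date,
    -- Turn every action date into "days since the first action".
    ARRAY(
        SELECT DATE_DIFF(PARSE_DATE('%Y%m%d', d), PARSE_DATE('%Y%m%d', init_date), DAY) AS day_offset
        FROM UNNEST(dts) AS d
        ORDER BY day_offset
    ) AS days
FROM visitor_dates
```

For this visitor `days` is `[0, 1, 3]`. Every visitor contributes an offset of 0 on their own first date, so the frequency at offset 0 equals the cohort size, which is why dividing by `MAX(freq) OVER()` in the outer query yields a percentage of the cohort.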