2016-11-21 40 views

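Joining to a landing page query doubles the session count per source

I'm trying to query the total number of visits per source from a BigQuery table of Google Analytics data, but I need to filter out some sessions at the landing page level. So I pre-query the visitIDs together with their landing pages and join them back to the session data, like this: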

#StandardSQL 
WITH landingpages AS (
    SELECT 
    visitID, 
    h.page.pagePath AS LandingPage 
    FROM 
    `project.dataset.ga_sessions_*`, UNNEST(hits) AS h 
    WHERE 
    hitNumber = 1 
    AND 
    _TABLE_SUFFIX BETWEEN '20150926' AND '20150926' 
    # filters to be added here 
) 

SELECT 
    sessions.trafficSource.source, 
    SUM(sessions.totals.visits) AS visits 
FROM `project.dataset.ga_sessions_*` AS sessions 

JOIN 
    landingpages 
ON 
    landingpages.visitID = sessions.visitID 
WHERE 
    _TABLE_SUFFIX BETWEEN '20150926' AND '20150926' 
GROUP BY 
    trafficSource.source 
ORDER BY 
    visits DESC 

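This roughly doubles the number of sessions per source compared with what GA reports.

Can anyone point out what I'm doing wrong? (I suspect it's something very obvious.)

I've inspected the output of the first query and couldn't find anything wrong, apart from a small number of duplicate visitIDs. I've also tried various JOIN types, to no avail.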


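I can see where I went wrong: I forgot to follow advice I had already been given at http://stackoverflow.com/questions/39894328. It hadn't fully sunk in that a unique visit is identified by fullVisitorId and visitID together, and that the join needs both keys. Working on verifying this. – goose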

Answer


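When querying GA data in BigQuery, you have to keep in mind that a unique visit is identified by the combination of fullVisitorId and visitID. visitID is only unique per visitor, so two different visitors can share the same value, and a join on visitID alone matches sessions that belong to other visitors. Only a join on both keys returns a meaningful dataset.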
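To make the fan-out concrete, here is a minimal sketch on inline toy data. The two visitors, the shared visitId of 1001, the source and the landing pages are all made up for illustration; nothing here comes from a real export.

```sql
#standardSQL
-- Hypothetical toy data: two different visitors whose sessions share visitId 1001.
WITH sessions AS (
  SELECT 'A' AS fullVisitorId, 1001 AS visitId, 'google' AS source, 1 AS visits
  UNION ALL
  SELECT 'B', 1001, 'google', 1
),
landingpages AS (
  SELECT 'A' AS fullVisitorId, 1001 AS visitId, '/home' AS LandingPage
  UNION ALL
  SELECT 'B', 1001, '/pricing'
)
-- Joining on visitId alone pairs every session with every landing page row
-- that shares the visitId: 2 x 2 = 4 rows, so SUM(visits) is 4.
SELECT
  'visitId only' AS join_key,
  SUM(s.visits) AS visits
FROM sessions AS s
JOIN landingpages AS l
  ON l.visitId = s.visitId
UNION ALL
-- Joining on both keys keeps one row per session, so SUM(visits) is 2.
SELECT
  'fullVisitorId + visitId',
  SUM(s.visits)
FROM sessions AS s
JOIN landingpages AS l
  ON l.visitId = s.visitId
  AND l.fullVisitorId = s.fullVisitorId
```

With only two real sessions, the single-key join reports 4 visits and the two-key join reports 2, which is the same kind of inflation seen in the question.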

This is what I should have written:

#standardSQL
WITH landingpages AS (
  -- Page path of each session's first hit, used as the landing page
  SELECT
    fullVisitorId,
    visitID,
    h.page.pagePath AS LandingPage
  FROM
    `project.dataset.ga_sessions_*`, UNNEST(hits) AS h
  WHERE
    h.hitNumber = 1
    AND _TABLE_SUFFIX BETWEEN '20150926' AND '20150926'
),
session_data AS (
  -- Visits per session, keyed by fullVisitorId + visitID
  SELECT
    date AS ga_date,
    trafficSource.source AS source,
    fullVisitorId,
    visitID,
    SUM(totals.visits) AS visits
  FROM
    `project.dataset.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20150926' AND '20150926'
    AND totals.visits > 0
  GROUP BY ga_date, source, fullVisitorId, visitID
)

SELECT
  ga_date,
  source,
  SUM(visits) AS Sessions
FROM
  landingpages
JOIN
  session_data
ON
  -- Join on both keys: visitID alone is not unique across visitors
  landingpages.visitID = session_data.visitID
  AND landingpages.fullVisitorId = session_data.fullVisitorId
GROUP BY
  ga_date, source
ORDER BY
  Sessions DESC
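As a quick sanity check, a query along these lines could list the visitID values that are shared by more than one visitor in the same date range. It is only a sketch: it reuses the placeholder table name and date suffix from the queries above, and the `visitors` alias is made up for illustration.

```sql
#standardSQL
-- List visitID values that occur for more than one fullVisitorId.
-- Any rows returned here are sessions a visitID-only join would multiply.
SELECT
  visitID,
  COUNT(DISTINCT fullVisitorId) AS visitors
FROM
  `project.dataset.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20150926' AND '20150926'
GROUP BY
  visitID
HAVING
  COUNT(DISTINCT fullVisitorId) > 1
ORDER BY
  visitors DESC
```

If this returns nothing, the inflated totals probably have a different cause and the join keys are not the place to look.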