2013-06-28 43 views
2

How can we make this query run efficiently?

In BigQuery, we are trying to run:

SELECT day, AVG(value)/(1024*1024) FROM ( 
    SELECT value, UTC_USEC_TO_DAY(timestamp) as day, 
     PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [Datastore.PerformanceDatum] 
    WHERE type = "MemoryPerf" 
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc; 

Relatively little data is returned, but we get this message:

Error: Resources exceeded during query execution. The query contained a GROUP BY operator, consider using GROUP EACH BY instead. For more details, please see https://developers.google.com/bigquery/docs/query-reference#groupby 

What makes this query fail? Is it the size of the subquery? Is there an equivalent query that avoids the problem?


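Edit in response to the comments: if I add GROUP EACH BY (and drop the outer ORDER BY), the query fails, claiming that GROUP EACH BY is not parallelizable here.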

+0

Have you tried using "GROUP EACH BY", as the error message suggests? – hexafraction

+0

If I add GROUP EACH BY (and remove the outer ORDER BY), the query fails, claiming that GROUP EACH BY is not parallelizable here. Is there something I'm missing? –

+1

Add that to your post. I'm just trying to help make it answerable and less likely to be put on hold. – hexafraction

Answers

1

I wrote an equivalent query that works for me:

SELECT day, AVG(value)/(1024*1024) FROM (
    SELECT data value, UTC_USEC_TO_DAY(dtimestamp) as day, 
     PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [io_sensor_data.moscone_io13] 
    WHERE sensortype = "humidity" 
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc; 

If I run only the inner query, I get 3,660,624 results. Is your dataset larger than that?

When grouping by day, the outer select returns only 4 results. I'll try a different grouping to see whether I can hit the limit:

SELECT day, AVG(value)/(1024*1024) FROM (
    SELECT data value, dtimestamp/1000 as day, 
     PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank 
    FROM [io_sensor_data.moscone_io13] 
    WHERE sensortype = "humidity" 
) WHERE rank >= 0.9 AND rank <= 0.91 
GROUP BY day 
ORDER BY day desc; 

This also runs, now with 57,862 distinct groups.

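I tried different combinations to reproduce the same error, and I can get it when I double the initial amount of data. A simple "hack" to double the data is to list the table twice in the FROM clause (in BigQuery's legacy SQL, a comma between tables acts as a UNION ALL, not a join), changing: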

FROM [io_sensor_data.moscone_io13] 

To:

FROM [io_sensor_data.moscone_io13], [io_sensor_data.moscone_io13] 

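Then I get the same error. How much data do you have? Can you apply an additional filter? Since you are already partitioning the percentile_rank by day, can you restrict the query to analyze only a fraction of the days (for example, only the last month)?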
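
As a rough illustration of that last suggestion, the question's query could be limited to recent days along the following lines. This is only a sketch: it assumes the `timestamp` column holds microseconds since the Unix epoch (which the use of UTC_USEC_TO_DAY suggests), and the cutoff date is an arbitrary placeholder. PARSE_UTC_USEC is the legacy SQL function that converts a date string to microseconds.

```sql
-- Sketch only: the question's query, restricted to recent days.
-- Assumes the timestamp column is in microseconds since the Unix epoch.
-- The cutoff date is a placeholder; adjust it to the window you need.
SELECT day, AVG(value)/(1024*1024) FROM (
    SELECT value, UTC_USEC_TO_DAY(timestamp) as day,
     PERCENTILE_RANK() OVER (PARTITION BY day ORDER BY value ASC) as rank
    FROM [Datastore.PerformanceDatum]
    WHERE type = "MemoryPerf"
      AND timestamp >= PARSE_UTC_USEC("2013-06-01 00:00:00")
) WHERE rank >= 0.9 AND rank <= 0.91
GROUP BY day
ORDER BY day desc;
```

The filter sits inside the subquery so that fewer rows reach the window function. The doubling experiment above suggests that the input volume at that stage, not the number of output groups, is what triggers the error.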

+0

Analyzing only a fraction of the days is the hack I'm using right now, but since so little data is actually returned, it makes me a little uneasy. –