Hive - Sum of related values

I am using AWS Athena to filter load balancer logs. I created the table below and loaded the logs into it.

CREATE EXTERNAL TABLE IF NOT EXISTS elb_logs (
    request_timestamp string, 
    elb_response_code string,  
    url string
    ) 

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe' 
WITH SERDEPROPERTIES (
     'serialization.format' = '1','input.regex' = '([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:\-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (\"[^\"]*\") ([A-Z0-9-]+) ([A-Za-z0-9.-]*)$') 
LOCATION 's3://athena-examples/elb/raw/';

Now I want to get the counts of 200 OK, 4xx and 5xx responses, so I ran the query below.

SELECT distinct(elb_response_code), 
     count(url) AS count 
FROM elb_logs 
GROUP BY elb_response_code

It works, but it returns every individual response code, as shown below.

**response count** 
401 1270 
201 1369 
422 342 
200 3568727 
400 1221 
404 444 
304 10435 
413 3 
206 30 
500 1542

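I want to sum all the 400, 401, 404, 413 and 422 responses together, and do the same for 2xx, 3xx and 5xx, so that the 4xx row is the total of (400, 401, 404, 413, 422). The result should look like this: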

**response count** 
4xx   52145 
2xx   1363224 
5xx   532

Source

2017-03-26 SQLadmin

Assuming all codes are 3 characters long:

select  substr (elb_response_code,1,1) || 'xx' as elb_response_code_prefix 
      ,count(*)        as cnt 

from  elb_logs 

group by 1
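To see what the query above produces, here is a minimal sketch that runs the same `substr` expression against an in-memory SQLite table from Python. SQLite stands in for Athena purely for illustration, the sample rows are made up rather than taken from the real logs, and `count_by_class` is a hypothetical helper name. An `order by` is added only to make the output order predictable.

```python
import sqlite3

# Illustrative rows only; these are NOT taken from the real ELB logs.
SAMPLE_CODES = ["200", "200", "201", "304", "400", "404", "422", "500"]

# Same expression as the answer: first character of the code plus 'xx'.
QUERY = """
select  substr(elb_response_code, 1, 1) || 'xx' as elb_response_code_prefix
      ,count(*) as cnt
from  elb_logs
group by 1
order by 1
"""


def count_by_class(codes):
    """Load the codes into a throwaway table and count rows per status class."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table elb_logs (elb_response_code text)")
        conn.executemany(
            "insert into elb_logs values (?)", [(code,) for code in codes]
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for prefix, cnt in count_by_class(SAMPLE_CODES):
        print(prefix, cnt)
```

Each row is assigned to a class by its first digit, so 400, 404 and 422 all land in the single `4xx` group.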

Here is a more generic solution, which pads the first character with 'x' up to the code's actual length:

select  rpad (substr (elb_response_code,1,1),length(elb_response_code),'x') 
         as elb_response_code_prefix 
      ,count(*) as cnt 

from  elb_logs 

group by 1
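The difference from the fixed `'xx'` suffix only shows up when a code is not exactly 3 characters long. The sketch below emulates the `rpad(substr(code, 1, 1), length(code), 'x')` expression in plain Python; `response_class` is a hypothetical name, and this is an approximation of the expression's behavior rather than something executed by Athena.

```python
def response_class(code: str) -> str:
    """Keep the first character and pad with 'x' up to the original length."""
    # code[:1] plays the role of substr(code, 1, 1);
    # ljust(len(code), "x") plays the role of rpad(..., length(code), 'x').
    return code[:1].ljust(len(code), "x")


if __name__ == "__main__":
    # Includes a 4-character value and a 1-character placeholder to show
    # that the padding follows the length of each input.
    for code in ["200", "404", "1000", "-"]:
        print(code, response_class(code))
```

A 3-character code gives the same label as before, while a shorter or longer value keeps its own length instead of always receiving two `x` characters.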

Source

2017-03-26 17:27:21

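Thanks, it works. Is it possible to show the response code values with xx, like 2xx, 3xx, in the results window? Right now it shows 2, 3, 4, 5. – SQLadmin

See the updated answer –

Awesome :), thanks Dudu – SQLadmin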
