2016-07-06 82 views

Hive window functions across multiple date ranges

I have a table that looks like the following:

TagName | DateTime   | Value 

TagName1|2016-07-06 09:49:34|14 
TagName1|2016-07-06 09:50:34|15 
TagName1|2016-07-06 09:51:34|18 
TagName2|2016-07-03 02:13:34|421 
TagName2|2016-07-03 03:13:34|422 
TagName3|2016-07-01 03:13:34|14 

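What I want to do is compute multiple aggregates over this table for each tag name (such as sum, weighted average, latest value, count, etc.).

This is what I have so far: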

SELECT * 
FROM 
(
SELECT 
t1.TagName, 
reflect("java.util.UUID", "randomUUID") as rv_id, 
t2.item_id as rs_id, 
from_unixtime(unix_timestamp()) as tstamp, 
t1.datetime as last_date, 
t1.value as last_value, 
t1.minimum as minimum, 
t1.maximum as maximum, 
t1.count as count, 
t1.total as total, 
t1.average as average, 
SUM(t1.weight_value) OVER (PARTITION BY TagName) as weighted_average, 
t1.Rank as Rank 
FROM 
(SELECT 
TagName, 
value, 
datetime, 
MIN(value) OVER (PARTITION BY TagName) as minimum, 
MAX(value) OVER (PARTITION BY TagName) as maximum, 
ROW_NUMBER() OVER (PARTITION BY TagName ORDER BY datetime DESC) as Rank, 
SUM(value) OVER (PARTITION BY TagName) as total, 
COUNT(value) OVER (PARTITION BY TagName) as count, 
AVG(value) OVER (PARTITION BY TagName) as average, 
(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime))/ 
(SUM(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime)) OVER (PARTITION BY TagName)) * 
(LAG(value,1) OVER (PARTITION BY TagName ORDER BY datetime)) as weight_value 
FROM raw.analog_history_dynamic 
WHERE par_date > date_format(date_sub(to_date(current_date), 5),'yyyyMMdd')) t1 
LEFT JOIN meta.item_meta t2 
ON t1.TagName = t2.name) t3 
WHERE t3.Rank =1; 
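
As a reading of the weight_value expression above (an interpretation, not something stated by the poster), the weighted average looks like a time-weighted average in which each previous value is weighted by the time elapsed until the next sample. With samples (t_1, v_1), ..., (t_n, v_n) for one tag, ordered by datetime, and the first row contributing nothing because its LAG is NULL:

```latex
\[
\bar{v}_w
  = \sum_{i=2}^{n} \frac{t_i - t_{i-1}}{\sum_{j=2}^{n} (t_j - t_{j-1})}\, v_{i-1}
  = \frac{\sum_{i=2}^{n} (t_i - t_{i-1})\, v_{i-1}}{t_n - t_1}
\]
```

For the three TagName1 rows in the sample table, the gaps are 60 s and 60 s, so this gives (60*14 + 60*15) / 120 = 14.5; the latest value, 18, does not enter the average.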

In this case, I am looking at the last 5 days:

WHERE par_date > date_format(date_sub(to_date(current_date), 5),'yyyyMMdd')) 

Besides the 5-day range, there are several other ranges I need to calculate:

-- 1min (unix_timestamp() returns seconds, so 1 minute = 60)
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 60; 

-- 5Min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 300; 

-- 10 Min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 600; 

-- 30 Min 
WHERE par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 1800; 

-- 1 Month 
WHERE par_date > date_format(date_sub(to_date(current_date), 30),'yyyyMMdd'); 

-- 2 Month 
WHERE par_date > date_format(date_sub(to_date(current_date), 60),'yyyyMMdd'); 

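At a minimum, I would like to combine the ranges that fall under the same partitions, so all of the <1 day aggregates together (the table is partitioned by date).

Any advice or suggestions on combining all of these calculations within one query, rather than running each one separately with a different WHERE condition, would be appreciated.

Thanks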

Answers

You could move the conditions from your WHERE clauses into a CASE WHEN expression in the SELECT list, for example:

SELECT * 
FROM 
(
SELECT 
t1.TagName, 
reflect("java.util.UUID", "randomUUID") as rv_id, 
t2.item_id as rs_id, 
from_unixtime(unix_timestamp()) as tstamp, 
t1.datetime as last_date, 
t1.value as last_value, 
t1.flag, 
t1.minimum as minimum, 
t1.maximum as maximum, 
t1.count as count, 
t1.total as total, 
t1.average as average, 
SUM(t1.weight_value) OVER (PARTITION BY TagName) as weighted_average, 
t1.Rank as Rank 
FROM 
(SELECT 
TagName, 
value, 
datetime, 
case 
when par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 60 -- unix_timestamp() is in seconds 
then 'flag_1min' 
when par_date > date_format(date_sub(to_date(current_date), 1),'yyyyMMdd') 
and unix_timestamp(datetime) > unix_timestamp(current_timestamp) - 300 
then 'flag_5min' 
-- when ....... and so on 
end as flag, 
MIN(value) OVER (PARTITION BY TagName) as minimum, 
MAX(value) OVER (PARTITION BY TagName) as maximum, 
ROW_NUMBER() OVER (PARTITION BY TagName ORDER BY datetime DESC) as Rank, 
SUM(value) OVER (PARTITION BY TagName) as total, 
COUNT(value) OVER (PARTITION BY TagName) as count, 
AVG(value) OVER (PARTITION BY TagName) as average, 
(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime))/ 
(SUM(unix_timestamp(datetime) - LAG(unix_timestamp(datetime),1) OVER (PARTITION BY TagName ORDER BY datetime)) OVER (PARTITION BY TagName)) * 
(LAG(value,1) OVER (PARTITION BY TagName ORDER BY datetime)) as weight_value 
FROM raw.analog_history_dynamic 
WHERE par_date > date_format(date_sub(to_date(current_date), 5),'yyyyMMdd')) t1 
LEFT JOIN meta.item_meta t2 
ON t1.TagName = t2.name 
group by t1.TagName, 
t1.value, 
t1.datetime, 
t1.flag) t3 
WHERE t3.Rank =1; 

NOTE: your original query has no GROUP BY clause even though it uses aggregate functions.
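
One thing to watch with a single flag column is that each row lands in exactly one bucket, so a row from the last minute is not counted toward the 5-minute range. A possible alternative, sketched here as an untested outline rather than a definitive solution, is conditional aggregation: put the CASE inside each window aggregate so every range gets its own columns from one scan. The names age_sec, rn and the _1min / _5min / _5d suffixes are made up for illustration, and the sketch assumes unix_timestamp() returns seconds (as it does in Hive), so one minute is 60.

```sql
-- Sketch only: one scan of the widest range (5 days), one set of columns per range.
-- age_sec, rn and the *_1min / *_5min / *_5d column names are hypothetical.
SELECT *
FROM (
  SELECT
    s.TagName,
    s.datetime AS last_date,
    s.value    AS last_value,
    -- 5-day range: every scanned row qualifies
    COUNT(s.value) OVER (PARTITION BY s.TagName) AS count_5d,
    AVG(s.value)   OVER (PARTITION BY s.TagName) AS average_5d,
    -- shorter ranges: CASE yields NULL outside the range, and aggregates skip NULLs
    COUNT(CASE WHEN s.age_sec < 300 THEN s.value END) OVER (PARTITION BY s.TagName) AS count_5min,
    AVG(CASE WHEN s.age_sec < 300 THEN s.value END)   OVER (PARTITION BY s.TagName) AS average_5min,
    COUNT(CASE WHEN s.age_sec < 60 THEN s.value END)  OVER (PARTITION BY s.TagName) AS count_1min,
    AVG(CASE WHEN s.age_sec < 60 THEN s.value END)    OVER (PARTITION BY s.TagName) AS average_1min,
    -- rn = 1 marks the latest row per tag, which carries last_date / last_value
    ROW_NUMBER() OVER (PARTITION BY s.TagName ORDER BY s.datetime DESC) AS rn
  FROM (
    SELECT
      TagName, value, datetime,
      -- age of each sample in seconds, computed once and reused by every range
      unix_timestamp(current_timestamp) - unix_timestamp(datetime) AS age_sec
    FROM raw.analog_history_dynamic
    WHERE par_date > date_format(date_sub(to_date(current_date), 5), 'yyyyMMdd')
  ) s
) t
WHERE t.rn = 1;
```

To cover the 1-month and 2-month ranges in the same query, the inner WHERE would have to scan the widest range, which reads more partitions, so it may still be worth keeping the sub-day ranges in a separate, cheaper query. The weighted average is not shown; its LAG differences would need to be restricted per range in the same way.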

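I don't think a GROUP BY is needed, since all of my aggregates use OVER (PARTITION BY ...), which does the grouping itself. Trying to add the GROUP BY raises an error. – scrayon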
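
To illustrate the distinction raised in this comment, here is a minimal sketch of the two forms, using the table and columns from the question. A window aggregate keeps one output row per input row and takes no GROUP BY, while a plain aggregate collapses rows and requires one.

```sql
-- Window aggregate: no GROUP BY; every input row is returned,
-- each carrying the average for its own TagName.
SELECT TagName, value, AVG(value) OVER (PARTITION BY TagName) AS average
FROM raw.analog_history_dynamic;

-- Plain aggregate: GROUP BY is required; one row per TagName is returned.
SELECT TagName, AVG(value) AS average
FROM raw.analog_history_dynamic
GROUP BY TagName;
```

This is why the original query filters on Rank = 1 afterwards: the window form repeats the per-tag aggregates on every row, and the filter keeps only the latest row for each tag.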