2016-12-01 35 views

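Get the most recent forecast data with a multi-column group identifier

I have weather data with many locations and wind forecasts. I need the most recent as_of that falls before 10:00 on the day before the forecast time. I need it for every hour, every day, and every location.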

A location is defined as a unique lat/lon pair.

Full table schema, with relevant sample data:

CREATE SCHEMA weather;
CREATE TABLE weather.forecast 
    (
    foretime timestamp without time zone NOT NULL, 
    as_of timestamp without time zone NOT NULL, -- in UTC 
    summary text, 
    precipintensity numeric(8,4), 
    precipprob numeric(2,2), 
    temperature numeric(5,2), 
    apptemp numeric(5,2), 
    dewpoint numeric(5,2), 
    humidity numeric(2,2), 
    windspeed numeric(5,2), 
    windbearing numeric(4,1), 
    visibility numeric(5,2), 
    cloudcover numeric(4,2), 
    pressure numeric(6,2), 
    ozone numeric(5,2), 
    preciptype text, 
    lat numeric(8,6) NOT NULL, 
    lon numeric(9,6) NOT NULL, 
    CONSTRAINT forecast_pkey PRIMARY KEY (foretime, as_of, lat, lon) 
); 

INSERT INTO weather.forecast 
    (windspeed, foretime, as_of, lat, lon) 
VALUES 
    (11.19, '2/1/2016 8:00', '1/30/2016 23:00', 34.556, 28.345), 
    (10.98, '2/1/2016 8:00', '1/31/2016 5:00', 34.556, 28.345), 
    (10.64, '2/1/2016 8:00', '1/31/2016 11:00', 34.556, 28.345), 
    (10.95, '2/1/2016 8:00', '1/31/2016 8:00', 29.114, 16.277), 
    (10.39, '2/1/2016 8:00', '1/31/2016 23:00', 29.114, 16.277), 
    (9.22, '2/1/2016 8:00', '1/31/2016 5:00', 29.114, 16.277), 
    (10,  '2/1/2016 9:00', '1/30/2016 04:00', 34.556, 28.345), 
    (9.88, '2/1/2016 9:00', '1/31/2016 09:00', 34.556, 28.345), 
    (10.79, '2/1/2016 9:00', '1/30/2016 23:00', 34.556, 28.345), 
    (10.8, '2/1/2016 9:00', '1/31/2016 5:00', 29.114, 16.277), 
    (10.35, '2/1/2016 9:00', '1/31/2016 11:00', 29.114, 16.277), 
    (10.07, '2/1/2016 9:00', '1/31/2016 17:00', 29.114, 16.277) 
; 

Desired result format:

lat     lon     Foredate  foreHE  windspeed  as_of
34.556  28.345  2/1/2016  8       10.98      1/31/2016 5:00
34.556  28.345  2/1/2016  9       9.88       1/31/2016 9:00
29.114  16.277  2/1/2016  8       10.95      1/31/2016 8:00
29.114  16.277  2/1/2016  9       10.80      1/31/2016 5:00

Here is my code to get the correct as_of. Things go wrong when I try to join the wind speed back in.

SELECT   
    date_trunc('day', (a.foretime)) :: DATE AS Foredate, 
     extract(HOUR FROM (a.foretime)) AS foreHE, 
     a.lat, 
     a.lon, 
     max(a.as_of) - interval '5 hours' as latest_as_of 
FROM weather.forecast a 
WHERE date_trunc('day', foretime) :: DATE - as_of >= INTERVAL '14 hours' 
GROUP BY Foredate, foreHE, a.lat, a.lon 

Answer

The error you get when adding windspeed back in is this:

[42803] ERROR: column "a.windspeed" must appear in the GROUP BY clause or be used in an aggregate function 
    Position: 184 

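I can't really improve on PostgreSQL's error message, except perhaps by going into the theory a bit. Basically, when you use GROUP BY, you allow yourself the luxury of acting on subsets within the larger set, which is the table expressed by the rest of the query. But SQL doesn't let you iterate over those subsets; you have to tell the database what to compute for each one and let it return another flat list.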

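Of the two options Postgres offers, one is usually the obvious choice. For example, if you had left out a.lon, it would be clear that you were no longer grouping by the latitude/longitude pair, and you would agree that it belongs in the GROUP BY clause. But in this case, if you group by the actual measurement, each subset will contain only one row, which is useless. So at first glance you seem to need an aggregate. The second problem is that there is no aggregate that fits this problem. Sigh!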
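
To make the dilemma concrete, here is a small sketch of both options run against the sample table from the question. It is only an illustration of why neither works, not a proposed fix.

```sql
-- Option 1: add windspeed to GROUP BY.
-- Unless two forecasts happen to share a windspeed, every group is a
-- single row, so max(as_of) is just that row's own as_of.
SELECT foretime, lat, lon, windspeed, max(as_of) AS as_of
FROM weather.forecast
GROUP BY foretime, lat, lon, windspeed;

-- Option 2: wrap windspeed in an aggregate.
-- max(windspeed) and max(as_of) are computed independently, so the two
-- values can come from different rows of the same group.
SELECT foretime, lat, lon, max(windspeed) AS windspeed, max(as_of) AS as_of
FROM weather.forecast
GROUP BY foretime, lat, lon;
```

With the sample data, option 2 pairs windspeed 11.19 (issued 1/30/2016 23:00) with as_of 1/31/2016 11:00 for the 8:00 forecast at (34.556, 28.345), a combination that does not exist in the table.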

So here is my thinking. You need to find the primary key, which is (foretime, as_of, lat, lon), and you can get that directly with this query:

select 
    foretime, 
    max(as_of) as as_of, 
    lat, lon 
from weather.forecast 
group by foretime, lat, lon; 

Now you can join this back to the same table, forecast, to get the latest forecast:

select 
    date_trunc('day', a.foretime)::date as forecast_day, 
    extract(hour from a.foretime) as forecast_hour, 
    a.lat, a.lon, 
    f.windspeed, 
    a.as_of - interval '5 hours' as latest_as_of 
from weather.forecast f 
join (select 
     foretime, 
     max(as_of) as as_of, 
     lat, lon 
     from weather.forecast 
     group by foretime, lat, lon) a using (foretime, as_of, lat, lon); 

This produces the following report:

 forecast_day | forecast_hour |    lat    |    lon    | windspeed |    latest_as_of
--------------+---------------+-----------+-----------+-----------+---------------------
 2016-02-01   |             8 | 34.556000 | 28.345000 |     10.64 | 2016-01-31 06:00:00
 2016-02-01   |             8 | 29.114000 | 16.277000 |     10.39 | 2016-01-31 18:00:00
 2016-02-01   |             9 | 34.556000 | 28.345000 |      9.88 | 2016-01-31 04:00:00
 2016-02-01   |             9 | 29.114000 | 16.277000 |     10.07 | 2016-01-31 12:00:00
(4 rows)

There may be a more efficient way to do this with a correlated subquery, but I'm not sure how to write it.
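
For what it's worth, a correlated subquery version might look like the sketch below. Like the join above, it has no 10:00 cutoff, and I have not measured whether it is actually faster; that depends on the planner and the available indexes.

```sql
-- For each forecast row, keep it only if its as_of equals the latest
-- as_of recorded for the same (foretime, lat, lon).
SELECT
    date_trunc('day', f.foretime)::date AS forecast_day,
    extract(hour FROM f.foretime)       AS forecast_hour,
    f.lat, f.lon,
    f.windspeed,
    f.as_of - interval '5 hours'        AS latest_as_of
FROM weather.forecast f
WHERE f.as_of = (
    SELECT max(g.as_of)
    FROM weather.forecast g
    WHERE g.foretime = f.foretime
      AND g.lat = f.lat
      AND g.lon = f.lon
);
```

The inner query looks up rows by foretime, lat and lon, which are columns of the table's primary key, so the existing primary key index may help here.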

Edit: to match the output format:

select 
    a.lat, a.lon, 
    date_trunc('day', a.foretime)::date as forecast_day, 
    extract(hour from a.foretime) as forecast_hour, 
    f.windspeed, 
    a.as_of - interval '5 hours' as latest_as_of 
from weather.forecast f 
    join (select 
      foretime, 
      max(as_of) as as_of, 
      lat, lon 
     from weather.forecast 
     where date_trunc('day', foretime)::date - as_of >= interval '14 hours' 
     group by foretime, lat, lon) a using (foretime, as_of, lat, lon) 
order by lat desc, lon; 

Result:

    lat    |    lon    | forecast_day | forecast_hour | windspeed |    latest_as_of
-----------+-----------+--------------+---------------+-----------+---------------------
 34.556000 | 28.345000 | 2016-02-01   |             8 |     10.98 | 2016-01-31 00:00:00
 34.556000 | 28.345000 | 2016-02-01   |             9 |      9.88 | 2016-01-31 04:00:00
 29.114000 | 16.277000 | 2016-02-01   |             8 |     10.95 | 2016-01-31 03:00:00
 29.114000 | 16.277000 | 2016-02-01   |             9 |     10.80 | 2016-01-31 00:00:00
(4 rows)
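
Another option, as a sketch only: PostgreSQL's DISTINCT ON keeps the first row of each (lat, lon, foretime) group in ORDER BY order, so sorting by as_of DESC picks the latest qualifying forecast without a self-join. This assumes the same UTC cutoff as the WHERE clause above, rewritten as a plain comparison on as_of. I have not benchmarked it against the join.

```sql
SELECT DISTINCT ON (lat, lon, foretime)
    lat, lon,
    date_trunc('day', foretime)::date AS forecast_day,
    extract(hour FROM foretime)       AS forecast_hour,
    windspeed,
    as_of - interval '5 hours'        AS latest_as_of
FROM weather.forecast
-- Keep only forecasts issued no later than 10:00 (UTC, as stored) on the
-- day before foretime: midnight of the forecast day minus 14 hours.
WHERE as_of <= date_trunc('day', foretime) - interval '14 hours'
-- The leading ORDER BY columns must match the DISTINCT ON list;
-- as_of DESC makes the newest remaining forecast the one that is kept.
ORDER BY lat, lon, foretime, as_of DESC;
```

On the sample data this should return the same four rows as the join, though in a different order because of the ORDER BY.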

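I don't see how you handled my constraint that the latest forecast must come before 10 AM on the previous day. Some of these latest times are after 10 AM. That is the purpose of the WHERE clause in my query. – otterdog2000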

@otterdog2000 I've changed it to match your requirement –

Thanks. When I combine it with my full code, I can't get it to finish. The query just runs until I kill it. – otterdog2000