Avoiding loops when computing running average / statistics columns

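I would like to speed up the following (PostgreSQL) code. I suspect that getting rid of (some of) the loops would help, but I don't see how to do that. Any suggestions for speeding it up are welcome. Thanks in advance!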

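The code computes some statistics (mean, slope) for each column over different sections of the data. A section is determined by a sliding time window (e.g. 60 minutes).

The code below loops over each of the columns I am interested in. For each column, I move the time window along step by step and compute the statistics of the values that fall inside it.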

for col_name in ..... loop -- a list of column names
truncate small_table;   -- where statistics are temporarily stored 
for cur in select time from big_table loop 
    execute 'select regr_slope('|| col_name ||', time) as slope,' 
     || ' avg(' || col_name || ') as mean' 
     || ' from big_table where' 
     || ' time <=' || cur.time 
     || ' and time >=' || cur.time-60 
     into result; 

    execute 'insert into small_table values($1,$2,$3)' 
     using cur.time, result.slope, result.mean; 
end loop; 

execute 'update big_table set ' 
    || col_name || '_slope = small_table.slope, ' 
    || col_name || '_mean = small_table.mean ' 
    || ' where big_table.time=small_table.time'; 
end loop;

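small_table, where the results are stored temporarily, was introduced to avoid multiple updates on big_table.

There are actually quite a lot of columns (about 50). Could that be another factor slowing things down?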

2013-08-30 Mu W

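Does big_table have an index on the time column? – Laurence

Yes, it does. Thanks for clarifying. –

Is there a fixed time interval between data points, or are they randomly spaced? – Laurence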

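If you dynamically generate SQL following the pattern below, you can at least do all of this in a single query (obviously you would need to iterate over all the columns and add them). I'm not sure whether it will perform any better, so I would test the performance before worrying about building the SQL in code.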

Update 
    big_table b 
Set 
    field1_slope = x.field1_slope, 
    field1_mean = x.field1_mean, 
    field2_slope = x.field2_slope, 
    field2_mean = x.field2_mean 
From (
    Select 
     b1.time, 
     regr_slope(b2.field1, b2.time) field1_slope, 
     avg(b2.field1) field1_mean, 
     regr_slope(b2.field2, b2.time) field2_slope, 
     avg(b2.field2) field2_mean 
    From 
     big_table b1 
      Inner Join 
     big_table b2 
      On b2.time >= b1.time and b2.time < b1.time + 60 
    Group By 
     b1.time 
    ) x 
Where 
    b.time = x.time;
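
As a rough sketch of how the statement above might be generated for many columns, the PL/pgSQL block below builds the SET list and the aggregate list from an array of column names and then runs a single UPDATE. The array contents (`field1`, `field2`) and the variable names `cols`, `set_list` and `agg_list` are placeholders for illustration. The join condition is copied unchanged from the statement above, and the `time` column is assumed to be numeric, as in the question.

```sql
DO $do$
DECLARE
    -- Hypothetical column list: replace with the real column names.
    cols     text[] := ARRAY['field1', 'field2'];
    set_list text;  -- "field1_slope = x.field1_slope, field1_mean = x.field1_mean, ..."
    agg_list text;  -- "regr_slope(b2.field1, b2.time) AS field1_slope, avg(b2.field1) AS field1_mean, ..."
BEGIN
    -- Build both comma-separated lists in one pass; %I quotes identifiers safely.
    SELECT string_agg(format('%1$I = x.%1$I, %2$I = x.%2$I',
                             c || '_slope', c || '_mean'), ', '),
           string_agg(format('regr_slope(b2.%1$I, b2.time) AS %2$I, avg(b2.%1$I) AS %3$I',
                             c, c || '_slope', c || '_mean'), ', ')
      INTO set_list, agg_list
      FROM unnest(cols) AS t(c);

    -- One UPDATE for all columns, same shape as the hand-written statement.
    EXECUTE format(
        'UPDATE big_table b SET %s
           FROM (SELECT b1.time, %s
                   FROM big_table b1
                   JOIN big_table b2
                     ON b2.time >= b1.time AND b2.time < b1.time + 60
                  GROUP BY b1.time) x
          WHERE b.time = x.time',
        set_list, agg_list);
END
$do$;
```

Whether this is faster than the original loop still has to be measured. It only removes the per-column and per-row round trips, not the cost of the self-join.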

I'm not very familiar with PostgreSQL, so there may be a way to eliminate one of the references to the big table.
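
One possible way to drop the second reference to big_table is a window function with a RANGE frame. This is only a sketch under two assumptions that do not come from the thread: the server is PostgreSQL 11 or later (earlier versions do not accept an offset in a RANGE frame), and `time` is a numeric column. Note that the frame here follows the question's window (from `time - 60` up to and including `time`), whereas the join above looks forward from `time` to just before `time + 60`.

```sql
-- Assumes PostgreSQL 11+ and a numeric "time" column.
UPDATE big_table b
SET field1_slope = x.field1_slope,
    field1_mean  = x.field1_mean
FROM (
    SELECT time,
           -- Built-in aggregates can be used as window functions via OVER.
           regr_slope(field1, time) OVER w AS field1_slope,
           avg(field1)              OVER w AS field1_mean
    FROM big_table
    -- Frame: all rows whose time lies in [current time - 60, current time].
    WINDOW w AS (ORDER BY time RANGE BETWEEN 60 PRECEDING AND CURRENT ROW)
) x
WHERE b.time = x.time;
```

Further columns can be added as extra `regr_slope(...) OVER w` and `avg(...) OVER w` pairs that reuse the same named window. As with the join version, the performance would need to be tested on the real data.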

Example SQL Fiddle

Another way with cursors

2013-08-30 22:22:16 Laurence

This works nicely: the combination of the inner join, the grouping and the explicit assignment. Thanks! –