How to use COUNT as a criterion in PostgreSQL

I have an existing table1 that contains "account", "tax_year", and other fields. I want to create a table2 with the records from table1 for which the frequency of CONCAT(account, tax_year) is 1 and the WHERE clause is satisfied.

For example, if table1 looks like this:

account year 
aaa 2014 
bbb 2016 
bbb 2016 
ddd 2014 
ddd 2014 
ddd 2015

then table2 should be:

account year 
aaa 2014 
ddd 2015

Here is my script:

DROP TABLE IF EXISTS table2; 
CREATE TABLE table2 AS 
SELECT 
    account::text, 
    tax_year::text, 
    building_number, 
    imprv_type, 
    building_style_code, 
    quality, 
    quality_description, 
    date_erected, 
    yr_remodel, 
    actual_area, 
    heat_area, 
    gross_area, 
    CONCAT(account, tax_year) AS unq 
FROM table1 
WHERE imprv_type=1001 and date_erected>0 and date_erected IS NOT NULL and quality IS NOT NULL and quality_description IS NOT NULL and yr_remodel>0 and yr_remodel IS NOT NULL and heat_area>0 and heat_area IS NOT NULL 
GROUP BY account, 
    tax_year, 
    building_number, 
    imprv_type, 
    building_style_code, 
    quality, 
    quality_description, 
    date_erected, 
    yr_remodel, 
    actual_area, 
    heat_area, 
    gross_area, 
    unq 
HAVING COUNT(unq)=1;

I have spent two days on this and still cannot figure out how to get it right. Thank you for your help!

2016-06-08 12B01

The correct way to count occurrences of the pair (account, tax_year) in table1:

select account, tax_year 
from table1 
where imprv_type=1001 -- and many more... 
group by account, tax_year 
having count(*) = 1;

So you should try:

create table table2 as 
select * 
from table1 
where (account, tax_year) in (
    select account, tax_year 
    from table1 
    where imprv_type=1001 -- and many more... 
    group by account, tax_year 
    having count(*) = 1 
    );
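
As a possible alternative to the IN subquery, the same count can be computed in a single pass with a window function. This is only a sketch: the alias key_count and the shortened column list are placeholders, and whether it runs faster on a large table has not been measured here. Unlike the query above as written, it keeps only rows that pass the filter themselves.

```sql
-- Sketch: count rows per (account, tax_year) with a window function,
-- then keep only the rows whose key occurs exactly once.
CREATE TABLE table2 AS
SELECT account, tax_year, imprv_type      -- list the other columns you need here
FROM (
    SELECT t.*,
           count(*) OVER (PARTITION BY account, tax_year) AS key_count  -- hypothetical alias
    FROM table1 AS t
    WHERE imprv_type = 1001               -- plus the remaining filter conditions
) AS counted
WHERE key_count = 1;
```

The window count is evaluated after the inner WHERE, so it counts only the rows that satisfy the filter, which matches the GROUP BY ... HAVING version.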

2016-06-08 21:04:17 klin

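Thank you! My source table has 11,755,200 rows and 71 columns. The query has been running for 20 hours and is still going. Is it normal for a dataset of this size to take this long? I am new to Postgres. – 12B01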

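The query is indeed expensive. The server has most likely run out of memory and started swapping. With a table of this size you should use a special approach, e.g. do the work in stages by using a where clause to split the data into smaller logical parts. – klin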
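
The staged approach suggested above might look roughly like the following. It is a sketch under assumptions that the thread does not confirm: tax_year is taken to be an integer column, the index name is made up, and 2014 is just one example slice. Because tax_year is part of the key, every (account, tax_year) group lies entirely inside one slice, so slicing by year does not change the counts.

```sql
-- Optional: an index on the key columns may help each slice (hypothetical name).
CREATE INDEX table1_account_tax_year_idx ON table1 (account, tax_year);

-- Create an empty table2 with the same columns as table1.
CREATE TABLE table2 (LIKE table1);

-- Run this statement once per tax_year value (2014 is only an example,
-- and assumes tax_year is an integer column).
INSERT INTO table2
SELECT *
FROM table1
WHERE tax_year = 2014
  AND (account, tax_year) IN (
      SELECT account, tax_year
      FROM table1
      WHERE tax_year = 2014
        AND imprv_type = 1001        -- and the other filter conditions
      GROUP BY account, tax_year
      HAVING count(*) = 1
  );
```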

COUNT() = 1 is equivalent to NOT EXISTS (another row with the same key fields):

SELECT 
    account, tax_year 
    -- ... maybe more fields ... 
FROM table1 t1 
WHERE NOT EXISTS (SELECT * 
    FROM table1 nx 
    WHERE nx.account = t1.account -- same key field(s) 
    AND nx.tax_year = t1.tax_year 
    AND nx.ctid <> t1.ctid   -- but a different row! 
    );

Note: I replaced the concatenated key fields in COUNT(CONCAT(account, tax_year)) with a match on the composite key (account, tax_year).
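
To see what this returns on the sample data from the question, here is a small self-contained check. The temporary table sample_table1 and its column types are assumptions made for illustration only.

```sql
-- Hypothetical scratch table holding the sample rows from the question.
CREATE TEMP TABLE sample_table1 (account text, tax_year integer);

INSERT INTO sample_table1 (account, tax_year) VALUES
    ('aaa', 2014), ('bbb', 2016), ('bbb', 2016),
    ('ddd', 2014), ('ddd', 2014), ('ddd', 2015);

SELECT account, tax_year
FROM sample_table1 t1
WHERE NOT EXISTS (
    SELECT 1
    FROM sample_table1 nx
    WHERE nx.account = t1.account       -- same key fields
      AND nx.tax_year = t1.tax_year
      AND nx.ctid <> t1.ctid            -- but a physically different row
)
ORDER BY account, tax_year;
-- Returns two rows: (aaa, 2014) and (ddd, 2015).
```

Both bbb 2016 rows and both ddd 2014 rows are excluded, because each of them has a sibling row with the same key and a different ctid.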

2016-06-08 19:47:07 wildplasser

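Thank you for the quick reply! I think your query will return all distinct records, not only those where the frequency of ("account", "tax_year") is 1. Taking table1 from my question as an example, NOT EXISTS would return aaa 2014, bbb 2016, ddd 2014, ddd 2015. But what I really need is only aaa 2014 and ddd 2015. – 12B01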

You can add the extra conditions to the where clause. (Note: you don't *need* GROUP BY, because this method does not use an aggregate function.) – wildplasser
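
Adding the extra conditions to the NOT EXISTS form could look like the sketch below. Only imprv_type = 1001 is shown; the remaining conditions from the question would be added the same way. Whether to repeat the filter inside the subquery is a choice the thread does not settle, so it is marked as optional.

```sql
SELECT t1.*
FROM table1 t1
WHERE t1.imprv_type = 1001              -- plus the remaining conditions from the question
  AND NOT EXISTS (
      SELECT 1
      FROM table1 nx
      WHERE nx.account = t1.account
        AND nx.tax_year = t1.tax_year
        AND nx.ctid <> t1.ctid          -- a different row with the same key
        AND nx.imprv_type = 1001        -- optional: repeat the filter here only if
                                        -- duplicates that fail it should be ignored
  );
```

With the filter repeated in the subquery, a key counts as unique when exactly one row passes the filter, which matches the GROUP BY ... HAVING count(*) = 1 version. Without it, a key counts as unique only if no other row in table1 shares it at all.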