PostgreSQL: How to sum all attributes in a JSONB field?
2016-02-01

I am using Postgres 9.4. I have a table with a JSONB field:

     Column     │         Type         │    Modifiers
────────────────┼──────────────────────┼──────────────────
 id             │ integer              │ not null default
 practice_id    │ character varying(6) │ not null
 date           │ date                 │ not null
 pct_id         │ character varying(3) │
 astro_pu_items │ double precision     │ not null
 astro_pu_cost  │ double precision     │ not null
 star_pu        │ jsonb                │

I can query the raw value of the JSONB field just fine:

SELECT star_pu FROM mytable limit 1; 
star_pu │ {"statins_cost": 16790.692924903742, "hypnotics_adq": 18523.58385328709, "laxatives_cost": 8456.98405165182, "analgesics_cost": 48271.21822239242, "oral_nsaids_cost": 9911.336052088493, "antidepressants_adq": 186715.7, "antidepressants_cost": 26885.54622478343, "bronchodilators_cost": 26646.54899847902, "cox-2_inhibitors_cost": 2063.4652015406728, "antiplatelet_drugs_cost": 4844.798321177439, "drugs_for_dementia_cost": 3390.569564110721, "antiepileptic_drugs_cost": 44990.94756286502, "oral_antibacterials_cost": 21047.048353859234, "oral_antibacterials_item": 5096.6501798218205, "ulcer_healing_drugs_cost": 15999.05326260261, "lipid-regulating_drugs_cost": 24711.589440943662, "proton_pump_inhibitors_cost": 14545.398978447573, "inhaled_corticosteroids_cost": 50759.91062192373, "calcium-channel_blockers_cost": 11571.457036131978, "omega-3_fatty_acid_compounds_adq": 2026.0, "benzodiazepine_caps_and_tabs_cost": 1800.2581325567717, "bisphosphonates_and_other_drugs_cost": 2996.912924744617, "drugs_acting_on_benzodiazepine_receptors_cost": 2993.142806352308, "drugs_affecting_the_renin_angiotensin_system_cost": 20255.500615282508, "drugs_used_in_parkinsonism_and_related_disorders_cost": 9812.457888596877} 

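Now I want to SUM the JSONB values across the whole table, but I can't work out how to do it. Ideally I would get back a dictionary with the same keys as above, where each value is the sum for that key.

I can SUM a single JSONB attribute explicitly like this: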

SELECT date, SUM(total_list_size) as total_list_size, 
    SUM((star_pu->>'oral_antibacterials_item')::float) AS star_pu_oral_antibac_items 
    FROM mytable GROUP BY date ORDER BY date 

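But how can I calculate the sum of all the attributes in the JSONB field, and preferably return the whole field as a dictionary? Ideally I would get back something like: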

star_pu │ {"statins_cost": very-large-number, "hypnotics_adq": very-large-number, ... 

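I guess I could get each attribute manually by summing every key explicitly, but the whole reason I have a JSONB field is that there are lots of keys, and they may change.

It is safe to assume that the JSONB field only contains keys and values, i.e. it has depth 1.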

Answers

2

This query should do the job:

select date, json_object_agg(key, val) 
from (
    select date, key, sum(value::numeric) val 
    from mytable t, jsonb_each_text(star_pu) 
    group by date, key 
    ) s 
group by date; 

The keys in the resulting JSON value will be sorted alphabetically (a side effect of json_object_agg()). I don't know whether that matters.
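To try the approach without the real table, here is a minimal sketch that substitutes a two-row inline mytable. The sample dates and the keys a_cost, b_cost and c_adq are made up for illustration. It also drops the grouping by date, since the question asks for totals across the whole table. It relies only on jsonb_each_text() and json_object_agg(), both available in 9.4.

```sql
-- Hypothetical sample data standing in for the real mytable.
WITH mytable(date, star_pu) AS (
    VALUES
        (DATE '2016-01-01', '{"a_cost": 1.5, "b_cost": 2}'::jsonb),
        (DATE '2016-02-01', '{"a_cost": 3, "c_adq": 4}'::jsonb)
)
-- Sum every key over the whole table (no GROUP BY date).
SELECT json_object_agg(key, val) AS star_pu
FROM (
    -- jsonb_each_text() expands each object into (key, value) rows.
    SELECT key, sum(value::numeric) AS val
    FROM mytable, jsonb_each_text(star_pu)
    GROUP BY key
) s;
```

A key that is missing from some rows simply contributes no rows for them, so with this sample data a_cost should total 4.5 while b_cost and c_adq keep their single values.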

This is great, thank you. No, the sorting is fine. – Richard

I've just posted a related question: http://stackoverflow.com/questions/35130870/postgresql-how-to-sum-attributes-including-a-jsonb-field-and-retain-table-shap – Richard

0

There may be a better way, but at least this works:

WITH 
    keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable), 
    sums AS (SELECT key, sum((star_pu->>key)::float) AS total FROM keys, mytable GROUP BY key) 
    SELECT json_object(array_agg(key), array_agg(total::text))::jsonb FROM sums 

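Basically it breaks the jsonb up into rows to get the key names, sums the values for each key, aggregates the keys and totals into arrays, and builds a jsonb structure from them. Unfortunately there is no jsonb_object() function in 9.4, so we have to build json and then cast it to jsonb.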
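One detail of the query above: json_object(text[], text[]) builds every value as a JSON string, so the totals come back quoted. Here is a minimal sketch of an alternative, assuming PostgreSQL 9.5 or later (the question targets 9.4, where this aggregate is not available): jsonb_object_agg() builds jsonb directly and keeps the totals as numbers. The inline mytable and the keys a_cost and b_cost are made-up sample data.

```sql
-- Assumes PostgreSQL 9.5+; the inline mytable is hypothetical sample data.
WITH
    mytable(star_pu) AS (
        VALUES ('{"a_cost": 1.5, "b_cost": 2}'::jsonb),
               ('{"a_cost": 3}'::jsonb)
    ),
    keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
    sums AS (
        -- star_pu->>key is NULL where a row lacks the key; sum() skips NULLs.
        SELECT key, sum((star_pu->>key)::float) AS total
        FROM keys, mytable
        GROUP BY key
    )
SELECT jsonb_object_agg(key, total) AS star_pu FROM sums;
```

The keys and sums steps are unchanged from the answer. Only the final aggregation differs, which removes the array_agg() pair and the cast from json to jsonb.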

1

I wrote a Postgres extension that does exactly this. Once you have installed it, you can do:

SELECT jsonb_deep_sum(star_pu) FROM mytable; 

In my benchmark it takes 4 s on 20,000 rows, while @klin's answer takes 11 s.