2015-05-06 36 views

Which is more efficient: several inserts vs. a single insert with unions?

I have a large table (~6M rows, 41 columns) in PostgreSQL, as follows:

id | answer1 | answer2 | answer3 | ... | answer40 
1 | xxx  | yyy  | null | ... | null 
2 | xxx  | null | null | ... | null 
3 | xxx  | null | zzz  | ... | aaa 

Note that each row has many null columns; I only want the columns that contain data.

I want to normalize it to get this:

id | answers 
1 | xxx 
1 | yyy 
2 | xxx 
3 | xxx 
3 | zzz 
... 
3 | aaa 

The question is: which is more efficient/faster, several inserts or a single insert with many unions?

Option 1

create table new_table as
select id, answer1 from my_table where answer1 is not null 
union 
select id, answer2 from my_table where answer2 is not null 
union 
select id, answer3 from my_table where answer3 is not null 
union ... 
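
One caveat, offered as a sketch rather than as part of the original question: plain union removes duplicate rows from the combined result, which adds a de-duplication step and means option 1 can return fewer rows than option 2 when the same id carries the same value in two answer columns. Assuming duplicates should be kept, union all avoids that step. The column alias answers is an assumption taken from the target layout above, and only the first three columns are shown:

```sql
-- Sketch, assuming duplicate (id, answers) pairs should be kept.
-- union all appends each branch without the de-duplication that plain union performs.
create table new_table as
select id, answer1 as answers from my_table where answer1 is not null
union all
select id, answer2 from my_table where answer2 is not null
union all
select id, answer3 from my_table where answer3 is not null;
-- the remaining answer columns would follow the same pattern
```

With union all, the result of the single statement matches what the separate inserts in option 2 would produce, so the two options are compared on equal terms.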

Option 2

create table new_table as select id, answer1 from my_table where answer1 is not null;
insert into new_table select id, answer2 from my_table where answer2 is not null; 
insert into new_table select id, answer3 from my_table where answer3 is not null; 
... 

Option 3: is there a better way to do this?

Answers

Option 2 should be faster.

Wrap all the statements in a begin-commit block to save the time of committing each statement individually.
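
A minimal sketch of that wrapping, using the table and column names from the question. The alias answers is an assumption taken from the target layout, and only the first three columns are shown:

```sql
-- Sketch: run every statement of option 2 inside one transaction,
-- so there is a single commit at the end instead of one per statement.
begin;

create table new_table as
select id, answer1 as answers from my_table where answer1 is not null;

insert into new_table
select id, answer2 from my_table where answer2 is not null;

insert into new_table
select id, answer3 from my_table where answer3 is not null;

-- the remaining answer columns would follow the same pattern

commit;
```

Because create table is transactional in PostgreSQL, a failure in any insert rolls back the whole block, including the new table.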

For faster selects, make sure the filtered columns (as in where answer1 is not null) have indexes.
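
As a rough illustration, such an index could look like the following. The index name is hypothetical, and whether it helps is an assumption to verify: when most rows have a non-null value, the planner may still prefer a sequential scan, so it is worth checking the plan with explain before building 40 indexes.

```sql
-- Hypothetical index name; indexes the column used in the "is not null" filter.
create index my_table_answer1_idx on my_table (answer1);

-- Show the plan the planner chooses for the filtered select,
-- to see whether the index is used at all.
explain
select id, answer1 from my_table where answer1 is not null;
```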

Thx! It runs fast :) – ArKano