2013-10-22 49 views

回答

37

從該演示使用查詢:http://www.slideshare.net/AmazonWebServices/amazon-redshift-best-practices

分析磁盤空間使用情況集羣:

select 
    trim(pgdb.datname) as Database, 
    trim(pgn.nspname) as Schema, 
    trim(a.name) as Table, 
    b.mbytes, 
    a.rows 
from (
    select db_id, id, name, sum(rows) as rows 
    from stv_tbl_perm a 
    group by db_id, id, name 
) as a 
join pg_class as pgc on pgc.oid = a.id 
join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
join pg_database as pgdb on pgdb.oid = a.db_id 
join (
    select tbl, count(*) as mbytes 
    from stv_blocklist 
    group by tbl 
) b on a.id = b.tbl 
order by mbytes desc, a.db_id, a.name; 

分析節點之間的表分配:一個模式過濾器

select slice, col, num_values, minvalue, maxvalue 
from svv_diskusage 
where name = '__INSERT__TABLE__NAME__HERE__' and col = 0 
order by slice, col; 
+0

嘿,檢查我的評論,你的查詢存在潛在的錯誤。 – Diego

1

添加所有者和到以上查詢:

select 
cast(use.usename as varchar(50)) as owner, 
trim(pgdb.datname) as Database, 
trim(pgn.nspname) as Schema, 
trim(a.name) as Table, 
b.mbytes, 
a.rows 
from 
(select 
    db_id, 
    id, 
    name, 
    sum(rows) as rows 
    from stv_tbl_perm a 
    group by db_id, id, name 
) as a 
join pg_class as pgc on pgc.oid = a.id 
left join pg_user use on (pgc.relowner = use.usesysid) 
join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
    -- leave out system schemas 
    and pgn.nspowner > 1 
join pg_database as pgdb on pgdb.oid = a.db_id 
join 
    (select 
    tbl, 
    count as mbytes 
    from stv_blocklist 
    group by tbl 
) b on a.id = b.tbl 
order by mbytes desc, a.db_id, a.name; 
+0

count應該是count(*) –

+0

或count(blocknum) – rohitkulky

0

剛想過我會擴大這個,因爲我面臨着一個不均勻分佈的問題。我添加了一些鏈接和字段,以便按節點和切片分析空間。同時添加的最大/最小值,並且每片價值數0列

select 
cast(use.usename as varchar(50)) as owner, 
trim(pgdb.datname) as Database, 
trim(pgn.nspname) as Schema, 
trim(a.name) as Table, 
a.node, 
a.slice, 
b.mbytes, 
a.rows, 
a.num_values, 
a.minvalue, 
a.maxvalue 
from 
(select 
    a.db_id, 
    a.id, 
    s.node, 
    s.slice, 
    a.name, 
    d.num_values, 
    d.minvalue, 
    d.maxvalue, 
    sum(rows) as rows 
    from stv_tbl_perm a 
    inner join stv_slices s on a.slice = s.slice 
    inner join (
    select tbl, slice, sum(num_values) as num_values, min(minvalue) as minvalue, max(maxvalue) as maxvalue 
    from svv_diskusage 
    where col = 0 
    group by 1, 2) d on a.id = d.tbl and a.slice = d.slice 
    group by 1, 2, 3, 4, 5, 6, 7, 8 
) as a 
join pg_class as pgc on pgc.oid = a.id 
left join pg_user use on (pgc.relowner = use.usesysid) 
join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
    -- leave out system schemas 
    and pgn.nspowner > 1 
join pg_database as pgdb on pgdb.oid = a.db_id 
join 
    (select 
    tbl, 
    slice, 
    count(*) as mbytes 
    from stv_blocklist 
    group by tbl, slice 
) b on a.id = b.tbl 
    and a.slice = b.slice 
order by mbytes desc, a.db_id, a.name, a.node; 
9

我知道這個問題是舊的,已經接受了答案,但我必須指出的是,答案是錯誤的。 查詢以「mb」輸出的內容實際上是「塊數」。只有塊大小爲1MB(這是默認值),答案纔是正確的。

如果塊大小不同(在我的情況下,例如是256K),則必須將塊數乘以其大小(以字節爲單位)。我建議如下修改您的查詢,我的塊大小相乘以字節爲單位(262144個字節)塊的數量,然後通過(1024 * 1024),以MB爲單位的總分爲輸出:

select 
    trim(pgdb.datname) as Database, 
    trim(pgn.nspname) as Schema, 
    trim(a.name) as Table, 
    b.mbytes as previous_wrong_value, 
    (b.mbytes * 262144)::bigint/(1024*1024) as "Total MBytes", 
    a.rows 
from (
    select db_id, id, name, sum(rows) as rows 
    from stv_tbl_perm a 
    group by db_id, id, name 
) as a 
join pg_class as pgc on pgc.oid = a.id 
join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
join pg_database as pgdb on pgdb.oid = a.db_id 
join (
    select tbl, count(blocknum) as mbytes 
    from stv_blocklist 
    group by tbl 
) b on a.id = b.tbl 
order by mbytes desc, a.db_id, a.name; 
+0

是否可以在redshift中更改塊大小?我一直在尋找這方面的信息,但沒有找到任何方法。 –

+0

我相信你可以。在以前的Paraccel(實際Actian Matrix - redshift的前身)上,您可以通過更改padb.conf中的block_size的值來控制它。在紅移上應該是同一行上的東西 – Diego