2016-12-05 52 views
2

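Estimating table size in SQL Server

I need to estimate the database size as a prerequisite, so I am trying to understand how SQL Server stores data, using the example below.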

In my SQL Server database, I have a table named InfoComp which contains 4 columns:

IdInfoComp : Integer Not Null (PK) 
IdDefinition : Integer Not Null (FK) 
IdObject : Integer Not Null (FK) 
Value : NVarChar(Max) Not Null 

I want to estimate the size of this table. In real usage, I can get the average length of the data stored in Value with this SQL query:

SELECT AVG(LEN(Value)) FROM InfoComp 
Result : 8 

So my calculation appears to be (in bytes):

(Size(IdInfoComp) + Size(IdDefinition) + Size(IdObject) + AVG Size(Value)) * Rows count 

(4 + 4 + 4 + ((8 * 2) + 2)) * NbRows 

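However, when I apply this calculation to a real case, it is wrong. In my case I have 3,250,273 rows, so the result should be about 92 MB, but the MS SQL report says:

(Data) 147 888 KB, (Indexes) 113 072 KB, and (Reserved) 261 160 KB.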

Where am I going wrong?
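
For reference, the arithmetic in the question can be reproduced with a short Python sketch. All figures come from the question itself; the variable names are only illustrative, and the "2 bytes of overhead" for Value is the asker's assumption, not a verified storage rule.

```python
# Figures taken from the question; nothing here is measured independently.
ROWS = 3_250_273
FIXED_BYTES = 4 + 4 + 4          # three int columns
AVG_CHARS = 8                    # average LEN(Value) reported in the question
VALUE_BYTES = AVG_CHARS * 2 + 2  # nvarchar: 2 bytes per character + 2 bytes assumed overhead

estimated_bytes = (FIXED_BYTES + VALUE_BYTES) * ROWS
estimated_mb = estimated_bytes / 1024 / 1024

reported_data_kb = 147_888       # "data" figure from the report
actual_bytes_per_row = reported_data_kb * 1024 / ROWS

print(f"estimated: {estimated_bytes} bytes ({estimated_mb:.2f} MB)")
print(f"reported data: {actual_bytes_per_row:.1f} bytes per row "
      f"vs {FIXED_BYTES + VALUE_BYTES} estimated")
```

The reported data size works out to roughly 46 to 47 bytes per row, against the 30 bytes used in the estimate, which suggests that some per-row or per-page storage cost is missing from the formula.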

+3

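I'm not sure, but a quick Google search turned up an official MSDN page with more specific instructions... maybe try that? https://msdn.microsoft.com/en-us/library/ms175991.aspx (it looks like it may be calculating the size of an index, which is quite different from the overall data footprint) – jleach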

+0

I am actually trying that, but I still get some difference (nearly 10 MB). – Cedric

+1

Instead of using the average, you could add up the total length of `Value`. –

Answers

2

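Try this... it got me close. I used the MSDN article to create it. You can set the number of rows. It runs against every table in the database, including indexes. It does not currently handle columnstore, nor does it take relationships into account. It simply applies the row count estimate to every table.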

/*Do NOT change this section*/ 
GO 
CREATE TABLE RowSizes (TypeName VARCHAR(30), TableName VARCHAR(255), IndexName VARCHAR(255), Null_Bitmap SMALLINT, VariableFieldSize BIGINT, FixedFieldSize BIGINT, Row_Size BIGINT, LOBFieldSize BIGINT); 
CREATE TABLE LeafSizes (TypeName VARCHAR(30), TableName VARCHAR(255), IndexName VARCHAR(255), Row_Size BIGINT, Rows_Per_Page BIGINT, Free_Rows_Per_Page BIGINT, Non_Leaf_Levels BIGINT, Num_Leaf_Pages BIGINT, Num_Index_Pages BIGINT, Leaf_space_used_bytes BIGINT); 
GO 
CREATE PROCEDURE dbo.cp_CalcIndexPages 
    @IndexType VARCHAR(20) 
AS 
BEGIN 
    DECLARE @IndexName VARCHAR(255) 
     , @TableName varchar(255) 
     , @Non_Leaf_Levels bigint = 127 
     , @Rows_Per_Page bigint = 476 
     , @Num_Leaf_Pages bigint =10000; 

    WHILE EXISTS(SELECT TOP 1 1 FROM dbo.LeafSizes WHERE TypeName = @IndexType AND Num_Index_Pages = 0)-- AND IndexName = 'PK_ProcessingMessages') 
    BEGIN 
     SELECT TOP 1 @IndexName = IndexName 
      , @TableName = TableName 
      , @Non_Leaf_Levels = Non_Leaf_Levels 
      , @Rows_Per_Page = Rows_Per_Page 
      , @Num_Leaf_Pages = Num_Leaf_Pages 
     FROM dbo.LeafSizes 
     WHERE TypeName = @IndexType 
      AND Num_Index_Pages = 0; 

     DECLARE @Counter INT = 1 
      , @Num_Index_Pages INT = 0; 

     WHILE @Counter <= @Non_Leaf_Levels 
     BEGIN 
      BEGIN TRY 

      SELECT @Num_Index_Pages += ROUND(CASE WHEN @Num_Leaf_Pages/POWER(@Rows_Per_Page, @Counter) < CONVERT(FLOAT, 1) THEN 1 ELSE @Num_Leaf_Pages/POWER(@Rows_Per_Page, @Counter) END, 0) 
      END TRY 

      BEGIN CATCH 
       SET @Num_Index_Pages += 1 
      END CATCH 

      SET @Counter += 1 
     END 

     IF @Num_Index_Pages = 0 
      SET @Num_Index_Pages = 1; 

     UPDATE dbo.LeafSizes 
     SET Num_Index_Pages = @Num_Index_Pages 
      , Leaf_space_used_bytes = 8192 * @Num_Index_Pages 
     WHERE TableName = @TableName 
      AND IndexName = @IndexName; 

    END 
END 
GO 
/*Do NOT change above here*/ 

--Set parameters here 
DECLARE @NumRows INT = 1000000 --Number of rows for estimate 
    ,@VarPercentFill money = .6; --Percentage of variable field space used to estimate. 1 will provide estimate as if all variable columns are 100% full. 


/*Do not change*/ 
WITH cte_Tables AS (--Get Tables 
    SELECT o.object_id, s.name+'.'+o.name AS ObjectName 
    FROM sys.objects o 
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id 
    WHERE type = 'U' 
), cte_TableData AS (--Calculate Field Sizes 
    SELECT o.ObjectName AS TableName 
     , SUM(CASE WHEN t.name IN ('int', 'bigint', 'tinyint', 'char', 'datetime', 'smallint', 'date') THEN 1 ELSE 0 END) AS FixedFields 
     , SUM(CASE WHEN t.name IN ('int', 'bigint', 'tinyint', 'char', 'datetime', 'smallint', 'date') THEN c.max_length ELSE 0 END) AS FixedFieldSize 
     , SUM(CASE WHEN t.name IN ('varchar') THEN 1 ELSE 0 END) AS VariableFields 
     , SUM(CASE WHEN t.name IN ('varchar') THEN c.max_length ELSE 0 END)*@VarPercentFill AS VariableFieldSize 
     , SUM(CASE WHEN t.name IN ('xml') THEN 1 ELSE 0 END) AS LOBFields 
     , SUM(CASE WHEN t.name IN ('xml') THEN 10000 ELSE 0 END) AS LOBFieldSize 
     , COUNT(1) AS TotalColumns 
    FROM sys.columns c 
    INNER JOIN cte_Tables o ON o.object_id = c.object_id 
    INNER JOIN sys.types t ON c.system_type_id = t.system_type_id 
    GROUP BY o.ObjectName 
), cte_Indexes AS (--Get Indexes and size 
    SELECT s.name+'.'+o.name AS TableName 
     , ISNULL(i.name, '') AS IndexName 
     , i.type_desc 
     , i.index_id 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.key_ordinal > 0 THEN 1 ELSE 0 END) AS FixedFields 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.key_ordinal > 0 THEN tc.max_length ELSE 0 END) AS FixedFieldSize 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.key_ordinal > 0 THEN 1 ELSE 0 END) AS VariableFields 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.key_ordinal > 0 THEN tc.max_length ELSE 0 END)*@VarPercentFill AS VariableFieldSize 
     , SUM(CASE WHEN t.name IN ('xml') AND c.key_ordinal > 0 THEN 1 ELSE 0 END) AS LOBFields 
     , SUM(CASE WHEN t.name IN ('xml') AND c.key_ordinal > 0 THEN 10000 ELSE 0 END) AS LOBFieldSize 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.is_included_column > 0 THEN 1 ELSE 0 END) AS FixedIncludes 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.is_included_column > 0 THEN tc.max_length ELSE 0 END) AS FixedIncludesSize 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.is_included_column > 0 THEN 1 ELSE 0 END) AS VariableIncludes 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.is_included_column > 0 THEN tc.max_length ELSE 0 END)*@VarPercentFill AS VariableIncludesSize 
     , COUNT(1) AS TotalColumns 
    FROM sys.indexes i 
    INNER JOIN sys.columns tc ON i.object_id = tc.object_id 
    INNER JOIN sys.index_columns c ON i.index_id = c.index_id 
     AND c.column_id = tc.column_id 
     AND c.object_id = i.object_id 
    INNER JOIN sys.objects o ON o.object_id = i.object_id AND o.is_ms_shipped = 0 
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id 
    INNER JOIN sys.types t ON tc.system_type_id = t.system_type_id 
    GROUP BY s.name+'.'+o.name, ISNULL(i.name, ''), i.type_desc, i.index_id 
) 
INSERT RowSizes 
SELECT 'Table' AS TypeName 
    , n.TableName 
    , '' AS IndexName 
    , 2 + ((n.FixedFields+n.VariableFields+7)/8) AS Null_Bitmap 
    , 2 + (n.VariableFields * 2) + n.VariableFieldSize AS Variable_Data_Size 
    , n.FixedFieldSize 
    /*FixedFieldSize + Variable_Data_Size + Null_Bitmap*/ 
    , n.FixedFieldSize + (2 + (n.VariableFields * 2) + (n.VariableFieldSize)) + (2 + ((n.FixedFields+n.VariableFields+7)/8)) + 4 AS Row_Size 
    , n.LOBFieldSize 
FROM cte_TableData n 
UNION 
SELECT i.type_desc 
    , i.TableName 
    , i.IndexName 
    , 0 AS Null_Bitmap 
    , CASE WHEN i.VariableFields > 0 THEN 2 + (i.VariableFields * 2) + i.VariableFieldSize + 4 ELSE 0 END AS Variable_Data_Size 
    , i.FixedFieldSize 
    /*FixedFieldSize + Variable_Data_Size + Null_Bitmap if not clustered*/ 
    , i.FixedFieldSize + CASE WHEN i.VariableFields > 0 THEN 2 + (i.VariableFields * 2) + i.VariableFieldSize + 4 ELSE 0 END + 7 AS Row_Size 
    , i.LOBFieldSize 
FROM cte_Indexes i 
WHERE i.index_id IN(0,1) 
UNION 
SELECT i.type_desc 
    , i.TableName 
    , i.IndexName 
    , CASE WHEN si.TotalColumns IS NULL THEN 2 + ((i.FixedFields+i.VariableFields+i.VariableIncludes+i.FixedIncludes+8)/8) 
      ELSE 2 + ((i.FixedFields+i.VariableFields+i.VariableIncludes+i.FixedIncludes+7)/8) 
     END AS Null_Bitmap 
    , CASE WHEN si.TotalColumns IS NULL THEN 2 + ((i.VariableFields + 1) * 2) + (i.VariableFieldSize + 8) 
      ELSE 2 + (i.VariableFields * 2) + i.VariableFieldSize 
     END AS Variable_Data_Size 
    , CASE WHEN si.TotalColumns IS NULL THEN i.FixedFieldSize /*no clustered index: si.* is NULL here*/ 
      ELSE i.FixedFieldSize + si.FixedFieldSize 
     END AS FixedFieldSize 
    /*FixedFieldSize + Variable_Data_Size + Null_Bitmap if not clustered*/ 
    , CASE WHEN si.TotalColumns IS NULL THEN i.FixedFieldSize + (2 + ((i.VariableFields + 1) * 2) + (i.VariableFieldSize + 8)) + (2 + ((i.TotalColumns+8)/8)) + 7 
      ELSE i.FixedFieldSize + (2 + (i.VariableFields * 2) + i.VariableFieldSize) + (2 + ((i.TotalColumns+7)/8)) + 4 
     END AS Row_Size 
    , i.LOBFieldSize 
FROM cte_Indexes i 
LEFT OUTER JOIN cte_Indexes si ON i.TableName = si.TableName AND si.type_desc = 'CLUSTERED' 
WHERE i.index_id NOT IN(0,1) AND i.type_desc = 'NONCLUSTERED'; 

--SELECT * FROM RowSizes 

/*Calculate leaf sizes for tables and HEAPs*/ 
INSERT LeafSizes 
SELECT r.TypeName 
    , r.TableName 
    ,'' AS IndexName 
    , r.Row_Size 
    , 8096/(r.Row_Size + 2) AS Rows_Per_Page 
    /*90 is the assumed fill factor. Multiply before dividing: (100 - 90)/100 is 0 in integer arithmetic*/ 
    , 8096 * (100 - 90)/100/(r.Row_Size + 2) AS Free_Rows_Per_Page 
    , 0 AS Non_Leaf_Levels 
    /*Num_Leaf_Pages = Number of Rows/(Rows_Per_Page - Free_Rows_Per_Page) OR 1 if less than 1*/ 
    , CASE WHEN @NumRows/((8096/(r.Row_Size + 2)) - (8096 * (100 - 90)/100/(r.Row_Size + 2))) < 1 
      THEN 1 
      ELSE @NumRows/((8096/(r.Row_Size + 2)) - (8096 * (100 - 90)/100/(r.Row_Size + 2))) 
     END AS Num_Leaf_Pages 
    , 0 AS Num_Index_Pages 
    /*Leaf_space_used = 8192 * Num_Leaf_Pages*/ 
    , 8192 * CASE WHEN @NumRows/((8096/(r.Row_Size + 2)) - (8096 * (100 - 90)/100/(r.Row_Size + 2))) < 1 
       THEN 1 
       ELSE @NumRows/((8096/(r.Row_Size + 2)) - (8096 * (100 - 90)/100/(r.Row_Size + 2))) 
      END + (@NumRows * LOBFieldSize) AS Leaf_space_used_bytes 
FROM RowSizes r 
WHERE r.TypeName = 'Table' 
ORDER BY TypeName, TableName; 

/*Calculate leaf sizes for CLUSTERED indexes*/ 
INSERT LeafSizes 
SELECT r.TypeName 
    , r.TableName 
    , r.IndexName 
    , r.Row_Size 
    , 8096/(r.Row_Size + 2) AS Rows_Per_Page 
    , 0 AS Free_Rows_Per_Page 
    , 1 + ROUND(LOG(8096/(r.Row_Size + 2)), 0)*(l.Num_Leaf_Pages/(8096/(r.Row_Size + 2))) AS Non_Leaf_Levels 
    , l.Num_Leaf_Pages 
    , 0 AS Num_Index_Pages 
    , 0 AS Leaf_space_used_bytes 
FROM RowSizes r 
INNER JOIN LeafSizes l ON r.TableName = l.TableName AND l.TypeName = 'Table' 
WHERE r.TypeName = 'CLUSTERED'; 

PRINT 'CLUSTERED' 
EXEC dbo.cp_CalcIndexPages @IndexType = 'CLUSTERED' 

/*Calculate leaf sizes for NONCLUSTERED indexes*/ 
INSERT LeafSizes 
SELECT r.TypeName 
    , r.TableName 
    , r.IndexName 
    , r.Row_Size 
    , 8096/(r.Row_Size + 2) AS Rows_Per_Page 
    , 0 AS Free_Rows_Per_Page 
    , 1 + ROUND(LOG(8096/(r.Row_Size + 2)), 0)*(l.Num_Leaf_Pages/(8096/(r.Row_Size + 2))) AS Non_Leaf_Levels 
    , l.Num_Leaf_Pages 
    , 0 AS Num_Index_Pages 
    , 0 AS Leaf_space_used_bytes 
FROM RowSizes r 
INNER JOIN LeafSizes l ON r.TableName = l.TableName AND l.TypeName = 'Table' 
WHERE r.TypeName = 'NONCLUSTERED'; 

PRINT 'NONCLUSTERED' 
EXEC dbo.cp_CalcIndexPages @IndexType = 'NONCLUSTERED' 

SELECT * 
FROM dbo.LeafSizes 
--WHERE TableName = 'eligibility.clientrequest' 

SELECT TableName 
    , @NumRows AS RowsPerTable 
    , @VarPercentFill*100 AS VariableFieldFillFactor 
    , SUM(CASE WHEN TypeName = 'Table' THEN Leaf_space_used_bytes ELSE 0 END)/1024/1024 AS TableSizeMB 
    , SUM(Leaf_space_used_bytes)/1024/1024 AS SizeWithIndexesMB 
FROM LeafSizes 
--WHERE TableName = 'eligibility.clientrequest' 
GROUP BY TableName 
ORDER BY TableName; 


GO 
/*Cleanup when done*/ 
DROP PROCEDURE dbo.cp_CalcIndexPages; 
DROP TABLE dbo.RowSizes; 
DROP TABLE dbo.LeafSizes; 
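
To make the loop inside dbo.cp_CalcIndexPages easier to follow, here is a minimal Python sketch of the same summation: for each non-leaf level, the number of leaf pages is divided by Rows_Per_Page raised to that level, and every level contributes at least one page. The function name `index_pages` is hypothetical, and this is only an illustration of what the procedure appears to do, not a replacement for it.

```python
def index_pages(num_leaf_pages, rows_per_page, non_leaf_levels):
    """Sum the estimated pages of each non-leaf level, at least 1 page per level."""
    total = 0
    for level in range(1, non_leaf_levels + 1):
        # Integer division, as in the T-SQL version (all operands are integers there).
        pages_at_level = num_leaf_pages // (rows_per_page ** level)
        total += max(1, pages_at_level)
    # The procedure falls back to a single page when the loop adds nothing.
    return max(1, total)


# Defaults declared at the top of the procedure: 10000 leaf pages, 476 rows per page.
print(index_pages(10000, 476, 2))
```

Python integers do not overflow, so the TRY/CATCH that the procedure wraps around POWER is not needed in this sketch.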
0

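Unfortunately, I can't say why your calculation is wrong, because there is not enough information about how the table was created and how the database is configured. So I will try to answer in general terms and give you some pointers.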

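You should first know that the size of any SQL Server database is greater than or equal to the size of the model database. This is because the model database is the template for new databases, so it is copied every time a CREATE DATABASE statement is executed.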

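All information in a database is stored on disk in 8 KB pages. There are many types of pages. Some of them (such as allocation maps and metadata) are used for internal purposes, while others store data.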

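Table size depends on how the data is organized on disk (with or without a clustered index), on the column types, and on data compression. Index size depends on whether the indexed table has a unique index, on the number of index levels, on the fill factor, and so on.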

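As I said before, everything is stored in pages, and data is no exception. SQL Server has pages for in-row data, pages for row-overflow data, and pages for LOB data. A data page consists of three main parts: the page header, the data rows, and the row offset array.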

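The page header occupies the first 96 bytes of each data page, leaving 8,096 bytes for the other components. The row offset array is a block of 2-byte entries stored at the end of the page. The number of entries is stored in the header and is called the slot count.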
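
The page layout above translates into a simple rows-per-page estimate. The following Python sketch assumes a hypothetical fixed row size of 39 bytes (an illustrative value only) and the row count from the question; it ignores fill factor, fragmentation, and anything stored off-row, so treat it as a rough lower bound rather than an exact prediction.

```python
import math

PAGE_SIZE = 8192        # bytes per page
HEADER_SIZE = 96        # page header
SLOT_ENTRY_SIZE = 2     # one row offset array entry per row


def rows_per_page(row_size):
    """Rows that fit in the 8,096 bytes left after the header."""
    usable = PAGE_SIZE - HEADER_SIZE
    return usable // (row_size + SLOT_ENTRY_SIZE)


def pages_needed(num_rows, row_size):
    """Data pages needed if every page is completely full."""
    return math.ceil(num_rows / rows_per_page(row_size))


ASSUMED_ROW_SIZE = 39   # hypothetical row size in bytes, for illustration
pages = pages_needed(3_250_273, ASSUMED_ROW_SIZE)
size_kb = pages * PAGE_SIZE // 1024
print(rows_per_page(ASSUMED_ROW_SIZE), pages, size_kb)
```

Under these assumptions, 197 rows fit on a page and the table needs 16,499 pages, or 131,992 KB, which is already much closer to the reported 147 888 KB of data than the 92 MB estimate.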

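The area between the header and the row offset array is where the data rows are stored. Each row consists of two parts: a fixed-size part and a variable-length part.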

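The structure of a data row is:

  • Status bits A, 1 byte
  • Status bits B, 1 byte
  • Fixed-length size (FSIZE), 2 bytes
  • Fixed-length data (FDATA), FSIZE - 4
  • Number of columns (NCOL), 2 bytes
  • NULL bitmap, Ceiling(NCOL/8)
  • Number of variable-length columns stored in the row (VarCount), 2 bytes
  • Variable column offset array (VarOffset), 2 * VarCount
  • Variable-length data (VARDATA), VarOff[VarCount] - (FSIZE + 4 + Ceiling(NCOL/8) + 2 * VarCount)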
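
Adding up the parts listed above gives the size of a single row. Here is a minimal Python sketch applied to the InfoComp table from the question, assuming that Value averages 8 characters (16 bytes) and is stored in-row; the function name `row_size` and its parameters are hypothetical.

```python
import math


def row_size(fixed_data_bytes, n_cols, var_data_bytes, n_var_cols):
    """Bytes used by one data row, following the layout listed above."""
    # Status bits A and B (2 bytes) + FSIZE (2 bytes) + fixed-length data.
    fsize = 4 + fixed_data_bytes
    # NCOL (2 bytes) + NULL bitmap.
    null_block = 2 + math.ceil(n_cols / 8)
    # VarCount (2 bytes) + offset array + data, only if variable columns exist.
    var_block = 0
    if n_var_cols > 0:
        var_block = 2 + 2 * n_var_cols + var_data_bytes
    return fsize + null_block + var_block


# InfoComp: three int columns (12 bytes), 4 columns in total,
# one variable-length column holding about 16 bytes on average (assumption).
infocomp_row = row_size(fixed_data_bytes=12, n_cols=4, var_data_bytes=16, n_var_cols=1)
print(infocomp_row)
```

With these assumptions each row takes 39 bytes, not counting its 2-byte entry in the row offset array, compared with the 30 bytes used in the question's formula.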

Index rows are stored in the same way as data rows.

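I have not explained everything here, but I hope this helps you understand what SQL Server uses the allocated space for. Also keep in mind that database files grow by the size specified in the FILEGROWTH option, which can make the actual size larger than the estimated size.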
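
As a rough illustration of the FILEGROWTH effect, the sketch below rounds a required size up to the next growth increment. It assumes a fixed-size growth increment (FILEGROWTH can also be given as a percentage, which this does not model), and the sizes used are hypothetical.

```python
import math


def file_size_after_growth(initial_mb, growth_mb, needed_mb):
    """File size once it has grown, in whole increments, to hold needed_mb."""
    if needed_mb <= initial_mb:
        return initial_mb
    increments = math.ceil((needed_mb - initial_mb) / growth_mb)
    return initial_mb + increments * growth_mb


# Hypothetical values: 100 MB initial file, 64 MB growth step, 255 MB of data.
print(file_size_after_growth(initial_mb=100, growth_mb=64, needed_mb=255))
```

In this example the file ends up at 292 MB even though only 255 MB is needed, because growth happens in whole 64 MB steps.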

Also take a look at the book Microsoft SQL Server 2012 Internals, and read how to Estimate the Size of a Database. You may find them interesting.