2014-02-05

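So here is my problem: an integrity audit.

I need to count all the values, and all the distinct values, in every column of every table in every database.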

Example: 
[db 1] 
[table 1] 
[column 1] count() 
[column 1] distinct() 
[column 1] count() 
[column 1] distinct() 
[column 2] count() 
[column 2] distinct() etc 

[db 2] 
[table 1] 
[column 1] count() 
[column 1] distinct() 
[column 2] count() 
[column 2] distinct() etc 

This is what I have so far:

DECLARE @TableName VARCHAR (MAX) = 'sales' 

-- Builds the text of one SELECT statement for the table named in @TableName.
SELECT DISTINCT 
    'SELECT ' 
    + RIGHT (ColumnList, LEN (ColumnList) - 1)  -- drop the leading comma
    + ' FROM ' 
    + Table_Name 
    + ' GROUP BY ' + ColumnList  -- unresolved: this list still contains the COUNT () terms
    FROM INFORMATION_SCHEMA.COLUMNS COL1 
    CROSS APPLY (SELECT ', COUNT (' + COLUMN_NAME + ')' + ', ' + COLUMN_NAME 
        FROM INFORMATION_SCHEMA.COLUMNS COL2 
        WHERE COL1.TABLE_NAME = COL2.TABLE_NAME 
        FOR XML PATH ('')) TableColumns (ColumnList) 

WHERE 1 = 1 AND COL1.TABLE_NAME = @TableName 

So I just need help with the GROUP BY.

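How open is your environment to higher-level scripting such as bash/ksh or Python? – gbtimmon

Very open. I am running it on my own machine. – Ninety3cents

So you need the count() of all rows for each column in each table, and the count() of all distinct rows for each column in each table, across all databases? – 2014-02-05 20:34:43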

Answer

Ohh noo, @Sauce used the dreaded CURSOR again.

Well, this is only part of the solution, and not the best part, but here is how I would do it.

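The script further down returns COUNT(*) over all rows and COUNT(DISTINCT column) for every column of every table in the current database. You just need to figure out how to execute it for each DB.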
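
For the "each DB" part, one possible approach is to loop over sys.databases and call each database's own sys.sp_executesql, which runs the statement in that database's context. This is a sketch and not part of the original answer. The database_id > 4 filter (to skip the system databases) and the placeholder statement in @stmt are assumptions. A temp table created inside the dynamic batch disappears when that batch ends, so a real run would insert into a table created beforehand.

```sql
-- Sketch: run one statement in every online user database.
DECLARE @db SYSNAME
    ,@proc NVARCHAR(300)
    ,@stmt NVARCHAR(MAX);

-- Placeholder statement (assumption); the per-table counting logic would go here.
SET @stmt = N'SELECT DB_NAME() AS DatabaseName, COUNT(*) AS TableCount FROM sys.tables;';

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD
FOR
    SELECT name
    FROM sys.databases
    WHERE state_desc = 'ONLINE'
      AND database_id > 4;   -- assumption: skips master, tempdb, model, msdb

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;

WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Calling <database>.sys.sp_executesql executes @stmt inside that database.
        SET @proc = QUOTENAME(@db) + N'.sys.sp_executesql';
        EXECUTE @proc @stmt;

        FETCH NEXT FROM db_cursor INTO @db;
    END

CLOSE db_cursor;
DEALLOCATE db_cursor;
```

Each iteration returns one result set. Collecting them into a single table is left out to keep the sketch short.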

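However you do it, keep in mind that COUNT(*) and COUNT(DISTINCT) are expensive operations, so it is best to save the results together with the date of the last execution. Depending on the size of the databases, this could take hours (maybe days) to complete.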

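IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL 
    DROP TABLE #tempTable; 

-- One row per column: total row count and number of distinct non-NULL values.
CREATE TABLE #tempTable 
    (
    TableName NVARCHAR(517) 
    ,ColumnName SYSNAME 
    ,TotalCount BIGINT 
    ,DistinctCount BIGINT 
    ); 

DECLARE @object_id INT 
    ,@table_name NVARCHAR(517)  -- schema-qualified, bracket-quoted name
    ,@column_name SYSNAME 
    ,@SQLStatement NVARCHAR(MAX);  -- MAX so long names cannot truncate the statement

-- Outer cursor: every user table in the current database.
DECLARE table_cursor CURSOR LOCAL FAST_FORWARD 
FOR 
    SELECT t.object_id 
        ,QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name) 
    FROM sys.tables t 
    ORDER BY 2; 

OPEN table_cursor; 
FETCH NEXT FROM table_cursor INTO @object_id, @table_name; 

WHILE @@FETCH_STATUS = 0 
    BEGIN 
        -- Inner cursor: columns of the current table, matched by object_id so that
        -- same-named tables in different schemas are not mixed up.
        -- Skips types that cannot be used with COUNT(DISTINCT): image (34),
        -- text (35), ntext (99), xml (241) and CLR types (240: geography,
        -- geometry; this also drops hierarchyid, which shares that id).
        DECLARE column_cursor CURSOR LOCAL FAST_FORWARD 
        FOR 
            SELECT c.name 
            FROM sys.columns c 
            WHERE c.object_id = @object_id 
              AND c.system_type_id NOT IN (34, 35, 99, 240, 241); 

        OPEN column_cursor; 
        FETCH NEXT FROM column_cursor INTO @column_name; 

        WHILE @@FETCH_STATUS = 0 
            BEGIN 
                -- Names are stored via parameters; identifiers are bracket-quoted.
                -- COUNT_BIG avoids overflow on tables with more than 2^31 - 1 rows.
                SET @SQLStatement = N' 
                INSERT INTO #tempTable 
                SELECT @t 
                    ,@c 
                    ,COUNT_BIG(*) 
                    ,COUNT_BIG(DISTINCT ' + QUOTENAME(@column_name) + N') 
                FROM ' + @table_name + N' WITH (NOLOCK)'; 

                --PRINT @SQLStatement 

                EXECUTE sp_executesql 
                    @SQLStatement 
                    ,N'@t NVARCHAR(517), @c SYSNAME' 
                    ,@t = @table_name 
                    ,@c = @column_name; 

                FETCH NEXT FROM column_cursor INTO @column_name; 
            END 
        CLOSE column_cursor; 
        DEALLOCATE column_cursor; 

        FETCH NEXT FROM table_cursor INTO @object_id, @table_name; 
    END 
CLOSE table_cursor; 
DEALLOCATE table_cursor; 

SELECT * 
    FROM #tempTable;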
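
To follow the advice about saving the results together with the date of the run, the contents of #tempTable can be copied into a permanent table. This is a minimal sketch, assuming a hypothetical table dbo.ColumnCountAudit. The table name and the DatabaseName and RunDate columns are illustrative and do not come from the original answer. It is meant to run in the same session, right after the script above has filled #tempTable.

```sql
-- Hypothetical permanent audit table (name and layout are assumptions); created once.
IF OBJECT_ID('dbo.ColumnCountAudit') IS NULL
    CREATE TABLE dbo.ColumnCountAudit
        (
        DatabaseName SYSNAME NOT NULL
        ,TableName NVARCHAR(517) NOT NULL
        ,ColumnName SYSNAME NOT NULL
        ,TotalCount BIGINT NOT NULL
        ,DistinctCount BIGINT NOT NULL
        ,RunDate DATETIME2(0) NOT NULL
        );

-- Append this run's figures, stamped with the current database and time.
INSERT INTO dbo.ColumnCountAudit
    (DatabaseName, TableName, ColumnName, TotalCount, DistinctCount, RunDate)
SELECT DB_NAME()
    ,TableName
    ,ColumnName
    ,TotalCount
    ,DistinctCount
    ,SYSDATETIME()
FROM #tempTable;
```

Keeping every run, rather than overwriting, lets later audits compare counts between dates without repeating the expensive scans.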