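For those interested in some metrics, here is a series of SQL statements that test querying a one-character column (VARCHAR2(1)) against a one-digit column (NUMBER(1)).
Test setup - Create a 100,000,000-row table with a character status column and a numeric status column. Run a simple query that counts rows using a filter on the character status, and compare its time with that of a similar query using the numeric status.
Executive summary - There is no noticeable difference.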
SQL> create table some_100_rows
2 as
3 select rownum as rnum
4 from dual
5 connect by level <= 100;
Table created.
SQL> create table some_1000000_rows
2 as
3 select ROWNUM as id
4 , cast(case when mod(rownum, 2) = 0 then 'S' else 'C' end as varchar2(1)) as varchar_status
5 , cast(case when mod(rownum, 2) = 0 then 1 else 2 end as number(1)) as num_status
6 from dual
7 connect by level <= 1000000
8 ;
Table created.
Elapsed: 00:00:01.46
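(I kept the data and its distribution simple so that the character search and the number search do the same work; any time difference should be due to the datatype alone.)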
SQL> create table test_varchar_vs_number -- a table of 100,000,000 rows
2 as
3 select t1.*
4 from some_1000000_rows t1
5 cross join
6 some_100_rows t2
7 ;
Table created.
Elapsed: 00:00:37.96
SQL> select count(*)
2 from test_varchar_vs_number
3 ;
COUNT(*)
----------
100000000
Elapsed: 00:00:10.54
Note that just counting the rows in the table takes about 10 seconds.
Here is what the contents look like:
SQL> select *
2 from test_varchar_vs_number
3 where rownum < 11;
ID VARCHAR_STATUS NUM_STATUS
---------- -------------- ----------
1 C 2
2 S 1
3 C 2
4 S 1
5 C 2
6 S 1
7 C 2
8 S 1
9 C 2
10 S 1
10 rows selected.
Elapsed: 00:00:00.04
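As a side check, the stored width of each status value can be inspected with Oracle's VSIZE and DUMP functions. This is a minimal sketch against the test table above and was not part of the original test run; the column aliases are made up for illustration.

```sql
-- Show the internal byte length and raw bytes of each status value.
-- VSIZE returns the number of bytes in the internal representation;
-- DUMP returns the datatype code, the length, and the bytes themselves.
select varchar_status
     , vsize(varchar_status) as varchar_bytes   -- alias is illustrative
     , dump(varchar_status)  as varchar_dump    -- alias is illustrative
     , num_status
     , vsize(num_status)     as num_bytes       -- alias is illustrative
     , dump(num_status)      as num_dump        -- alias is illustrative
from   test_varchar_vs_number
where  rownum < 3;
```

The exact byte counts depend on the database character set, so treat whatever this returns on your system as the reference rather than any figure assumed here.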
Run a query that counts the rows with an 'S' in the VARCHAR_STATUS column. Repeat it a few times to get stable metrics.
SQL> select count(*)
2 from test_varchar_vs_number
3 where varchar_status = 'S'
4 ;
COUNT(*)
----------
50000000
**Elapsed: 00:00:11.82**
SQL> select count(*)
2 from test_varchar_vs_number
3 where varchar_status = 'S'
4 ;
COUNT(*)
----------
50000000
**Elapsed: 00:00:11.05**
SQL> select count(*)
2 from test_varchar_vs_number
3 where varchar_status = 'S'
4 ;
COUNT(*)
----------
50000000
**Elapsed: 00:00:11.37**
So it takes just over 11 seconds to count the 50,000,000 'S' rows.
Now try the same thing for rows with a 1 in the NUM_STATUS column:
SQL> select count(*)
2 from test_varchar_vs_number
3 where num_status = 1;
COUNT(*)
----------
50000000
**Elapsed: 00:00:11.04**
SQL> select count(*)
2 from test_varchar_vs_number
3 where num_status = 1;
COUNT(*)
----------
50000000
**Elapsed: 00:00:10.79**
SQL> select count(*)
2 from test_varchar_vs_number
3 where num_status = 1;
COUNT(*)
----------
50000000
**Elapsed: 00:00:10.59**
So the difference is negligible. (Minimum character search time: 11.05 s vs. minimum numeric search time: 10.59 s.)
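To compare the two sets of runs side by side, the six elapsed times can be summarized with a small query. This is only a sketch: the `timings` subquery and its column names are hypothetical, and the numbers are copied by hand from the runs above.

```sql
-- Elapsed times in seconds, copied from the six runs above.
-- The "timings" subquery and its column names are illustrative only.
with timings as (
  select 'varchar_status = ''S''' as predicate, 11.82 as secs from dual union all
  select 'varchar_status = ''S''', 11.05 from dual union all
  select 'varchar_status = ''S''', 11.37 from dual union all
  select 'num_status = 1',         11.04 from dual union all
  select 'num_status = 1',         10.79 from dual union all
  select 'num_status = 1',         10.59 from dual
)
select predicate
     , min(secs)           as min_secs
     , round(avg(secs), 2) as avg_secs
from   timings
group  by predicate
order  by predicate;
```

The two minimums differ by 0.46 s, roughly 4%, while the unfiltered count(*) already took 10.54 s, so most of the elapsed time appears to go to scanning the table rather than evaluating the predicate.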
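Edit: For those interested in the low-level details, here are the statistics from a 10046 trace run through tkprof. This was a separate run from the one above, so don't expect the times to match exactly. (Keep in mind that the totals cover all 3 runs of each query.)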
select count(*)
from test_varchar_vs_number
where num_status = 1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.00 0.00 0 0 0 0
Execute 3 0.00 0.00 0 0 0 0
Fetch 6 11.85 34.30 621984 622005 0 3
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 11.85 34.30 621984 622005 0 3
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 110
Number of plan statistics captured: 3
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 SORT AGGREGATE (cr=207335 pr=207328 pw=0 time=11434679 us)
50000000 50000000 50000000 TABLE ACCESS FULL TEST_VARCHAR_VS_NUMBER (cr=207335 pr=207328 pw=0 time=10113986 us cost=56992 size=150000000 card=50000000)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 6 0.00 0.00
reliable message 1 0.00 0.00
enq: KO - fast object checkpoint 1 0.13 0.13
direct path read 4835 0.29 22.04
SQL*Net message from client 6 0.01 0.04
********************************************************************************
select count(*)
from test_varchar_vs_number
where varchar_status = 'S'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.00 0.00 0 0 0 0
Execute 3 0.00 0.00 0 0 0 0
Fetch 6 11.20 33.43 621984 622005 0 3
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 11.20 33.43 621984 622005 0 3
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 110
Number of plan statistics captured: 3
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 SORT AGGREGATE (cr=207335 pr=207328 pw=0 time=11146155 us)
50000000 50000000 50000000 TABLE ACCESS FULL TEST_VARCHAR_VS_NUMBER (cr=207335 pr=207328 pw=0 time=9700296 us cost=56940 size=100000000 card=50000000)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 6 0.00 0.00
reliable message 1 0.00 0.00
enq: KO - fast object checkpoint 1 0.21 0.21
direct path read 4873 0.25 22.12
SQL*Net message from client 6 0.03 0.05
********************************************************************************
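The answer does not show how the trace was switched on. One common way to capture a 10046 trace for a session looks roughly like the sketch below; the trace file identifier is an arbitrary, made-up name, and level 8 is an assumption chosen because the report above includes wait events.

```sql
-- Tag the trace file so it is easy to find (the identifier is arbitrary).
alter session set tracefile_identifier = 'varchar_vs_number';

-- Turn on extended SQL trace; level 8 adds wait events to the trace.
alter session set events '10046 trace name context forever, level 8';

-- Run the statements to be traced (repeat each as often as needed).
select count(*)
from   test_varchar_vs_number
where  num_status = 1;

select count(*)
from   test_varchar_vs_number
where  varchar_status = 'S';

-- Turn tracing off again.
alter session set events '10046 trace name context off';

-- Afterwards, format the raw trace file from the operating system shell:
--   tkprof <trace_file>.trc <report_file>.txt
```

The trace file is written on the database server, so access to the server's trace directory is needed before tkprof can be run against it.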
How many records will you retrieve at a time to check or compare their status? 1 thousand, 100 thousand, 1 million, 100 million? – krokodilko
Not sure, possibly more than that – Ash
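My recommendation would be a short varchar2 column storing a fixed-length alpha code, plus a separate two-column table that maps each code to a descriptive name. Give the codes some real thought so that you end up with a standard that fits the current need but can also be extended when corner cases come along in the future. This really isn't specific to any database product. It's basic data design. – EdStevens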
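The design suggested in the last comment could be sketched as follows. All table names, column names, and descriptions here are hypothetical; the thread never says what the codes 'S' and 'C' stand for, so the names below are placeholders.

```sql
-- Two-column lookup table: fixed-length alpha code plus descriptive name.
create table status_codes
( status_code varchar2(1)  not null
, status_name varchar2(30) not null
, constraint status_codes_pk primary key (status_code)
);

-- Placeholder descriptions; replace with the real meaning of each code.
insert into status_codes (status_code, status_name) values ('S', 'Placeholder name for S');
insert into status_codes (status_code, status_name) values ('C', 'Placeholder name for C');

-- A data table that stores only the short code and references the lookup.
create table items_example
( id          number      not null
, status_code varchar2(1) not null
, constraint items_example_pk primary key (id)
, constraint items_example_status_fk
    foreign key (status_code) references status_codes (status_code)
);

-- Resolve the code to its descriptive name only when it is needed.
select i.id
     , i.status_code
     , s.status_name
from   items_example i
join   status_codes  s on s.status_code = i.status_code;
```

With this layout a new status becomes a row in the lookup table rather than a schema change, and the foreign key keeps unknown codes out of the data table.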