2011-08-11 94 views
1

SELECT with an IN (...) clause is slow?

Could you please look at the query below, run against an Oracle database, and point out what is wrong?

SELECT t1.name FROM t1, t2 WHERE t1.id = t2.id AND t2.empno IN (1, 2, 3, …, 200) 

Query statistics:

  • Elapsed time: 10.53 seconds.

Indexes:

  • t2.empno is indexed.

  • t1.id is indexed.

  • t2.id is indexed.

Update


The query above is only a simplified stand-in for the query I am actually running. The real form is below.

Explain Plan

Query:

SELECT 
    PRODUCT_REPRESENTATION_SK 
FROM 
    Product_Representation pr 
    , Design_Object do 
    , Files files 
    ,EPS_STATUS epsStatus 
    ,EPS_ERROR_CODES epsError 
    ,VIEW_TYPE viewTable 
WHERE 
    pr.DESIGN_OBJECT_SK = do.DESIGN_OBJECT_SK 
    AND pr.LAYER_NAME !='Layer 0' 
    AND epsStatus.EPS_STATUS_SK = pr.EPS_STATUS 
    AND epsError.EPS_ERROR_CODE = pr.EPS_ERROR_CODE 
    AND viewTable.VIEW_TYPE_ID = pr.VIEW_TYPE_ID 
    AND files.pim_id = do.PIM_ID 
    AND do.DESIGN_OBJECT_ID IN 
     (
147086,149924,140458,135068,145197,134774,141837,138568,141731,138772,143769,141739,149113,148809,141072,141732,143974,147076,143972,141078,141925,134643,139701,141729,147078,139120,137097,147072,138261,149700,149701,139127,147070,149702,136766,146829,135762,140155,148459,138061,138762............................................. 200 such numbers 
     ) 

Indexed columns:

pr.DESIGN_OBJECT_SK 
do.DESIGN_OBJECT_SK 
do.DESIGN_OBJECT_ID 
files.pim_id 


Table 1


CREATE TABLE "PIM"."DESIGN_OBJECT" 
( 
"DESIGN_OBJECT_SK" NUMBER(*,0) NOT NULL ENABLE, 
"PIM_ID" NUMBER(*,0) NOT NULL ENABLE, 
"DESIGN_OBJECT_TYPE_SK" NUMBER(*,0) NOT NULL ENABLE, 
"DESIGN_OBJECT_ID" VARCHAR2(40 BYTE) NOT NULL ENABLE, 
"DIVISION_CD" NUMBER(*,0), 
"STAT_IND" NUMBER(*,0) NOT NULL ENABLE, 
"STAT_CHNG_TMST" TIMESTAMP (6), 
"CRTD_BY" VARCHAR2(45 BYTE), 
"CRT_TMST" TIMESTAMP (6), 
"MDFD_BY" VARCHAR2(45 BYTE), 
"CHNG_TMST" TIMESTAMP (6), 
"UPDATE_CNT" NUMBER(*,0), 
"GENDER" VARCHAR2(1 BYTE), 

PRIMARY KEY ("DESIGN_OBJECT_SK") 
) 
TABLESPACE "PIM" ENABLE, 

FOREIGN KEY ("DESIGN_OBJECT_TYPE_SK") 
    REFERENCES "PIM"."DESIGN_OBJECT_TYPE" ("DESIGN_OBJECT_TYPE_SK") 
     ON DELETE CASCADE ENABLE, 

FOREIGN KEY ("PIM_ID") 
    REFERENCES "PIM"."FILES" ("PIM_ID") 
     ON DELETE CASCADE ENABLE 

) 

Table 2


CREATE TABLE "PIM"."PRODUCT_REPRESENTATION" 
(
"PRODUCT_REPRESENTATION_SK" NUMBER(*,0) NOT NULL ENABLE, 
"DESIGN_OBJECT_SK" NUMBER(*,0) NOT NULL ENABLE, 
"VIEW_TYPE_ID" NUMBER(*,0) NOT NULL ENABLE, 
"LAYER_NAME" VARCHAR2(255 BYTE), 
"STAT_IND" NUMBER(*,0) NOT NULL ENABLE, 
"STAT_CHNG_TMST" TIMESTAMP (6), 
"CRTD_BY" VARCHAR2(45 BYTE), 
"CRT_TMST" TIMESTAMP (6), 
"MDFD_BY" VARCHAR2(45 BYTE), 
"CHNG_TMST" TIMESTAMP (6), 
"UPDATE_CNT" NUMBER(*,0), 
"EPS_STATUS" VARCHAR2(30 BYTE) NOT NULL ENABLE, 
"EPS_GENERATED_TIME" TIMESTAMP (6), 
"EPS_ERROR_CODE" NUMBER, 
"EPS_ERROR_DETAILS" VARCHAR2(500 BYTE), 
"DEEPSERVER_ASSET_LAYER_ID" VARCHAR2(255 BYTE), 
"PRODUCT_REPRESENTATION_LOC" VARCHAR2(255 BYTE), 

PRIMARY KEY ("PRODUCT_REPRESENTATION_SK") 
) 
TABLESPACE "PIM" ENABLE, 

FOREIGN KEY ("DESIGN_OBJECT_SK") 
    REFERENCES "PIM"."DESIGN_OBJECT" ("DESIGN_OBJECT_SK") 
     ON DELETE CASCADE ENABLE, 
FOREIGN KEY ("VIEW_TYPE_ID") 
    REFERENCES "PIM"."VIEW_TYPE" ("VIEW_TYPE_ID") 
     ON DELETE CASCADE ENABLE, 

CONSTRAINT "EPS_ERROR_CODE_FK" 
FOREIGN KEY ("EPS_ERROR_CODE") 
    REFERENCES "PIM"."EPS_ERROR_CODES" ("EPS_ERROR_CODE") 
     ON DELETE CASCADE ENABLE, 
CONSTRAINT "EPS_STATUS_FK" 
FOREIGN KEY ("EPS_STATUS") 
    REFERENCES "PIM"."EPS_STATUS" ("EPS_STATUS_SK") 
     ON DELETE CASCADE ENABLE 
) 
+0

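Which database are you using? SQL Server? – DOK

+2

Any execution plan? It looks as though parsing may be the expensive part rather than the query itself. (Also, you have an extra comma; I assume that is a paste issue.) – Randy

+2

Maybe this is just an artifact of your example, but you should use < 200 rather than that big IN list... – Randy

Answers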

2

Don't use a cross join.

Try this:

SELECT 
    t1.name 
FROM 
    t1 
JOIN t2 
    ON t2.id = t1.id 
WHERE 
    t2.empno IN (1,...,200) 

Edit: after your edit, seeing several tables combined as a Cartesian product, it is probably very important that you use proper JOIN syntax.
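
Applied to the real query from the question, the same rewrite might look like the sketch below. The join conditions are copied from the question's WHERE clause. The IN list is cut down to the first three values shown there; the rest remain elided as in the question. One assumption to flag: the DDL in the question declares DESIGN_OBJECT_ID as VARCHAR2(40 BYTE), so this sketch quotes the values as strings. When a character column is compared with numeric literals, Oracle converts the column to a number, which can keep an ordinary index on that column from being used. Whether that matters here is something to confirm in the plan.

```sql
-- Sketch only: the question's query rewritten with explicit JOINs.
-- Assumption: values are quoted because DESIGN_OBJECT_ID is VARCHAR2 in the posted DDL.
SELECT
    pr.PRODUCT_REPRESENTATION_SK
FROM
    Product_Representation pr
    JOIN Design_Object   do        ON do.DESIGN_OBJECT_SK     = pr.DESIGN_OBJECT_SK
    JOIN Files           files     ON files.pim_id            = do.PIM_ID
    JOIN EPS_STATUS      epsStatus ON epsStatus.EPS_STATUS_SK = pr.EPS_STATUS
    JOIN EPS_ERROR_CODES epsError  ON epsError.EPS_ERROR_CODE = pr.EPS_ERROR_CODE
    JOIN VIEW_TYPE       viewTable ON viewTable.VIEW_TYPE_ID  = pr.VIEW_TYPE_ID
WHERE
    pr.LAYER_NAME != 'Layer 0'
    -- First three values from the question; the full list has about 200 entries.
    AND do.DESIGN_OBJECT_ID IN ('147086', '149924', '140458');
```

The result set should match the implicit-join version; the point is readability and making every join condition explicit.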

+2

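This is a valid answer to the question. Downvoter, please provide feedback. –

+0

@downvoter care to explain? – Matthew

+0

I did not downvote, because this is good advice. But it is not the answer to the slow performance. For a query this simple, either way of writing it should produce the same execution plan. –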

6

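The first thing that is wrong is the use of implicit join syntax. That is a SQL antipattern.

If you have a large list in the IN clause, have you tried putting the values in a table and joining to it instead?

Which database? Have you looked at your explain plan or execution plan to see where the slowdown is?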
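
A minimal sketch of the table-plus-join idea in Oracle syntax, assuming the simplified t1/t2 schema from the question and a numeric empno. The table name empno_filter is hypothetical, and a global temporary table is only one possible way to hold the values. The last two statements show one way to look at the plan, using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY.

```sql
-- Hypothetical helper table for the values that were in the IN list.
CREATE GLOBAL TEMPORARY TABLE empno_filter (
    empno NUMBER PRIMARY KEY
) ON COMMIT DELETE ROWS;

-- Load the values (three shown; an application would insert all of them).
INSERT INTO empno_filter (empno) VALUES (1);
INSERT INTO empno_filter (empno) VALUES (2);
INSERT INTO empno_filter (empno) VALUES (3);

-- Join to the helper table instead of using a long IN list.
SELECT t1.name
FROM t1
JOIN t2             ON t2.id   = t1.id
JOIN empno_filter f ON f.empno = t2.empno;

-- Ask the optimizer for its plan, then display it.
EXPLAIN PLAN FOR
SELECT t1.name
FROM t1
JOIN t2             ON t2.id   = t1.id
JOIN empno_filter f ON f.empno = t2.empno;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

With ON COMMIT DELETE ROWS the inserted values disappear at commit, so the inserts and the query need to run in the same transaction.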

+0

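+1 Putting the IN (..) values in a table may improve performance, depending on the size of all the tables involved... or, if you have SQL 2008, you can send them in a TVP – Matthew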

5

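Let's forget the empno BETWEEN 1 AND 200 suggestion for a moment and suppose you have t2.empno IN (3,7,...,5209) (200 entries).

You can also rewrite your query (which is a join in disguise) as a non-equivalent EXISTS query. It shows the same results (but possibly fewer rows) and should be faster than the join: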

SELECT 
    t1.name 
FROM 
    t1 
WHERE EXISTS 
     (SELECT * 
     FROM t2 
     WHERE t2.id = t1.id 
      AND t2.empno IN (3,7,...,5209) 
    ) 
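
To see why the two forms are not equivalent, here is a small sketch with made-up sample data (the rows and column types are assumptions for illustration, not from the question). A t1 row that matches several t2 rows appears once per match in the join, but only once with EXISTS.

```sql
-- Made-up sample data: one t1 row matched by two t2 rows.
CREATE TABLE t1 (id NUMBER PRIMARY KEY, name VARCHAR2(30));
CREATE TABLE t2 (id NUMBER, empno NUMBER);

INSERT INTO t1 (id, name) VALUES (1, 'Alice');
INSERT INTO t2 (id, empno) VALUES (1, 3);
INSERT INTO t2 (id, empno) VALUES (1, 7);

-- Join form: 'Alice' is returned twice, once per matching t2 row.
SELECT t1.name
FROM t1
JOIN t2 ON t2.id = t1.id
WHERE t2.empno IN (3, 7);

-- EXISTS form: 'Alice' is returned once, because each t1 row is tested once.
SELECT t1.name
FROM t1
WHERE EXISTS
    (SELECT *
     FROM t2
     WHERE t2.id = t1.id
       AND t2.empno IN (3, 7)
    );
```

If t2.id is unique, the two queries return the same rows and the difference disappears.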

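(Wild guess)

If, on the other hand, it is not even t2.empno IN (3,7,...,5209) but t2.empno IN (SELECT tx.empno FROM tx WHERE someConditions), and you are using MySQL, then that is the source of your problem (MySQL is known not to handle field IN (SELECT f FROM x) in the best way). In that case, you can change the query to: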

SELECT 
    t1.name 
FROM 
    t1 
    JOIN t2 
    ON t2.id = t1.id 
    JOIN tx 
    ON tx.empno = t2.empno 
WHERE 
    someConditions 

Or even:

SELECT 
    t1.name 
FROM 
    t1 
WHERE EXISTS 
     (SELECT * 
     FROM t2 
      JOIN tx 
      ON tx.empno = t2.empno 
     WHERE t2.id = t1.id 
      AND someConditions 
    ) 
+0

+1 Good point about MySQL and its handling of IN (subquery) – Matthew