2014-12-11 86 views
1

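How to optimize a MySQL report query with multiple joins and subqueries

I have defined the following tables in my application, and I need to fetch a report for each district based on training date.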

wi_individual_g(ind_id, ind_district_id, ...) 
wi_individual_p(ind_id,prg_id, ind_dalit (yes/no), ind_madhesi (yes/no), ...) 
wi_training(trn_id, trn_start_date, trn_ben_type, ...) 
wi_indv_training(trn_id, ind_id) 
wi_district(dst_id,dst_name) 

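My problem: the report must produce a district-wise count of the individuals who attended trainings with a given trn_start_date. The application has predefined date ranges as quarters, defined as follows: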

// Quarter key => array(start date, end date); project years run July to June
$quarter = array(
    'y1q3'=>array('2013-02-01','2013-03-31'), 'y1q4'=>array('2013-04-01','2013-06-30'),
    'y2q1'=>array('2013-07-01','2013-09-30'), 'y2q2'=>array('2013-10-01','2013-12-31'), 'y2q3'=>array('2014-01-01','2014-03-31'), 'y2q4'=>array('2014-04-01','2014-06-30'),
    'y3q1'=>array('2014-07-01','2014-09-30'), 'y3q2'=>array('2014-10-01','2014-12-31'), 'y3q3'=>array('2015-01-01','2015-03-31'), 'y3q4'=>array('2015-04-01','2015-06-30'),
    'y4q1'=>array('2015-07-01','2015-09-30'), 'y4q2'=>array('2015-10-01','2015-12-31'), 'y4q3'=>array('2016-01-01','2016-03-31'), 'y4q4'=>array('2016-04-01','2016-06-30'),
    'y5q1'=>array('2016-07-01','2016-09-30'), 'y5q2'=>array('2016-10-01','2016-12-31'), 'y5q3'=>array('2017-01-01','2017-03-31'), 'y5q4'=>array('2017-04-01','2017-06-30'),
    'y6q1'=>array('2017-07-01','2017-09-30'), 'y6q2'=>array('2017-10-01','2017-12-31'), 'y6q3'=>array('2018-01-01','2018-03-31'), 'y6q4'=>array('2018-04-01','2018-06-30'),
);

If Y4Q4 is chosen for trn_start_date, a single query must count the individuals district-wise for each date range, Y1 (Q3-Q4), Y2 (Q1-Q4), Y3 (Q1-Q4) and Y4 (Q1-Q4), like this:

Y1 Y2 Y3 Y4 Y5 Y6 
8 3948 3511 0 0 0 

As a solution, I applied the following query:

SELECT wi_district.dst_name, 
COUNT(DISTINCT(CASE WHEN wi_training.trn_start_date BETWEEN '2017-07-01' AND '2018-06-30' AND 
ind_dalit='yes' THEN wi_individual_g.ind_id END)) AS y6 , 
COUNT(DISTINCT(CASE WHEN wi_training.trn_start_date BETWEEN '2016-07-01' AND '2017-06-30' AND  ind_dalit='yes' THEN wi_individual_g.ind_id END)) AS y5 , 
COUNT(DISTINCT(CASE WHEN wi_training.trn_start_date BETWEEN '2015-07-01' AND '2016-06-30' AND ind_dalit='yes' THEN wi_individual_g.ind_id END)) AS y4 , 
COUNT(DISTINCT(CASE WHEN wi_training.trn_start_date BETWEEN '2014-07-01' AND '2015-06-30' AND ind_dalit='yes' THEN wi_individual_g.ind_id END)) AS y3 , 
COUNT(DISTINCT(CASE WHEN wi_training.trn_start_date BETWEEN '2013-07-01' AND '2014-06-30' AND ind_dalit='yes' THEN wi_individual_g.ind_id END)) AS y2 , 
COUNT(DISTINCT(CASE WHEN wi_training.trn_start_date BETWEEN '2013-02-01' AND '2013-06-30' AND ind_dalit='yes' THEN wi_individual_g.ind_id END)) AS y1 
FROM wi_individual_g 
INNER JOIN wi_individual_p ON wi_individual_p.ind_id=wi_individual_g.ind_id AND wi_individual_g.ind_is_recepient='yes' 
INNER JOIN wi_district ON wi_district.dst_id=wi_individual_g.ind_district_id AND wi_individual_g.ind_deleted=0 
INNER JOIN wi_indv_training ON wi_indv_training.ind_id=wi_individual_g.ind_id AND wi_indv_training.is_deleted=0 
INNER JOIN wi_training ON wi_training.trn_id=wi_indv_training.trn_id AND wi_training.trn_deleted=0 AND wi_training.trn_beneficiary_type=2 AND wi_training.trn_start_date <='2018-06-30' 
GROUP BY wi_district.dst_name 

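But this query takes more than 5 minutes to execute, which is far too slow. I also added indexes on the fields, but got the same result. I would be grateful if someone could offer a better solution.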

+0

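'COUNT(DISTINCT)' can take a long time. Start by removing those clauses and see whether they are the problem. That will point you in a direction for solving it. – 2014-12-11 02:40:58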

+0

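I have to count unique individuals. If I remove DISTINCT, an individual who attended several trainings is counted more than once. By the way, removing DISTINCT did not help either. – 2014-12-11 03:06:03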
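To illustrate the duplicate-counting point in the comment above, here is a minimal sketch on made-up rows. The derived table "attendance" and its values are hypothetical and do not come from the real tables: individual 1 attended two trainings, individual 2 attended one.

```sql
-- Hypothetical toy data standing in for wi_indv_training rows.
SELECT
    COUNT(ind_id)          AS attendance_rows,     -- counts every joined row: 3
    COUNT(DISTINCT ind_id) AS unique_individuals   -- counts each person once: 2
FROM (
    SELECT 1 AS ind_id, 10 AS trn_id UNION ALL
    SELECT 1, 11 UNION ALL
    SELECT 2, 10
) attendance;
```

Any rewrite of the report therefore has to de-duplicate somewhere, either with COUNT(DISTINCT ...) or with a DISTINCT sub-select before aggregating.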

+0

Which table is "ind_dalit" in? You have no table/alias reference on it, and without the table structures it is ambiguous. – DRapp 2014-12-11 03:20:55

Answers

0

I found a way to improve the performance, in three stages:

At first : the query took around 128 secs 
After suggestion: the query took around 78 secs 
Further modification: the query took around 23 secs 
--------------------------------------------------------------------------------- 
SELECT d.dst_name, 
COUNT(DISTINCT(CASE WHEN a.trn_start_date BETWEEN '2014-07-01' AND '2015-06-30' THEN a.ind_id END)) AS y3 , 
COUNT(DISTINCT(CASE WHEN a.trn_start_date BETWEEN '2013-07-01' AND '2014-06-30' THEN a.ind_id END)) AS y2 , 
COUNT(DISTINCT(CASE WHEN a.trn_start_date BETWEEN '2013-02-01' AND '2013-06-30' THEN a.ind_id END)) AS y1 
FROM 
(
    SELECT g.ind_district_id,g.ind_id,t.trn_start_date,t.trn_beneficiary_type 
    FROM wi_individual_g g 
    INNER JOIN wi_indv_training wit ON g.ind_id = wit.ind_id AND wit.is_deleted = 0 AND g.ind_deleted=0 AND g.ind_is_recepient='yes' 
    INNER JOIN wi_training t ON wit.trn_id = t.trn_id AND t.trn_beneficiary_type=2 AND t.trn_deleted = 0 
) a 
INNER JOIN wi_individual_p p ON p.ind_id=a.ind_id 
INNER JOIN wi_district d ON d.dst_id=a.ind_district_id 
WHERE p.ind_dalit='yes' 
GROUP BY d.dst_name; 

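Overall, performance has improved roughly 6-fold over my previous query (from about 128 seconds to about 23 seconds). Thanks for your suggestion, @DRapp.

If anyone has a better solution that improves performance further, I would be grateful!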

0

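I changed the query slightly, moving each criterion to its respective JOIN, or to the WHERE clause where applicable. I also moved the "ind_dalit = yes" component out of each CASE statement and into the JOIN to the wi_individual_p table.

With this, I can better see the criteria and offer index suggestions, including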

table              index
wi_individual_g    (ind_is_recepient, ind_deleted, ind_id, ind_district_id)
wi_individual_p    (ind_id, ind_dalit)
wi_district        (dst_id, dst_name)
wi_indv_training   (ind_id, is_deleted)
wi_training        (trn_beneficiary_type, trn_deleted, trn_start_date, trn_id)
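The index suggestions above could be created roughly as follows. This is only a sketch: it assumes the tables exist under the names used in the queries, the index names (idx_...) are made up for illustration, and the column names are spelled as they appear in the queries. Existing indexes that already cover these columns would make some of these redundant.

```sql
-- Hypothetical index names; columns follow the suggestions listed above.
CREATE INDEX idx_g_recipient ON wi_individual_g (ind_is_recepient, ind_deleted, ind_id, ind_district_id);
CREATE INDEX idx_p_dalit     ON wi_individual_p (ind_id, ind_dalit);
CREATE INDEX idx_d_name      ON wi_district (dst_id, dst_name);
CREATE INDEX idx_wit_ind     ON wi_indv_training (ind_id, is_deleted);
CREATE INDEX idx_t_type_date ON wi_training (trn_beneficiary_type, trn_deleted, trn_start_date, trn_id);

-- Check whether the optimizer picks an index for the training filter:
-- the "key" column of the EXPLAIN output names the chosen index, if any.
EXPLAIN
SELECT t.trn_id
FROM wi_training t
WHERE t.trn_beneficiary_type = 2
  AND t.trn_deleted = 0
  AND t.trn_start_date BETWEEN '2013-02-01' AND '2018-06-30';
```

Running EXPLAIN on the full report query in the same way shows, table by table, which of these indexes are actually used.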

SELECT 
     d.dst_name, 
     COUNT(DISTINCT(CASE WHEN t.trn_start_date 
     BETWEEN '2017-07-01' AND '2018-06-30' 
     THEN g.ind_id END)) AS y6, 
     COUNT(DISTINCT(CASE WHEN t.trn_start_date 
     BETWEEN '2016-07-01' AND '2017-06-30' 
     THEN g.ind_id END)) AS y5, 
     COUNT(DISTINCT(CASE WHEN t.trn_start_date 
     BETWEEN '2015-07-01' AND '2016-06-30' 
     THEN g.ind_id END)) AS y4, 
     COUNT(DISTINCT(CASE WHEN t.trn_start_date 
     BETWEEN '2014-07-01' AND '2015-06-30' 
     THEN g.ind_id END)) AS y3, 
     COUNT(DISTINCT(CASE WHEN t.trn_start_date 
     BETWEEN '2013-07-01' AND '2014-06-30' 
     THEN g.ind_id END)) AS y2, 
     COUNT(DISTINCT(CASE WHEN t.trn_start_date 
     BETWEEN '2013-02-01' AND '2013-06-30' 
     THEN g.ind_id END)) AS y1 
    FROM 
     wi_individual_g g 
     INNER JOIN wi_individual_p p 
      ON g.ind_id = p.ind_id 
      AND p.ind_dalit='yes' 
     INNER JOIN wi_district d 
      ON g.ind_district_id = d.dst_id 
     INNER JOIN wi_indv_training wit 
      ON g.ind_id = wit.ind_id 
      AND wit.is_deleted = 0 
     INNER JOIN wi_training t 
      ON wit.trn_id = t.trn_id 
      AND t.trn_beneficiary_type = 2 
      AND t.trn_deleted = 0 
      AND t.trn_start_date >= '2013-02-01' 
      AND t.trn_start_date <= '2018-06-30' 
    WHERE 
      g.ind_is_recepient = 'yes' 
     AND g.ind_deleted = 0 
    GROUP BY 
     d.dst_name 

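Here is another option you can try. The pre-query (alias PQ) selects the DISTINCT district and ind_id from "g" together with a date group from 1 to 6, so it returns one record per individual per date group. The outer result is then a simple SUM per district.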

SELECT 
     d.dst_name, 
     SUM(PQ.DateGrp = 6) AS y6, 
     SUM(PQ.DateGrp = 5) AS y5, 
     SUM(PQ.DateGrp = 4) AS y4, 
     SUM(PQ.DateGrp = 3) AS y3, 
     SUM(PQ.DateGrp = 2) AS y2, 
     SUM(PQ.DateGrp = 1) AS y1 
    FROM 
     (select distinct 
       g.ind_district_id, 
       g.ind_id, 
       CASE WHEN t.trn_start_date BETWEEN '2017-07-01' AND '2018-06-30' THEN 6 
        WHEN t.trn_start_date BETWEEN '2016-07-01' AND '2017-06-30' THEN 5 
        WHEN t.trn_start_date BETWEEN '2015-07-01' AND '2016-06-30' THEN 4 
        WHEN t.trn_start_date BETWEEN '2014-07-01' AND '2015-06-30' THEN 3 
        WHEN t.trn_start_date BETWEEN '2013-07-01' AND '2014-06-30' THEN 2 
        WHEN t.trn_start_date BETWEEN '2013-02-01' AND '2013-06-30' THEN 1 
        ELSE 0 END DateGrp 
      from 
       wi_training t 
       JOIN wi_indv_training wit 
        ON t.trn_id = wit.trn_id 
        AND wit.is_deleted = 0 
        JOIN wi_individual_g g
         ON wit.ind_id = g.ind_id
         AND g.ind_is_recepient = 'yes'
         AND g.ind_deleted = 0
         INNER JOIN wi_individual_p p 
         ON g.ind_id = p.ind_id 
         AND p.ind_dalit='yes' 
      where 
        t.trn_beneficiary_type = 2 
       AND t.trn_deleted = 0 
       AND t.trn_start_date >= '2013-02-01' 
       AND t.trn_start_date <= '2018-06-30') PQ 
     INNER JOIN wi_district d 
      ON PQ.ind_district_id = d.dst_id 
     GROUP BY 
     d.dst_name 
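
The SUM(PQ.DateGrp = n) expressions in the query above rely on MySQL evaluating a comparison to 1 when it is true and 0 when it is false, so summing the comparison counts the matching rows. A minimal sketch on made-up rows (the "demo" alias and its values are illustrative only):

```sql
-- Hypothetical rows: one record in date group 1, two records in date group 2.
SELECT
    SUM(DateGrp = 2) AS y2,   -- 2
    SUM(DateGrp = 1) AS y1    -- 1
FROM (
    SELECT 1 AS DateGrp UNION ALL
    SELECT 2 UNION ALL
    SELECT 2
) demo;
```

Because the pre-query has already removed duplicates with SELECT DISTINCT, the outer query needs no COUNT(DISTINCT ...) at all.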
+0

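I used the query you provided and added some missing indexes on the tables, but it still takes 78 seconds to execute. I wonder whether I made a mistake in applying the query. How do other enterprise applications run complex queries and get results in the shortest time? – 2014-12-11 04:31:10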

+0

@SujitBaniya, I posted another optional query to try... – DRapp 2014-12-11 12:32:04

+0

Thanks for your effort. I tried the suggestion you provided; it took about 79 seconds. – 2014-12-12 02:18:28