2014-09-05 57 views
5

Query: optimizing a long query on a huge table of 33M rows

SELECT users.id as uid, name, avatar, avatar_date, driver, messages.id AS mid, messages.msg,
    messages.removed, messages.from_anonym_id, messages.to_anonym_id,
    (messages.date DIV 1000) AS date, from_id = 162077 as outbox,
    !(0 in (SELECT read_state FROM messages as msgs
            WHERE (msgs.from_id = messages.from_id or msgs.from_id = messages.user_id)
              and msgs.user_id = 162077 and removed = 0)) as read_state
FROM dialog, messages, users
WHERE messages.id = dialog.mid and ((uid1 = 162077 and users.id = uid2) or (uid2 = 162077 and users.id = uid1))
ORDER BY dialog.mid DESC LIMIT 0, 101;

Table structure:

mysql> desc messages; 
+----------------+------------------+------+-----+---------+----------------+ 
| Field   | Type    | Null | Key | Default | Extra   | 
+----------------+------------------+------+-----+---------+----------------+ 
| id    | int(11)   | NO | PRI | NULL | auto_increment | 
| from_id  | int(11)   | NO | MUL | NULL |    | 
| user_id  | int(11)   | NO | MUL | NULL |    | 
| group_id  | int(11)   | NO |  | NULL |    | 
| to_number  | varchar(30)  | NO | MUL | NULL |    | 
| msg   | text    | NO |  | NULL |    | 
| image   | varchar(20)  | NO |  | NULL |    | 
| date   | bigint(20)  | NO |  | NULL |    | 
| read_state  | tinyint(1)  | NO |  | 0  |    | 
| removed  | tinyint(1)  | NO | MUL | NULL |    | 
| from_anonym_id | int(10) unsigned | NO | MUL | NULL |    | 
| to_anonym_id | int(10) unsigned | NO | MUL | NULL |    | 
+----------------+------------------+------+-----+---------+----------------+ 

mysql> desc dialog; 
+----------------+------------------+------+-----+---------+----------------+ 
| Field   | Type    | Null | Key | Default | Extra   | 
+----------------+------------------+------+-----+---------+----------------+ 
| id    | int(11)   | NO | PRI | NULL | auto_increment | 
| uid1   | int(11)   | NO | MUL | NULL |    | 
| uid2   | int(11)   | NO | MUL | NULL |    | 
| mid   | int(11)   | NO | MUL | NULL |    | 
| from_anonym_id | int(10) unsigned | NO | MUL | NULL |    | 
| to_anonym_id | int(10) unsigned | NO | MUL | NULL |    | 
+----------------+------------------+------+-----+---------+----------------+ 


mysql> show index from messages; 
+----------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| Table | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | 
+----------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| messages |   0 | PRIMARY  |   1 | id    | A   | 42944290 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | user_id_2  |   1 | user_id  | A   |  2147214 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | user_id_2  |   2 | read_state  | A   |  2862952 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | user_id_2  |   3 | removed  | A   |  2862952 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | from_id  |   1 | from_id  | A   |  825851 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | from_id  |   2 | to_number  | A   |  825851 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | to_number  |   1 | to_number  | A   |   29 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | idx_user_id |   1 | user_id  | A   |  2044966 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | idx_from_id |   1 | from_id  | A   |  447336 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | removed  |   1 | removed  | A   |   29 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | from_anonym_id |   1 | from_anonym_id | A   |   29 |  NULL | NULL |  | BTREE  |   |    | 
| messages |   1 | to_anonym_id |   1 | to_anonym_id | A   |   29 |  NULL | NULL |  | BTREE  |   |    | 
+----------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
12 rows in set (0.01 sec) 

mysql> show index from dialog; 
+--------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| Table | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | 
+--------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 
| dialog |   0 | PRIMARY  |   1 | id    | A   |  6378161 |  NULL | NULL |  | BTREE  |   |    | 
| dialog |   1 | uid1   |   1 | uid1   | A   |  455582 |  NULL | NULL |  | BTREE  |   |    | 
| dialog |   1 | uid1   |   2 | uid2   | A   |  6378161 |  NULL | NULL |  | BTREE  |   |    | 
| dialog |   1 | uid2   |   1 | uid2   | A   |  2126053 |  NULL | NULL |  | BTREE  |   |    | 
| dialog |   1 | idx_mid  |   1 | mid   | A   |  6378161 |  NULL | NULL |  | BTREE  |   |    | 
| dialog |   1 | from_anonym_id |   1 | from_anonym_id | A   |   17 |  NULL | NULL |  | BTREE  |   |    | 
| dialog |   1 | to_anonym_id |   1 | to_anonym_id | A   |   17 |  NULL | NULL |  | BTREE  |   |    | 
+--------+------------+----------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 

P.S. Please don't give me theoretical recipes, only practical examples. Thanks in advance.

If I remove this expression

!(0 in (SELECT read_state FROM messages as msgs 
WHERE (msgs.from_id = messages.from_id or msgs.from_id = messages.user_id) and msgs.user_id = 162077 and removed = 0)) as read_state 

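the query runs very well compared to the original: 101 rows in set (0.04 sec)

I think this is the main problem, but I need this field. If someone can rework this part to make it faster, I would be very glad.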

+0

How fast does the query itself run, without removing that expression? – kevin628 2014-09-05 16:13:51

+0

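It may depend on the load average, but right now it is '101 rows (5.59 sec)'. When the system is calm it takes 2-3 seconds, which is definitely a problem. – 2014-09-05 16:28:18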

+1

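If 'messages' is the table with 33 million records, then the inner query (the one you removed to improve performance) is doing a scan of that 33-million-row table for each row of the outer query (which is limited to 101 records). So 101 * 33 million is a lot of records to scan. If possible, consider using a composite key index. Otherwise, you may want to rethink how the data is related so that you can use a composite key index. – kevin628 2014-09-05 16:36:31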

Answers

3

I would start with an index on the messages table: a composite index to help cover the join, as used in my sample query below: index (user_id, removed, read_state, from_id).
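As a minimal sketch, the composite index described above could be added as follows. The index name idx_user_removed_read_from is made up for illustration. Building an index on a table of this size can take a while and, depending on the MySQL version and storage engine, may block writes, so it is worth trying on a copy first.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
ALTER TABLE messages
    ADD INDEX idx_user_removed_read_from (user_id, removed, read_state, from_id);

-- Confirm that the index now exists.
SHOW INDEX FROM messages WHERE Key_name = 'idx_user_removed_read_from';
```

The existing user_id_2 index already starts with (user_id, read_state, removed), so the only new information here is the trailing from_id column, which lets the join condition on from_id be answered from the index itself.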

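Next, to explain my process. I start with a preliminary query against the dialog table as a UNION. Each half separately picks up the opposite ID as "LinkToUser", which the next step uses to join to the users table, instead of joining once on the result of the "OR" in the where clause. Getting the qualifying records up front and simplifying may help you.

The next part is where the index will be used for your messages. I am doing a left-join based on the specific user, removed = 0, and specifically read_state = 0. By using the index, it either finds a match or it does not. So the expression in your select list (!0 in ...) is simplified to an IS NULL check.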

SELECT
    u.id AS uid,
    u.name,
    avatar,
    avatar_date,
    driver,
    m.id AS mid,
    m.msg,
    m.removed,
    m.from_anonym_id,
    m.to_anonym_id,
    (m.date DIV 1000) AS date,
    m.from_id = 162077 AS outbox,
    -- no unread match found means the dialog counts as read (1)
    msgFrom.from_id IS NULL AS read_state
FROM
    -- dialogs the user takes part in, plus the other participant's id
    (SELECT DISTINCT d1.*, d1.uid2 AS LinkToUser
        FROM dialog d1
        WHERE d1.uid1 = 162077
     UNION
     SELECT d2.*, d2.uid1 AS LinkToUser
        FROM dialog d2
        WHERE d2.uid2 = 162077) Qualified

    JOIN users u
        ON Qualified.LinkToUser = u.id

    JOIN messages m
        ON Qualified.mid = m.id

    -- matches only unread, non-removed messages addressed to the user
    LEFT JOIN messages msgFrom
        ON msgFrom.user_id = 162077
        AND msgFrom.removed = 0
        AND msgFrom.read_state = 0
        AND (m.from_id = msgFrom.from_id
            OR m.user_id = msgFrom.from_id)

ORDER BY
    Qualified.mid DESC
LIMIT
    0, 101;

You may need to play with it a bit, perhaps changing it to something like:

if(msgFrom.from_id IS NULL, 0, msgFrom.read_state) as Read_State 

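Clarification

Zeusakm, as written, your expression will only ever return 1 or 0 for the read_state field, because it is the logical NOT of whether a zero is in the list of selected messages. It does not return -1 as you indicated in the comments. My version does the same thing. If it finds a zero, it returns zero. If no zero is found, it returns 1, because the compared value will be NULL, so "IsThisValue IS NULL" returns true, which is the same as a flag of 1.

So, hopefully that clarifies the left join I am doing for you. It explicitly looks for the user ID, the removed status and the read state, plus a match on either the from ID or the user ID.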
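The 1-or-0 behaviour described above can be checked on toy values, without touching the real tables. This is only an illustration of how MySQL evaluates the expression; the literal sets below stand in for the read_state values a subquery might return. Because read_state is declared NOT NULL in the posted schema, the NULL case is left out.

```sql
-- Toy values only; none of the tables from the question are used.
SELECT
    NOT (0 IN (SELECT 0 UNION ALL SELECT 1))    AS some_unread, -- a 0 is present: 0
    NOT (0 IN (SELECT 1))                       AS all_read,    -- no 0 present: 1
    NOT (0 IN (SELECT 1 FROM DUAL WHERE 1 = 0)) AS no_rows;     -- empty set: 1
```

One difference worth checking on real data: a LEFT JOIN produces one output row per matching unread message, whereas the scalar expression always produces exactly one value per outer row.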

+0

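+1 for the attempt, but sorry, it is a rather bulky solution. For example, I could fetch the rows, select their results, and then determine whether any element of the array has a value of 1; that would be more elegant. Thanks. – 2014-09-05 17:01:27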

+0

No, the main problem is still not solved: '!(0 in (SELECT read_state FROM messages as msgs WHERE (msgs.from_id = messages.from_id or msgs.from_id = messages.user_id) and msgs.user_id = 162077 and removed = 0)) as read_state', though thanks for trying. – 2014-09-06 06:22:28

+0

@zeusakm, without seeing the data and truly understanding your !(0 in ...), I can't do much more. – DRapp 2014-09-09 00:48:15

3

Here is your query with the join syntax fixed and table aliases added for the tables in the outer query:

SELECT u.id as uid, name, avatar, avatar_date, driver, m.id AS mid, m.msg,
       m.removed, m.from_anonym_id, m.to_anonym_id,
       (m.date DIV 1000) AS date, m.from_id = 162077 as outbox,
       !(0 in (SELECT m2.read_state
               FROM messages m2
               WHERE (m2.from_id = m.from_id or m2.from_id = m.user_id) and
                     m2.user_id = 162077 and m2.removed = 0
              )
        ) as read_state
FROM dialog d join
     messages m
     on m.id = d.mid join
     users u
     on (d.uid1 = 162077 and u.id = d.uid2) or
        (d.uid2 = 162077 and u.id = d.uid1)
ORDER BY d.mid DESC
LIMIT 0, 101;

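If the query works well without the subquery in the select clause, I would suggest replacing it. IN can be an expensive operator, especially combined with an OR condition. So I would suggest replacing it with: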

(case when exists (select 1 
        from messages m2 
        where m2.user_id = 162077 and m2.removed = 0 and 
          m2.from_id = m.from_id and m2.read_state = 0 
        ) 
     then 0 
     when exists (select 1 
        from messages m2 
        where m2.user_id = 162077 and m2.removed = 0 and 
          m2.from_id = m.user_id and m2.read_state = 0 
        ) 
     then 0 
     else 1 
    end) 

And you want an index on messages(from_id, user_id, removed, read_state).
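A sketch of creating that index and checking whether the optimizer actually picks it for one of the EXISTS probes. The index name and the sender id 12345 are placeholders, not values from the question.

```sql
-- Hypothetical index name; the columns follow the suggestion above.
ALTER TABLE messages
    ADD INDEX idx_from_user_removed_read (from_id, user_id, removed, read_state);

-- One probe of the EXISTS form with literal values. In the EXPLAIN output,
-- the "key" column shows which index MySQL chose and "rows" shows its estimate.
EXPLAIN
SELECT 1
FROM messages m2
WHERE m2.from_id = 12345        -- placeholder sender id
  AND m2.user_id = 162077
  AND m2.removed = 0
  AND m2.read_state = 0
LIMIT 1;
```

If the "key" column shows a different index, such as the single-column from_id one, the timings reported in the comments below would be easier to explain.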

+0

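Unfortunately, the upper query from your post takes 4.5 seconds to execute, and the lower revision (with the nested queries) takes 5.6 seconds. That seems worse than the original, sorry. Thanks for trying. – 2014-09-08 06:06:07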

+1

@zeusakm . . . Do you have the appropriate index on 'messages'? – 2014-09-08 14:58:56

+0

Of course, I set up the 'messages(from_id, user_id, removed, read_state)' index before running the query. – 2014-09-09 05:49:09

2

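Create a temporary table and insert all the columns except read_state, which gets a default value of -1, and also store from_id. Then update the read_state column, similar to Gordon's post.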

CREATE TEMPORARY TABLE userTable
SELECT u.id as uid, name, avatar, avatar_date, driver, m.id AS mid, m.msg,
       m.removed, m.from_anonym_id, m.to_anonym_id,
       (m.date DIV 1000) AS date, m.from_id = 162077 as outbox,
       m.from_id,
       -1 as read_state
FROM dialog d join
     messages m
     on m.id = d.mid join
     users u
     on (d.uid1 = 162077 and u.id = d.uid2) or
        (d.uid2 = 162077 and u.id = d.uid1)
ORDER BY d.mid DESC
LIMIT 0, 101;

update userTable set read_state =
    (case when exists (select 1
                       from messages m2
                       where m2.user_id = 162077 and m2.removed = 0 and
                             m2.from_id = userTable.from_id and m2.read_state = 0
                      )
          then 0
          when exists (select 1
                       from messages m2
                       where m2.user_id = 162077 and m2.removed = 0 and
                             m2.from_id = userTable.uid and m2.read_state = 0
                      )
          then 0
          else 1
     end);
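
After the update, the rows still have to be read back. A minimal sketch of that last step, assuming the temporary table was created and filled on the same connection (temporary tables are only visible to the session that created them):

```sql
-- Return the prepared rows in the same order as the original query.
-- SELECT * also returns the helper from_id column stored for the update.
SELECT *
FROM userTable
ORDER BY mid DESC;

-- Temporary tables disappear when the connection closes; dropping explicitly
-- lets the same connection run the whole sequence again.
DROP TEMPORARY TABLE IF EXISTS userTable;
```

The point of this approach is that the expensive read_state lookup runs only against the at most 101 rows already selected, rather than inside the main join.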