2016-10-12

Optimizing this MySQL query with a subquery

For a given longitude, latitude, and radius, I need to select averagePrice, numberOfListings, ... from a table of about 500,000 records.

EXPLAIN output:

    id:            1
    select_type:   SIMPLE
    table:         database
    partitions:    NULL
    type:          ALL
    possible_keys: NULL
    key:           NULL
    key_len:       NULL
    ref:           NULL
    rows:          623612
    filtered:      100.00
    Extra:         NULL

CREATE TABLE `database` (
    `id` varchar(255) DEFAULT NULL, 
    `longitude` varchar(255) DEFAULT NULL, 
    `latitude` varchar(255) DEFAULT NULL, 
    `price` int(11) DEFAULT NULL, 
    `bathrooms` int(11) DEFAULT NULL, 
    `bedrooms` int(11) DEFAULT NULL, 
    `person_capacity` int(11) DEFAULT NULL, 
    `rev_count` int(11) DEFAULT NULL, 
    KEY `hosting_id` (`hosting_id`), 
    KEY `price` (`price`) 
) ENGINE=InnoDB DEFAULT CHARSET=latin1; 

The SELECT query without grouping:

SELECT 
    avg(price) as averagePrice, 
    count(*) as numberOfListings, 
    min(price) as minprice, 
    max(price) as maxprice, 
    avg(bedrooms) as averagebedrooms, 
    avg(bathrooms) as averagebathrooms, 
    avg(person_capacity) as averagepc, 
    avg(rev_count) as averageReviews, 
    avg(time_appartement) as averageDateHasBeenListed 
FROM 
    (SELECT 
     r.*, 
     (6371 * acos(cos(radians(37.774929)) * cos(radians(ANY_VALUE(`latitude`))) * cos(radians(ANY_VALUE(`longitude`)) - radians(-122.419416)) + sin(radians(37.774929)) * sin(radians(ANY_VALUE(`latitude`))))) AS distance   
    FROM 
     `database` r  ) r 
WHERE 
    distance <= 25 
    AND price >= 10 
ORDER BY 
    distance ASC 

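This works well, with a query time of about 1 second. My next step is to group the subquery by id and select, for each id, the average price, bedrooms, bathrooms, person_capacity, rev_count, and time_appartement.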

SELECT 
    avg(price) as averagePrice, 
    count(*) as numberOfListings, 
    min(price) as minprice, 
    max(price) as maxprice, 
    avg(bedrooms) as averagebedrooms, 
    avg(bathrooms) as averagebathrooms, 
    avg(person_capacity) as averagepc, 
    avg(rev_count) as averageReviews, 
    avg(time_appartement) as averageDateHasBeenListed  
FROM 
    (SELECT 
     id, 
     avg(r.price) as price, 
     avg(r.bedrooms) as bedrooms, 
     avg(r.bathrooms) as bathrooms, 
     avg(r.person_capacity) as person_capacity, 
     avg(r.rev_count) as rev_count, 
     avg(r.time_appartement) as time_appartement, 
     (6371 * acos(cos(radians(37.774929)) * cos(radians(ANY_VALUE(`latitude`))) * cos(radians(ANY_VALUE(`longitude`)) - radians(-122.419416)) + sin(radians(37.774929)) * sin(radians(ANY_VALUE(`latitude`))))) AS distance    
    FROM 
     `database` r 
    GROUP BY 
     r.id   ) r  
WHERE 
    distance <= 25 
    AND price >= 10  
ORDER BY 
    distance ASC 

It works, but the problem is that this query takes about 7 seconds.

Is it possible to reduce the time? Thanks for your replies.

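Questions about optimization always need, at a minimum, the CREATE TABLE statements for all relevant tables and the result of EXPLAIN. – Strawberry

And please format your query. – Strawberry

Also, look at constructing a bounding box to filter the geometric data. – Strawberry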

Answers

0

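You could move the price condition into the subquery:

  • WHERE r.price >= 10, if you only want to consider records whose price is >= 10
  • HAVING AVG(r.price) >= 10 (applied to the per-id average computed in the subquery), if you want to consider ids whose average price is >= 10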

Also, make sure your id and price ...
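As a rough sketch of the two options above (not the asker's final query; only two of the averaged columns are shown and the distance expression is left out for brevity), the inner grouped subquery could take either of these forms:

```sql
-- Option 1: drop individual rows priced below 10 before grouping.
SELECT
    r.id,
    AVG(r.price)    AS price,
    AVG(r.bedrooms) AS bedrooms
FROM `database` r
WHERE r.price >= 10
GROUP BY r.id;

-- Option 2: keep all rows, then drop ids whose average price is below 10.
SELECT
    r.id,
    AVG(r.price)    AS price,
    AVG(r.bedrooms) AS bedrooms
FROM `database` r
GROUP BY r.id
HAVING AVG(r.price) >= 10;
```

The two are not equivalent: option 1 changes which rows feed each average, while option 2 only filters groups after averaging. In both cases the outer query no longer needs its own price filter.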

0

Storing latitude and longitude as VARCHAR(255) is worse than useless for indexing. For homes, this should be adequate:

latitude DECIMAL(6,4), 
longitude DECIMAL(7,4) 

What kind of id needs to be VARCHAR(255)?

Are you expecting thousands or billions of bedrooms? Use TINYINT UNSIGNED (1 byte, range 0..255) for better efficiency.

There is no PRIMARY KEY; that is bad for InnoDB.
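One way the schema points above might be applied is sketched here. The surrogate key `row_id` and the index name `lat_lon` are hypothetical names introduced for illustration, not part of the original table. The conversion also assumes every stored latitude and longitude string is a valid number, so it is worth trying on a copy of the data first.

```sql
ALTER TABLE `database`
    -- Numeric coordinates; 4 decimal places is about 11 m of latitude.
    MODIFY `latitude`  DECIMAL(6,4) DEFAULT NULL,
    MODIFY `longitude` DECIMAL(7,4) DEFAULT NULL,
    -- One byte is plenty for a bedroom count (0..255).
    MODIFY `bedrooms`  TINYINT UNSIGNED DEFAULT NULL,
    -- Hypothetical surrogate key, since `id` appears to repeat across rows.
    ADD COLUMN `row_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST,
    -- Hypothetical index to support range filters on the coordinates.
    ADD INDEX `lat_lon` (`latitude`, `longitude`);
```

A surrogate key is used here only because the grouped query suggests `id` is not unique; if it turns out to be unique, `id` itself could serve as the primary key.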

AVG(AVG(...)) is mathematically unsound.
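To see why, consider a small made-up example (the `demo` table and its values are hypothetical): one id with three rows and another with a single row. The average of per-id averages weights each id equally, so it differs from the average over all rows.

```sql
CREATE TEMPORARY TABLE demo (id VARCHAR(10), price INT);
INSERT INTO demo VALUES ('a', 100), ('a', 100), ('a', 100), ('b', 200);

-- Average over all rows: (100 + 100 + 100 + 200) / 4 = 125
SELECT AVG(price) FROM demo;

-- Average of per-id averages: (100 + 200) / 2 = 150
SELECT AVG(price)
FROM (SELECT id, AVG(price) AS price FROM demo GROUP BY id) g;

-- Carrying sums and counts through the grouping recovers
-- the row-level average: 500 / 4 = 125
SELECT SUM(price_sum) / SUM(n)
FROM (SELECT id, SUM(price) AS price_sum, COUNT(*) AS n
      FROM demo GROUP BY id) g;
```

Which figure is wanted depends on whether each id or each row should count once.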

Your subquery implies that there are multiple rows with the same id; what is going on?

Fix these problems, read about "bounding boxes", then come back for more help/abuse.
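For reference, a minimal sketch of the bounding-box idea, assuming the spherical Earth radius of 6371 km already used in the question. The user variables (`@lat`, `@lon`, `@radius_km`, `@dlat`, `@dlon`) are names chosen here for illustration, and only two of the aggregates are shown. The box is a cheap rectangular pre-filter; the exact distance test still runs, but only on rows inside the box.

```sql
SET @lat = 37.774929, @lon = -122.419416, @radius_km = 25;

-- Half-height and half-width of the box in degrees.
-- 6371e0 forces floating-point division; exact-value division
-- would be rounded according to div_precision_increment.
SET @dlat = DEGREES(@radius_km / 6371e0);
SET @dlon = DEGREES(@radius_km / (6371e0 * COS(RADIANS(@lat))));

SELECT
    AVG(price) AS averagePrice,
    COUNT(*)   AS numberOfListings
FROM (
    SELECT
        r.price,
        (6371 * ACOS(COS(RADIANS(@lat)) * COS(RADIANS(r.latitude))
                     * COS(RADIANS(r.longitude) - RADIANS(@lon))
                     + SIN(RADIANS(@lat)) * SIN(RADIANS(r.latitude)))) AS distance
    FROM `database` r
    WHERE r.price >= 10
      -- Rectangular pre-filter around the search point.
      AND r.latitude  BETWEEN @lat - @dlat AND @lat + @dlat
      AND r.longitude BETWEEN @lon - @dlon AND @lon + @dlon
) r
WHERE distance <= @radius_km;
```

The longitude half-width is a small-radius approximation and does not handle boxes that cross the poles or the 180° meridian. The pre-filter can only use an index once latitude and longitude are stored as numeric columns and indexed; with VARCHAR columns it still returns the same rows but requires a full scan.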