2015-08-31

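Find the location closest to a given lat and long value

I have two tables, t1 and t2 (t1 is 1/10 the size of t2). Each table has two columns, <Lat, Long>, holding the latitude and longitude of some points. For each row in t1, I want to find the row in t2 that is closest to it. What is the most efficient query for this? Does Hive have any kind of geospatial search library?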


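Just a small pointer for the next person who tries this: I ended up moving my data to Solr and querying it with Solr's geospatial search, which turned out to be very fast. – Mark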

Answers


You need to do a bit of trig.
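
For reference, the trig involved is most likely the spherical law of cosines, which is what the distance expression in the routine below evaluates. The symbols here are notation assumed for illustration, not taken from the article: $\varphi$ is latitude and $\lambda$ is longitude, both in radians, and $R$ is the Earth's radius.

$$
d = R \, \arccos\bigl( \cos\varphi_1 \cos\varphi_2 \cos(\lambda_2 - \lambda_1) + \sin\varphi_1 \sin\varphi_2 \bigr)
$$

With $R = 6371$ the distance $d$ comes out in kilometers, and with $R = 3959$ it comes out in miles.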

See this Database Journal article. The last routine in it is, I believe, what you are looking for (you will need to adapt it for your use):

CREATE DEFINER=`root`@`localhost` PROCEDURE `closest_restaurants_optimized`
(IN units varchar(5), IN lat Decimal(9,6), IN lon Decimal(9,6), 
IN max_distance SMALLINT, IN limit_rows MEDIUMINT) 
BEGIN 
    DECLARE ONE_DEGREE_CONSTANT TINYINT; 
    DECLARE EARTH_RADIUS_CONSTANT SMALLINT; 
    DECLARE lon1, lon2, lat1, lat2 float; 
    IF units = 'miles' THEN 
     SET ONE_DEGREE_CONSTANT = 69; 
     SET EARTH_RADIUS_CONSTANT = 3959; 
    ELSE -- default to kilometers 
     SET ONE_DEGREE_CONSTANT = 111; 
     SET EARTH_RADIUS_CONSTANT = 6371; 
    END IF; 
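    -- Bounding box around (lat, lon): a cheap prefilter so that the exact
    -- distance is only computed for rows within roughly max_distance.
    SET lon1 = lon-max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT);
    SET lon2 = lon+max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT);
    SET lat1 = lat-(max_distance/ONE_DEGREE_CONSTANT);
    SET lat2 = lat+(max_distance/ONE_DEGREE_CONSTANT);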
    SELECT pm1.post_id, p.post_title, 
     ROUND((EARTH_RADIUS_CONSTANT * acos(cos(radians(lat)) 
      * cos(radians(pm1.meta_value)) 
      * cos(radians(pm2.meta_value) - radians(lon)) + sin(radians(lat)) 
      * sin(radians(pm1.meta_value))) 
     ), 3) AS distance 
    FROM goodfood_wp_md20m_postmeta AS pm1, 
     goodfood_wp_md20m_postmeta AS pm2, 
     goodfood_wp_md20m_posts AS p 
    WHERE pm1.meta_key = 'latitude' AND pm2.meta_key = 'longitude' 
    AND pm1.post_id = pm2.post_id 
    AND pm1.post_id = p.id 
    AND p.post_status = 'publish' 
    AND pm2.meta_value between lon1 and lon2 
    AND pm1.meta_value between lat1 and lat2 
    ORDER BY distance ASC 
    LIMIT limit_rows; 
END 
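
The procedure above is MySQL and looks up rows near a single point. For the question as asked (the nearest t2 row for every t1 row, in Hive), the same distance expression could be combined with a cross join and ROW_NUMBER(). What follows is an untested sketch, not a tuned solution. It assumes both tables have DOUBLE columns named lat and lon, in degrees; those names are placeholders, as are the aliases. It also assumes a Hive version with windowing functions and the least/greatest built-ins.

```sql
-- Sketch only: t1(lat, lon) and t2(lat, lon) are assumed column names, in degrees.
SELECT lat1, lon1, lat2, lon2, distance_km
FROM (
    SELECT lat1, lon1, lat2, lon2, distance_km,
           -- Rank the t2 candidates for each t1 point by distance, nearest first.
           ROW_NUMBER() OVER (PARTITION BY lat1, lon1
                              ORDER BY distance_km) AS rn
    FROM (
        SELECT a.lat AS lat1, a.lon AS lon1,
               b.lat AS lat2, b.lon AS lon2,
               -- Spherical law of cosines, with 6371 km as the Earth's radius.
               -- least/greatest clamp rounding error so acos stays in [-1, 1].
               6371 * acos(least(1.0, greatest(-1.0,
                   cos(radians(a.lat)) * cos(radians(b.lat))
                     * cos(radians(b.lon) - radians(a.lon))
                   + sin(radians(a.lat)) * sin(radians(b.lat))
               ))) AS distance_km
        FROM t1 a
        CROSS JOIN t2 b
    ) pairs
) ranked
WHERE rn = 1;
```

Note that the cross join compares every t1 row with every t2 row, so the cost grows with the product of the two table sizes. A bounding-box condition like the one in the procedure, added as a WHERE clause on the inner query, could reduce the work if a maximum search distance is acceptable. Duplicate points in t1 collapse into a single result row, because the partition key is the coordinate pair.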