$ cat data.csv
ID,State,City,Price,Flag
1,CA,A,95,0
2,CA,A,96,1
3,CA,A,195,1
4,NY,B,124,0
5,NY,B,128,1
6,NY,C,24,0
7,NY,C,27,1
8,NY,C,29,0
9,NY,C,39,1
Expected result:
ID0, ID1
1,2
4,5
6,7
8,7
For each ID with Flag=0 above, we want to find another ID with Flag=1 that has the same State and City and the nearest Price.
I have two rough ideas:
Approach 1.
Use a left outer join of the table with itself on
(a.State=b.State and a.City=b.City and a.Flag=0 and b.Flag=1),
where a.Flag=0 and b.Flag=1,
and then use RANK() over (partition by a.ID order by abs(a.Price - b.Price)) as rank
where rank=1
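A minimal sketch of Approach 1 in HiveQL, assuming the CSV has been loaded into a table named data (a hypothetical name) with columns ID, State, City, Price, Flag. It partitions by a.ID rather than by State and City, so that every Flag=0 row keeps its own nearest match, and it orders by the absolute price difference. ROW_NUMBER() is used instead of RANK() so that a tie returns a single row; the b.ID tie-break is an arbitrary choice, not something the question specifies. Because the WHERE clause requires b.Flag=1, the outer join behaves like an inner join, so a plain JOIN is written here.

```sql
-- Assumption: table `data` holds the rows of data.csv (hypothetical table name).
SELECT ID0, ID1
FROM (
  SELECT a.ID AS ID0,
         b.ID AS ID1,
         -- rank the Flag=1 candidates of each Flag=0 row by price distance
         ROW_NUMBER() OVER (PARTITION BY a.ID
                            ORDER BY ABS(a.Price - b.Price), b.ID) AS rn
  FROM data a
  JOIN data b
    ON a.State = b.State AND a.City = b.City
  WHERE a.Flag = 0 AND b.Flag = 1
) ranked
WHERE rn = 1;  -- keep only the nearest candidate per Flag=0 row
```

Flag=0 rows with no Flag=1 row in the same State and City are dropped by this query; keeping them with a NULL ID1 would need a LEFT OUTER JOIN with the b.Flag=1 condition moved into the ON clause.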
Approach 2.
Use a left outer join of the table with itself
on
(a.State=b.State and a.City=b.City and a.Flag=0 and b.Flag=1),
where a.Flag=0 and b.Flag=1,
and then use Distribute by a.State,a.City Sort by Price_Diff ASC limit 1, where Price_Diff is abs(a.Price - b.Price)
What is the best way to find the nearest neighbor in Hive? Any helpful hints would be greatly appreciated!