2013-05-13

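Calculating the sum of absolute differences between rows, by group

I have a problem that I was able to solve in Stata, but my data has now grown too large to process in memory, so I would like to do this in MySQL instead. I am trying to calculate the Manhattan distance between groups, based on the shares of the items in each group.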

SELECT * FROM exampleshares; 

+------+--------+-------+
| item | group  | share |
+------+--------+-------+
| A    | group1 | .3    |
| B    | group1 | .7    |
| A    | group2 | .2    |
| B    | group2 | .6    |
| C    | group2 | .2    |
| A    | group3 | .3    |
| C    | group3 | .6    |
+------+--------+-------+

The Manhattan distances for this example are:

+--------+--------+------------+
| groupX | groupY | M distance |
+--------+--------+------------+
| group1 | group1 | 0          |
| group1 | group2 | .4         |
| group1 | group3 | 1.3        |
| group2 | group1 | .4         |
| group2 | group2 | 0          |
| group2 | group3 | 1.1        |
| group3 | group1 | 1.3        |
| group3 | group2 | 1.1        |
| group3 | group3 | 0          |
+--------+--------+------------+

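For example, the distance between group1 and group2 is calculated as |.3 - .2| + |.7 - .6| + |0 - .2| = .4, i.e. the sum of the absolute differences in shares, where an item missing from a group counts as a share of 0. I have manipulated the data into the form shown above, and I hope it is ready for the calculation. How do I do this in MySQL?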

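While searching, I found several solutions that calculate the difference from the previous row within a group, but nothing that does specifically what I need.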

Answers


I believe you will have to use a stored routine or some other script to accomplish this. Here is a stored routine that does it:

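delimiter //
drop procedure if exists manhattanDistance//
-- The out parameter is widened from decimal(2,1) so that shares with more decimal places are not rounded.
create procedure manhattanDistance (in startGroup char(32), in endGroup char(32), out manhattanDistance decimal(10,4))
    not deterministic
    modifies sql data
begin
    -- Collect every item once, so that an item missing from one group still contributes.
    drop temporary table if exists tmp_items;
    create temporary table tmp_items as select distinct item from exampleshares;

    -- A missing share counts as 0. The group column is backtick-quoted because GROUP is a reserved word.
    select sum(abs(ifnull(es1.share, 0) - ifnull(es2.share, 0))) into manhattanDistance
    from tmp_items ti
    left join exampleshares es1 on es1.item = ti.item and es1.`group` = startGroup
    left join exampleshares es2 on es2.item = ti.item and es2.`group` = endGroup;
end//
delimiter ;

-- Example call for one pair of groups; the result is returned in a user variable.
call manhattanDistance('group1', 'group2', @distanceBetweenGroup1And2);
select @distanceBetweenGroup1And2;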
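
As a possible alternative to calling the procedure once per pair of groups, all pairwise distances can probably be computed in a single set-based query. The following is only a minimal sketch, assuming the exampleshares table from the question with at most one row per (item, group) combination. The aliases g1, g2, i, s1, s2 and the output column name m_distance are illustrative and not part of the original answer.

```sql
-- Sketch: Manhattan distance for every ordered pair of groups in one query.
-- Assumption: exampleshares(item, `group`, share) has at most one row per (item, group).
SELECT g1.`group` AS groupX,
       g2.`group` AS groupY,
       -- A share that is missing from a group counts as 0.
       SUM(ABS(IFNULL(s1.share, 0) - IFNULL(s2.share, 0))) AS m_distance
FROM (SELECT DISTINCT `group` FROM exampleshares) AS g1
CROSS JOIN (SELECT DISTINCT `group` FROM exampleshares) AS g2
CROSS JOIN (SELECT DISTINCT item FROM exampleshares) AS i
LEFT JOIN exampleshares AS s1
       ON s1.item = i.item AND s1.`group` = g1.`group`
LEFT JOIN exampleshares AS s2
       ON s2.item = i.item AND s2.`group` = g2.`group`
GROUP BY g1.`group`, g2.`group`
ORDER BY g1.`group`, g2.`group`;
```

On the sample data this should return the nine rows of the distance table in the question, possibly with small floating-point differences if share is stored as FLOAT or DOUBLE. The intermediate result has (number of groups)² × (number of items) rows, so on large data it may help to index (item, `group`) and, because the distance is symmetric, to add a filter such as WHERE g1.`group` < g2.`group` to compute each pair only once.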