我正在查找計算兩個直方圖之間earth mover's distance (EMD)的java代碼(或庫)。這可能是直接或間接的(例如使用匈牙利算法)。我在c/C++中發現了幾個這樣的實現(例如"Fast and Robust Earth Mover's Distances",但是我想知道是否有Java版本可用)用於計算地球移動器距離的Java代碼/庫
我將使用EMD計算來評估this paper一個科學項目我的工作。
更新
利用各種資源,我估計下面的代碼應該做的伎倆。determineMinCostAssignment是最優分配的計算由確定匈牙利算法是我將使用從http://konstantinosnedas.com/dev/soft/munkres.htm 代碼我的主要關注是計算流程:我不知道這是否正確。有人可以證實這是正確的嗎?
/**
* Determines the Earth Mover's Distance between two histogram assuming an equal distance between two buckets of a histogram. The distance between
* two buckets is equal to the differences in the indexes of the buckets.
*
* @param threshold
* The maximum distance to use between two buckets.
*/
public static double determineEarthMoversDistance(double[] histogram1, double[] histogram2, int threshold) {
if (histogram1.length != histogram2.length)
throw new InvalidParameterException("Each histogram must have the same number of elements");
double[][] groundDistances = new double[histogram1.length][histogram2.length];
for (int i = 0; i < histogram1.length; ++i) {
for (int j = 0; j < histogram2.length; ++j) {
int abs_diff = Math.abs(i - j);
groundDistances[i][j] = Math.min(abs_diff, threshold);
}
}
int[][] assignment = determineMinCostAssignment(groundDistances);
double costSum = 0, flowSum = 0;
for (int i = 0; i < assignment.length; i++) {
double cost = groundDistances[assignment[i][0]][assignment[i][1]];
double flow = histogram2[assignment[i][1]];
costSum += cost * flow;
flowSum += flow;
}
return costSum/flowSum;
}
最佳的運輸成本明顯具有線性約束的線性問題。任何線性優化庫(內部點方法都能很好地工作)會做(順便說一句,匈牙利算法在這裏做什麼?你不是在尋找整數解)。其他要搜索的關鍵字是「Monge-Kantorovich distance」,「Wasserstein distance」或「Optimal transportation」。還有一些基於凸優化的算法(您可以直接找到雙Kantorovich問題的凸共軛對phi,phi^*;最適合連續空間)。 – 2012-03-30 12:23:38