Position of the median in a list

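I have an unsorted array and I need the position of the median. I know there are several algorithms that compute the median of a given array in O(n), but all of them involve some kind of reordering of the array, as in median of medians and random selection.

I am not interested in the median itself; only its position in the array matters to me.

Is there any way to do this in O(n)? Keeping track of all the swaps would create a massive overhead, so I am looking for another solution.

Source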

2013-05-28 VoodooCode

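The median does not have to be in the input. For example, the median of [1,1,2,10] is 1.5 – leemes

To be clear: you want to find the median in O(n) without modifying the list? And you can't make a copy? – leonbloy

@leonbloy (Right, ignore...) – 2013-05-28 17:30:24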

Let's say you have an array of data and you want to find its median:

double data[MAX_DATA] = ...

Create an array of indexes, and initialize each index to its own position, like this:

int index[MAX_DATA]; 
for (int i = 0 ; i != MAX_DATA ; i++) { 
    index[i] = i; 
}

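Now implement the linear-time median algorithm with the following changes:

When the original algorithm compares data[i] to data[j], replace it with a comparison of data[index[i]] to data[index[j]].

When the original algorithm swaps data[i] and data[j], swap index[i] and index[j] instead.

Since the elements of data stay in their places the whole time, the modified algorithm produces the position of the median in the unmodified array, rather than its position in an array with some of its elements moved to different spots.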
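The two substitutions can be sketched with a randomized quickselect, one of the selection algorithms the question mentions. This is only an illustration under assumptions: the function name position_of_kth is hypothetical, a std::vector stands in for the fixed-size array, and randomized quickselect is linear in expectation only, not in the worst case (median of medians would be needed for that).

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original answer): returns the position in
// `data` of the element that would have rank k (0-based) if data were sorted.
// Randomized quickselect: expected O(n), worst case O(n^2).
std::size_t position_of_kth(const std::vector<double>& data, std::size_t k) {
    assert(!data.empty() && k < data.size());
    std::vector<std::size_t> index(data.size());
    std::iota(index.begin(), index.end(), std::size_t{0});  // index[i] = i

    std::mt19937 rng(12345);  // fixed seed keeps the sketch reproducible
    std::size_t lo = 0, hi = data.size() - 1;
    while (lo < hi) {
        // move a randomly chosen pivot to the end of the current range
        std::uniform_int_distribution<std::size_t> pick(lo, hi);
        std::swap(index[pick(rng)], index[hi]);
        const double pivot = data[index[hi]];

        // Lomuto partition: compare through index, swap only index entries
        std::size_t store = lo;
        for (std::size_t i = lo; i < hi; ++i) {
            if (data[index[i]] < pivot) {
                std::swap(index[i], index[store]);
                ++store;
            }
        }
        std::swap(index[store], index[hi]);  // pivot is now at its final rank

        if (store == k) break;
        if (store < k) lo = store + 1;
        else hi = store - 1;
    }
    return index[k];  // data itself was never modified
}
```

Called as position_of_kth(data, data.size() / 2) on {1, 5, 2, 20, 10, 7, 9, 1000}, this returns 6, the position of the value 9.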

In C++, you can implement this with pointers instead of indexes, and use std::nth_element on the container of pointers, like this:

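// assumes <algorithm>, <iostream>, <iterator>, <vector> and using namespace std
vector<int> data = {1, 5, 2, 20, 10, 7, 9, 1000}; 
vector<const int*> ptr(data.size()); 
// store the address of every element; data itself is never reordered
transform(data.begin(), data.end(), ptr.begin(), [](const int& d) {return &d;}); 
auto mid = next(ptr.begin(), data.size()/2); 
// partially sort the pointers, comparing the values they point to
nth_element(ptr.begin(), mid, ptr.end(), [](const int* lhs, const int* rhs) {return *lhs < *rhs;}); 
// the pointer difference is the median's position in the original vector
ptrdiff_t pos = *mid - &data[0]; 
cout << pos << endl << data[pos] << endl;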

Here is a link to a demo on ideone.

Source

2013-05-28 17:34:23 dasblinkenlight

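Why not use 'std::nth_element' with a lambda that compares the original data through this index array? – TemplateRex

@rhalbersma You are right, I forgot about the overload that takes a comparator! I edited the answer to reflect your comment. Thanks! – dasblinkenlight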

Is this approach really linear? It doesn't look like it, because it uses two markers. – Xale

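There is an O(n log n) algorithm for keeping track of the median of an infinite stream of numbers. (Since you don't want to alter the list, you can treat it as a stream.) The algorithm involves two heaps: one always has the largest number of the lower half at its top, and the other has the smallest number of the upper half at its top. The algorithm is explained here: http://www.ardendertat.com/2011/11/03/programming-interview-questions-13-median-of-integer-stream/. You can use the same code with minimal customization.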
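One possible customization, sketched here as an assumption rather than taken from the linked article, is to store (value, position) pairs in the heaps so that the top of the lower heap carries the median's position. The function name median_position_two_heaps is hypothetical. For an even number of elements this sketch reports the lower of the two middle elements.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: position of the lower median of a non-empty `data`,
// found in one read-only pass. O(n log n) time, O(n) extra space.
std::size_t median_position_two_heaps(const std::vector<double>& data) {
    assert(!data.empty());
    using Entry = std::pair<double, std::size_t>;  // (value, position)
    std::priority_queue<Entry> lower;              // max-heap: lower half
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> upper;  // min-heap: upper half

    for (std::size_t i = 0; i < data.size(); ++i) {
        // route the new element through lower so that every entry in lower
        // stays <= every entry in upper
        lower.push({data[i], i});
        upper.push(lower.top());
        lower.pop();
        // rebalance: lower keeps the extra element when the count is odd
        if (upper.size() > lower.size()) {
            lower.push(upper.top());
            upper.pop();
        }
    }
    return lower.top().second;
}
```

On {1, 5, 2, 20, 10, 7, 9, 1000} this returns 5, the position of the value 7, which is the lower of the two middle elements 7 and 9.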

Source

2013-05-28 17:43:35

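Here is a working example that generates an auxiliary array of indices and finds the median of the input array through std::nth_element with an indirect comparison.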

#include <algorithm> 
#include <string> 
#include <vector> 
#include <iostream> 
#include <iterator> 

int main() 
{ 
    // input data, big and expensive to sort or copy 
    std::string big_data[] = { "hello", "world", "I", "need", "to", "get", "the", "median", "index" };  

    auto const N = std::distance(std::begin(big_data), std::end(big_data)); 
    auto const M = (N - 1)/2; // 9 elements, median is at index 4 of the sorted array 

    // generate indices 
    std::vector<int> indices; 
    auto value = 0; 
    std::generate_n(std::back_inserter(indices), N, [&](){ return value++; }); 

    // find median of input array through indirect comparison and partial sorting 
    std::nth_element(indices.begin(), indices.begin() + M, indices.end(), [&](int lhs, int rhs){ 
        return big_data[lhs] < big_data[rhs]; 
    }); 
    std::cout << indices[M] << ":" << big_data[indices[M]] << "\n"; 

    // check, sort input array and confirm it has the same median 
    std::sort(std::begin(big_data), std::end(big_data)); 
    std::cout << M << ":" << big_data[M] << "\n"; 
}

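Online output.

The algorithm has O(N) complexity, because its cost is the sum of std::generate_n and std::nth_element, both of which are O(N) in the size of their input (std::nth_element is specified as linear on average).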
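As a comment on the question points out, for an even number of elements the median is the average of the two middle values and has no single position. A minimal sketch of how the same index approach could report both middle positions follows. The name middle_positions is hypothetical, and std::iota is used here in place of std::generate_n.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: positions in `data` of the lower and upper middle
// elements. For an odd number of elements both positions are the same.
std::pair<std::size_t, std::size_t> middle_positions(const std::vector<double>& data) {
    assert(!data.empty());
    std::vector<std::size_t> indices(data.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});

    // compare indices by the values they refer to; data is never modified
    auto by_value = [&](std::size_t lhs, std::size_t rhs) { return data[lhs] < data[rhs]; };
    auto upper = indices.begin() + static_cast<std::ptrdiff_t>(data.size() / 2);
    std::nth_element(indices.begin(), upper, indices.end(), by_value);

    if (data.size() % 2 == 1) return {*upper, *upper};
    // nothing before `upper` compares greater than *upper, so the lower
    // middle element is the largest of those
    auto lower = std::max_element(indices.begin(), upper, by_value);
    return {*lower, *upper};
}
```

For [1,1,2,10] the second position is 2 (the value 2) and the first is one of the positions holding the value 1, so averaging the two values gives 1.5.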

Source

2013-05-28 17:56:43 TemplateRex
