C++地圖查找性能與PHP數組查找性能

我不明白下面，我希望有人能提供一些線索就可以了我：C++地圖查找性能與PHP數組查找性能

在C++中，如果我創建一個包含2M測試數據的矢量文本（TESTDATA）的不同位然後創建使用這些字符串作爲索引值的映射，然後查找所有的值，如下所示：

//Create test data 
for(int f=0; f<loopvalue; f++) 
{ 
    stringstream convertToString; 
    convertToString << f; 
    string strf = convertToString.str(); 
    testdata[f] = "test" + strf; 
} 

    time_t startTimeSeconds = time(NULL); 

    for(int f=0; f<2000000; f++) testmap[ testdata[f] ] = f; //Write to map 
    for(int f=0; f<2000000; f++) result = testmap[ testdata[f] ]; //Lookup 

    time_t endTimeSeconds = time(NULL); 
    cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << endl;

它需要10秒。

如果我這樣做似乎至少同樣在PHP中：

<?php 
$starttime = time(); 
$loopvalue = 2000000; 

//fill array 
for($f=0; $f<$loopvalue; $f++) 
{ 
    $filler = "test" . $f; 
    $testarray[$filler] = $f; 
} 

//look up array 
for($f=0; $f<$loopvalue; $f++) 
{ 
    $filler = "test" . $f; 
    $result = $testarray[$filler]; 
} 

$endtime = time(); 
echo "Time taken ".($endtime-$starttime)." seconds."; 
?>

...只需要3秒鐘。

鑑於PHP是用C編寫的，任何人都知道PHP如何實現這種快得多的文本索引查找？

感謝 Ç

來源

2011-11-09 Columbo

將string-keyed映射切換爲'std :: unordered_map '，並對其差異感到驚訝。你可以爲其他地圖做同樣的事情，來思考它。如果你沒有那個類，可以考慮''中的'std :: tr1 :: unordered_map'。 –

你爲C++代碼使用了哪些編譯器設置？特別是哪種編譯器和哪種優化級別？ – tokage

Is result = testmap [testdata [f]] ** = f **一個錯字？ –

我懷疑你是基準錯誤的東西。無論如何，我用你的代碼（必須對你的數據類型做一些假設），這裏是我的機器的結果：

PHP：花費2秒。

C++（使用std :: map）：需要3秒的時間。

C++（使用std :: tr1 :: unordered_map）：耗時1秒。

C++與

g++ -03

這裏編譯是我測試的C++代碼：

#include <map> 
#include <sstream> 
#include <string> 
#include <vector> 
#include <iostream> 
#include <tr1/unordered_map> 


int main(){ 
    const int loopvalue=2000000; 
    std::vector<std::string> testdata(loopvalue); 
    std::tr1::unordered_map<std::string, int> testmap; 
    std::string result; 
    for(int f=0; f<loopvalue; f++) 
    { 
     std::stringstream convertToString; 
     convertToString << f; 
     std::string strf = convertToString.str(); 
     testdata[f] = "test" + strf; 
    } 

    time_t startTimeSeconds = time(NULL); 

    for(int f=0; f<loopvalue; f++) testmap[ testdata[f] ] = f; //Write to map 
    for(int f=0; f<loopvalue; f++) result = testmap[ testdata[f] ]; //Lookup 

    time_t endTimeSeconds = time(NULL); 
    std::cout << "Time taken " << endTimeSeconds - startTimeSeconds << "seconds." << std::endl; 
}

結論：

您測試未經優化的C++代碼，甚至可能用VC++，默認情況下有編在調試模式下編譯時在std :: vector :: operator []中進行邊界檢查。

當我們使用std :: map時，由於查找複雜性的差異（請參閱n0rd的答案），PHP與優化的C++代碼仍然存在差異，但使用Hashmap時C++速度更快。

來源

2011-11-09 12:15:01 tokage

當然，對於基準測試，除了別的以外，還應該使用更精細的定時功能。 – tokage

你的循環是不完全等同的算法。注意的是，在C++版本你有

testmap [TESTDATA [F] - 這其實是一個查找+插入
testmap [TESTDATA [F] - 2個查找

在您剛剛插入的PHP版本在第一個循環中插入，並在第二個循環中查找。

解釋PHP - 通常如果你的代碼在PHP中速度更快，請先檢查代碼！ ;-)

來源

2011-11-09 11:26:57 Bond

好點，我明白你在說什麼。 – Columbo

根據another question，在PHP關聯陣列被實現爲hash tables，其具有平均搜索O（1）的複雜性，同時在std::map C++是與搜索複雜度O的二進制樹（log n）的，這比較慢。

來源

2011-11-09 11:54:42 n0rd

可以說，這不是哈希表和二叉搜索樹之間的理論複雜性差異，而是字符串比較非常慢的事實。哈希使得*方面效率更高。如果你對以整數爲基礎的地圖進行了相同的嘗試，我懷疑這種差異會非常大。 –

@Kerrek SB你說得對，但是你可以先把數字放在字符串比較的一段時間。 – tokage

C++地圖查找性能與PHP數組查找性能

回答

相關問題