2012-06-15 25 views
0

How do I create an inverted index from a map to a map in C++?

I want to create an inverted index from a map to a map. At the moment, I have this code:

#include <cstring>
#include <fstream>
#include <iostream>
#include <map>
#include <string>

int main()
{
    char lineBuffer[200];
    typedef std::map<std::string, int> MapType;
    std::ifstream archiveInputStream("./hola");

    // map words to their text-frequency
    MapType wordcounts;

    // read the whole archive line by line; testing getline() itself
    // (rather than eof()) avoids processing a failed read
    while (archiveInputStream.getline(lineBuffer, sizeof(lineBuffer)))
    {
        char* currentToken = std::strtok(lineBuffer, " ");

        // while there's a token...
        while (currentToken != NULL)
        {
            // ... operator[] inserts a zero count for a new word,
            // so a single increment covers both the new and the existing case
            ++wordcounts[currentToken];
            currentToken = std::strtok(NULL, " "); // continue with next token
        }
    }

    // display the content once the whole file has been read
    for (MapType::const_iterator it = wordcounts.begin(); it != wordcounts.end();
         ++it)
    {
        std::cout << "Who(key = first): " << it->first;
        std::cout << " Score(value = second): " << it->second << '\n';
    }
}

I have no idea how to approach this problem, because I am a beginner with the map structure.

I would really appreciate your help.

+2

Please be more specific about what help you really need, otherwise it's hard to guess – xmoex

+0

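Thanks for your help. I need help creating an inverted index as a map of maps using that code. Then I need to produce output like this: each word with its respective frequency. –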

+0

Are you trying to create a 'map' indexed by frequency, so that you can get the words that occur '42' times via 'freqm[42]'? – dirkgently

Answers

1

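I think what might help is to create a second map for that index, in which each word count indexes a list of the strings that share that count, like this (similar to a histogram):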

std::map<int, std::list<std::string> > inverted;
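To make the shape of this structure concrete, here is a minimal sketch of how it could be queried once it has been filled. The typedef name `InvertedIndex` and the helper `wordsWithCount` are hypothetical names introduced only for illustration; they are not part of the original answer.

```cpp
#include <list>
#include <map>
#include <string>

// Same shape as the `inverted` map above: word count -> words with that count.
// (Hypothetical typedef name, used only in this sketch.)
typedef std::map<int, std::list<std::string> > InvertedIndex;

// Hypothetical helper: returns the words that occur exactly `count` times,
// or an empty list if no word has that count.
// find() is used instead of operator[] so that a lookup never inserts
// an empty entry into the index.
std::list<std::string> wordsWithCount(const InvertedIndex& inverted, int count)
{
    InvertedIndex::const_iterator it = inverted.find(count);
    if (it == inverted.end())
    {
        return std::list<std::string>();
    }
    return it->second;
}
```

With this, asking for the words that occur 42 times is a single call, `wordsWithCount(inverted, 42)`, and the index itself is left unchanged when the count is absent.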

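So when you are done creating the wordcounts map, you have to insert every string into the inverted index manually, like this (be careful, this code is untested!):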

// wordcounts to inverted index
for (std::map<std::string, int>::iterator it = wordcounts.begin();
     it != wordcounts.end(); ++it)
{
    int wordcountOfString = it->second;
    std::string currentString = it->first;

    std::map<int, std::list<std::string> >::iterator invertedIt =
      inverted.find(wordcountOfString);
    if (invertedIt == inverted.end())
    {
        // insert new list
        std::list<std::string> newList;
        newList.push_back(currentString);
        // let make_pair deduce its types: explicit template arguments
        // fail to compile with lvalue arguments since C++11
        inverted.insert(std::make_pair(wordcountOfString, newList));
    }
    else
    {
        // update existing list
        std::list<std::string>& existingList = invertedIt->second;
        existingList.push_back(currentString);
    }
}
+0

xmoex, thank you for your useful information and help –

+0

I am a new member of this community. How can I accept your answer? –

+0

@Christian Please see the picture in this answer: http://meta.stackexchange.com/a/5235 – anatolyg