2017-07-25 74 views
Efficient lookup of wildcard entries

I have a map whose keys are strings; some of these strings act as wildcards.

A key may end with a *, which means that, during a lookup, any string that has this key (without the *) as a prefix should match this key.

How can I efficiently retrieve the closest matching entry from such a map?
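To make the matching rule concrete, here is a minimal sketch of it as a predicate. The name wildcard_matches is only illustrative, and the sketch assumes that a pattern such as "bar*" also matches the string "bar" itself.

```cpp
#include <string>

// Hypothetical helper, for illustration only.
// A pattern ending in '*' matches every key that starts with the text
// before the '*'; any other pattern matches only an identical key.
bool wildcard_matches(const std::string& pattern, const std::string& key)
{
    if (!pattern.empty() && pattern.back() == '*') {
        const std::string::size_type n = pattern.size() - 1;
        // Compare the first n characters of key with the pattern's prefix.
        return key.compare(0, n, pattern, 0, n) == 0;
    }
    return pattern == key;
}
```

The "closest" match for a lookup string is then the matching pattern with the longest prefix, with an exact entry taking precedence.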

My idea was to order the map entries with a custom comparator and then use lower_bound, but the ordering does not produce correct results:

#include <map> 
#include <string> 
#include <iostream> 
#include <algorithm> 

struct Compare { 
    bool operator()(const std::string& lhs, const std::string& rhs) const 
    { 
     if (lhs.size() < rhs.size()) { 
      return true; 
     } 

     if (lhs.size() > rhs.size()) { 
      return false; 
     } 

     bool isWildcardlhsAtEnd = (!lhs.empty() && lhs.back() == '*'); 
     bool isWildcardrhsAtEnd = (!rhs.empty() && rhs.back() == '*'); 

     if (isWildcardlhsAtEnd && isWildcardrhsAtEnd) { 
      return lhs < rhs; 
     } 
     auto lhSubString = lhs.substr(0, lhs.size() - 1); 
     auto rhsSubString = rhs.substr(0, rhs.size() - 1); 

     if (isWildcardlhsAtEnd || isWildcardrhsAtEnd) { 
      if (lhSubString == rhsSubString) { 
       return !isWildcardlhsAtEnd; 
      } 
      else { 
       return lhSubString < rhsSubString; 
      } 
     } 

     return lhs < rhs; 
    } 
}; 

template <typename Map> 
void lookup(const Map& map, const std::string& key, int expected) 
{ 
    auto it = map.lower_bound(key); 
    if (it != map.end()) { 
     std::cout << "found " << it->first << " for " << key << "; "; 
     std::cout << "expected: " << expected << " got: " << it->second << std::endl; 
    } 
    else { 
     std::cout << "did not find a match for " << key << std::endl; 
    } 
} 

int main() 
{ 
    std::map<std::string, int, Compare> map = { 
     { "bar", 1 }, 
     { "bar*", 2 }, 
     { "foo1", 3 }, 
     { "bar1", 4 }, 
     { "bar1*", 5 }, 
     { "foo1*", 6 }, 
     { "bar12", 7 }, 
     { "bar12*", 8 }, 
     { "foo12", 9 }, 
     { "bar123", 10 }, 
     { "b*", 11 }, 
     { "f*", 12 }, 
     { "b", 13 }, 
     { "f", 14 } 
    }; 

    std::cout << "sorted map \n------" << std::endl; 
    std::for_each(map.begin(), map.end(), [](const auto& e) { std::cout << e.first << std::endl; }); 
    std::cout << "-------" << std::endl; 

    lookup(map, "foo1", 3); 
    lookup(map, "foo123", 6); 
    lookup(map, "foo", 12); 
    lookup(map, "bar1234", 8); 
} 

This produces the following output, which shows that the lookups are incorrect:

sorted map 
------ 
b 
f 
b* 
f* 
bar 
bar1 
bar* 
foo1 
bar12 
bar1* 
foo12 
foo1* 
bar123 
bar12* 
------- 
found foo1 for foo1; expected: 3 got: 3 
did not find a match for foo123 
found bar1 for foo; expected: 12 got: 4 
did not find a match for bar1234 

I am also open to using a different data structure if necessary.
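One alternative data structure that fits this kind of longest-prefix lookup is a trie. The following is only a sketch under assumptions: the class name WildcardTrie and its interface are made up for illustration, values are int as in the example above, and C++17 is assumed for std::optional.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>

// Hypothetical sketch of a character trie. Each node can hold a value for an
// exact key ending at that node and a value for a wildcard pattern whose
// prefix ends at that node.
class WildcardTrie {
public:
    // Stores `pattern` as a prefix pattern if it ends in '*', else as an exact key.
    void insert(const std::string& pattern, int value)
    {
        const bool wild = !pattern.empty() && pattern.back() == '*';
        const std::string::size_type n = wild ? pattern.size() - 1 : pattern.size();
        Node* node = &root_;
        for (std::string::size_type i = 0; i < n; ++i) {
            auto& child = node->children[pattern[i]];
            if (!child) {
                child = std::make_unique<Node>();
            }
            node = child.get();
        }
        if (wild) {
            node->wild = value;
        } else {
            node->exact = value;
        }
    }

    // Returns the exact entry if there is one, otherwise the wildcard entry
    // with the longest matching prefix, otherwise std::nullopt.
    std::optional<int> find(const std::string& key) const
    {
        const Node* node = &root_;
        std::optional<int> best = node->wild; // a bare "*" matches everything
        for (char c : key) {
            auto it = node->children.find(c);
            if (it == node->children.end()) {
                return best; // key leaves the trie: longest wildcard seen so far
            }
            node = it->second.get();
            if (node->wild) {
                best = node->wild;
            }
        }
        if (node->exact) {
            return node->exact;
        }
        return best;
    }

private:
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        std::optional<int> exact;
        std::optional<int> wild;
    };
    Node root_;
};
```

A lookup visits at most one node per character of the key, so its cost depends on the key length rather than on the number of entries. With the entries from the example above, find("bar1234") would return 8 and find("foo") would return 12.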

You cannot sort a map. – 2017-07-25 14:21:17

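I think you are using the wrong data structure for this, and you will probably be better off coming up with something of your own that meets the requirements than trying to force std::map to do something it wasn't really designed for. –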

@Someprogrammerdude Any suggestions? –

Answer

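If you separate the exact search from the wildcard search, the natural ordering of strings works fine. This code seems to produce the desired results (I think), and it is efficient. Of course, the two separate maps could be wrapped up more conveniently.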

#include <map> 
#include <string> 
#include <iostream> 
#include <algorithm> 
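
// Returns true if `pattern` ends in '*' and the text before the '*' is a prefix of `key`.
inline bool matches(const std::string& pattern, const std::string& key)
{
    if (pattern.empty() || pattern.back() != '*') {
        return false;
    }
    const std::string::size_type n = pattern.size() - 1;
    return key.compare(0, n, pattern, 0, n) == 0;
}

template <typename Map>
void lookup(const Map& exact, const Map& wilds, const std::string& key, int expected)
{
    auto it = exact.find(key);
    bool found = (it != exact.end());

    if (!found) { // no exact match: look for the best wildcard match
        // Patterns that are prefixes of key sort before key + "*" as long as the
        // key's characters sort after '*' (true for letters and digits), so walk
        // backwards from there; the first pattern that matches is the longest one.
        it = wilds.upper_bound(key + "*");
        while (!found && it != wilds.begin()) {
            --it;
            found = matches(it->first, key);
        }
    }

    if (found) {
        std::cout << "found " << it->first << " for " << key << "; ";
        std::cout << "expected: " << expected << " got: " << it->second << std::endl;
    }
    else {
        std::cout << "did not find a match for " << key << std::endl;
    }
}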

int main() 
{ 
    std::map<std::string, int> wilds = { 
     { "bar*", 2 }, 
     { "bar1*", 5 }, 
     { "foo1*", 6 }, 
     { "bar12*", 8 }, 
     { "b*", 11 }, 
     { "f*", 12 } 
    }; 
    std::map<std::string, int> exact = { 
     { "bar", 1 }, 
     { "foo1", 3 }, 
     { "bar1", 4 }, 
     { "bar12", 7 }, 
     { "foo12", 9 }, 
     { "bar123", 10 }, 
     { "b", 13 }, 
     { "f", 14 } 
    }; 
    lookup(exact , wilds, "foo1", 3); 
    lookup(exact , wilds,"foo123", 6); 
    lookup(exact , wilds,"foo", 12); 
    lookup(exact , wilds,"bar1234", 8); 
}
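
As for wrapping the two maps more conveniently, the following is one possible sketch, not a definitive implementation. The class name WildcardMap and its interface are made up for illustration, and C++17 is assumed for std::optional. Note that it uses a different lookup technique from the code above: instead of stepping backwards from a bound, it probes the wildcard map with successively shorter prefixes of the key, which does not depend on how '*' sorts relative to other characters.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical wrapper around the two maps, for illustration only.
class WildcardMap {
public:
    // Keys ending in '*' are stored as prefix patterns (without the '*');
    // all other keys are stored as exact entries.
    void insert(const std::string& key, int value)
    {
        if (!key.empty() && key.back() == '*') {
            wilds_[key.substr(0, key.size() - 1)] = value;
        } else {
            exact_[key] = value;
        }
    }

    // Returns the exact entry if there is one, otherwise the wildcard entry
    // with the longest matching prefix, otherwise std::nullopt.
    std::optional<int> find(const std::string& key) const
    {
        auto it = exact_.find(key);
        if (it != exact_.end()) {
            return it->second;
        }
        // Try the prefixes of key from longest (the whole key) to shortest (empty).
        for (std::string::size_type len = key.size() + 1; len-- > 0;) {
            it = wilds_.find(key.substr(0, len));
            if (it != wilds_.end()) {
                return it->second;
            }
        }
        return std::nullopt;
    }

private:
    std::map<std::string, int> exact_;
    std::map<std::string, int> wilds_;
};
```

Each lookup costs at most one map search per prefix of the key, so roughly key length times log(number of wildcard entries) comparisons. With the entries from the question, find("foo123") would return 6 and find("bar1234") would return 8.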