2014-12-07 37 views
1

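Algorithm to efficiently identify duplicates in an array of strings in C++

I have a list/array of IP addresses as strings. I need to determine whether this array contains duplicates and log an error if it does. The array has about 20 elements. What is an efficient way to identify the duplicates?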

+0

Use a data structure with overloaded `==` and `<` operators instead, and store the IPs in a set (preferably a hashed one). – Columbo 2014-12-07 13:15:59

+1

Sort them, then check adjacent pairs. – 2014-12-07 13:17:20

+0

Converting from string form first might be a win. – 2014-12-07 13:17:49
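The two comments above (convert from string form, then sort and check adjacent pairs) can be combined into a minimal sketch. It assumes dotted-decimal IPv4 addresses, which the question does not confirm, and the names `parse_ipv4` and `duplicate_addresses` are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses dotted-decimal IPv4 text such as "192.168.0.1"
// into a 32-bit value. Returns false if the text is malformed.
bool parse_ipv4(const std::string& text, std::uint32_t& out)
{
    std::istringstream in(text);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        // Every octet after the first must be preceded by a dot.
        char dot = '\0';
        if (i > 0 && !(in.get(dot) && dot == '.')) return false;
        int octet = -1;
        if (!(in >> octet) || octet < 0 || octet > 255) return false;
        value = (value << 8) | static_cast<std::uint32_t>(octet);
    }
    // Reject trailing characters after the fourth octet.
    if (in.peek() != std::istringstream::traits_type::eof()) return false;
    out = value;
    return true;
}

// Hypothetical helper: returns each numeric address that occurs more than
// once (reported once each). Malformed entries are skipped in this sketch.
std::vector<std::uint32_t> duplicate_addresses(const std::vector<std::string>& addresses)
{
    std::vector<std::uint32_t> values;
    for (const auto& text : addresses) {
        std::uint32_t v = 0;
        if (parse_ipv4(text, v)) values.push_back(v);
    }

    // After sorting, equal addresses sit next to each other.
    std::sort(values.begin(), values.end());

    std::vector<std::uint32_t> duplicates;
    for (std::size_t i = 1; i < values.size(); ++i) {
        if (values[i] == values[i - 1] &&
            (duplicates.empty() || duplicates.back() != values[i])) {
            duplicates.push_back(values[i]);
        }
    }
    return duplicates;
}
```

A non-empty result from `duplicate_addresses` means an error should be logged. With only about 20 elements the gain over comparing strings directly is likely small, so this is mainly a design choice about validating the input up front.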

Answers

2
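  1. Sort the original array
  2. Iterate over the sorted array and count the distinct values
  3. Create a new array with the size found in (2)
  4. Copy the values from the original to the new array, skipping duplicates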
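The four steps above can be sketched in C++ roughly as follows. This is an illustration under the assumption that the addresses are held in a `std::vector<std::string>`; the function name `sorted_unique` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns a sorted copy of the input with duplicates removed.
std::vector<std::string> sorted_unique(std::vector<std::string> values)
{
    // 1. Sort. The parameter is taken by value, so the caller's data is untouched.
    std::sort(values.begin(), values.end());

    // 2. Count the distinct values: in sorted order, each one starts a new run.
    std::size_t distinct = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i == 0 || values[i] != values[i - 1]) ++distinct;
    }

    // 3. Create the new array with room for exactly that many elements.
    std::vector<std::string> result;
    result.reserve(distinct);

    // 4. Copy the values across, skipping adjacent duplicates.
    std::unique_copy(values.begin(), values.end(), std::back_inserter(result));
    return result;
}
```

To merely detect duplicates, it is enough to compare sizes: if `sorted_unique(addresses).size() != addresses.size()`, the input contained at least one repeated entry.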

Pseudocode in bash:

$ cat 1.txt 
1 
2 
3 
66 
1 
1 
66 
3 
7 
7 
7 
7 
26 

$ cat 1.txt | sort | uniq 

1 
2 
26 
3 
66 
7 
$ cat 1.txt | sort | uniq | wc -l 
     7 
+0

Wrong question? – 2014-12-07 13:23:14

+1

No, I was just too lazy to write C++ code, so I gave the asker a rough idea instead... :) – elcuco 2014-12-07 13:25:39

+0

I'm afraid that won't cut it. Someone might as well hand over a C++ book as an answer, huh? – 2014-12-07 13:26:13

2

You can use a map<string, int> to mark the addresses already seen, together with the index at which each address first appeared:

#include <iostream>
#include <map>
#include <string>
#include <vector>

void check_dups(const std::vector<std::string>& addresses) { 
    std::map<std::string, int> seen; 
    for (int i=0,n=addresses.size(); i<n; i++) { 
     std::map<std::string, int>::iterator it = seen.find(addresses[i]); 
     if (it == seen.end()) { 
      // Never used before, mark the position 
      seen[addresses[i]] = i; 
     } else { 
      // Duplicated value, emit a warning 
      std::cout << "Duplicate address at index " << i << 
         " (present already at index " << it->second << ")\n"; 
     } 
    } 
} 
+1

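Nice, but std::map is implemented as a red-black tree, so each insertion has O(log n) complexity. Overall that gives O(n log n), which is no better than simply sorting the array and looking for adjacent equal elements. – Krystian 2014-12-07 13:25:48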

+0

@Krystian If that is your main concern, just use `std::unordered_map`. – 2014-12-07 13:26:49

+0

E_net4: but you need C++11 support to use std::unordered_map – Krystian 2014-12-07 13:29:10
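The hashed-container variant discussed in these comments might look like the sketch below. It assumes a C++11 compiler, as noted above, and uses `std::unordered_set` rather than `std::unordered_map` because only membership is needed here. The function name `duplicate_indices` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the indices of entries that repeat an
// address already seen earlier in the input.
std::vector<std::size_t> duplicate_indices(const std::vector<std::string>& addresses)
{
    std::unordered_set<std::string> seen;
    std::vector<std::size_t> duplicates;
    for (std::size_t i = 0; i < addresses.size(); ++i) {
        // insert() returns a pair; its bool member is false when the
        // value was already present in the set.
        if (!seen.insert(addresses[i]).second) {
            duplicates.push_back(i);
        }
    }
    return duplicates;
}
```

Insertion into a hash set takes constant time on average, so the whole pass is O(n) on average instead of O(n log n). For an array of about 20 elements the difference is unlikely to be measurable.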

0

Here are 3 reasonably efficient ways, off the top of my head:

#include <iostream> 
#include <algorithm> 
#include <string> 
#include <vector> 
#include <set> 
#include <iterator> 

// returns a sorted, de-duplicated copy 
std::vector<std::string> de_duplicated(std::vector<std::string> vec) 
{ 
    std::set<std::string> interim { vec.begin(), vec.end() }; 
    vec.assign(interim.begin(), interim.end()); 
    return vec; 
} 

// sorts and de-duplicates in place 
void de_duplicate(std::vector<std::string>& vec) 
{ 
    std::sort(std::begin(vec), std::end(vec)); 

    auto current = std::begin(vec); 

    do { 
     auto last = std::end(vec); 
     current = std::adjacent_find(current, last); 
     if (current != last) { 
      auto last_same = std::find_if_not(std::next(current), 
               last, 
               [&current](const std::string& s) { 
                return s == *current; 
               }); 
      current = vec.erase(std::next(current), last_same); 
     } 
    } while(current != std::end(vec)); 

} 

// returns a de-duplicated copy, preserving order 
std::vector<std::string> de_duplicated_stable(const std::vector<std::string>& vec) 
{ 
    std::set<std::string> index; 
    std::vector<std::string> result; 
    for (const auto& s : vec) { 
     if (index.insert(s).second) { 
      result.push_back(s); 
     } 
    } 

    return result; 
} 





using namespace std; 


int main() { 

    std::vector<std::string> addresses { "d", "a", "c", "d", "c", "a", "c", "d" }; 

    cout << "before" << endl; 
    std::copy(begin(addresses), end(addresses), ostream_iterator<string>(cout, ", ")); 
    cout << endl; 

    auto deduplicated = de_duplicated(addresses); 
    cout << endl << "sorted, de-duplicated copy" << endl; 
    std::copy(begin(deduplicated), end(deduplicated), ostream_iterator<string>(cout, ", ")); 
    cout << endl; 

    deduplicated = de_duplicated_stable(addresses); 
    cout << endl << "de-duplicated copy, preserving order" << endl; 
    std::copy(begin(deduplicated), end(deduplicated), ostream_iterator<string>(cout, ", ")); 
    cout << endl; 

    de_duplicate(addresses); 
    cout << endl << "sorted, de-duplicated in-place" << endl; 
    std::copy(begin(addresses), end(addresses), ostream_iterator<string>(cout, ", ")); 
    cout << endl; 

    return 0; 
} 