2014-09-03
2

Memory leak when using regex.h?

A minimal code example follows:

#include <cstdlib> 
#include <iostream> 
#include <string> 
#include <vector> 
#include <regex.h> 

using namespace std; 

class regex_result { 
public: 
    /** Contains indices of starting positions of matches.*/ 
    std::vector<int> positions; 
    /** Contains lengths of matches.*/ 
    std::vector<int> lengths; 
}; 

regex_result match_regex(string regex_string, const char* string) { 
    regex_result result; 
    regex_t* regex = new regex_t; 
    regcomp(regex, regex_string.c_str(), REG_EXTENDED); 
    /* "pointer" points into the string, at the end of the 
     previous match. */ 
    const char* pointer = string; 
    /* "n_matches" is the maximum number of matches allowed. */ 
    const int n_matches = 10; 
    regmatch_t matches[n_matches]; 
    int nomatch = 0; 
    while (!nomatch) { 
     nomatch = regexec(regex, pointer, n_matches, matches, 0); 
     if (nomatch) 
      break; 
     for (int i = 0; i < n_matches; i++) { 
      int start, 
       finish; 
      if (matches[i].rm_so == -1) { 
       break; 
      } 
      start = matches[i].rm_so + (pointer - string); 
      finish = matches[i].rm_eo + (pointer - string); 
      result.positions.push_back(start); 
      result.lengths.push_back(finish - start); 
     } 
     pointer += matches[0].rm_eo; 
    } 
    delete regex; 
    return result; 
} 

int main(int argc, char** argv) { 
    string str = "this is a test"; 
    string pat = "this"; 
    regex_result res = match_regex(pat, str.c_str()); 
    cout << res.positions.size() << endl; 
    return 0; 
} 

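So I wrote a function that parses a given string for regular-expression matches. The results are stored in a class that is basically two vectors: one for the positions of the matches and one for the corresponding match lengths.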

This works fine, but when I run valgrind, it reports a sizeable memory leak.

Running valgrind --leak-check=full on the code above, I get:

==24843== Memcheck, a memory error detector 
==24843== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. 
==24843== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info 
==24843== Command: ./test 
==24843== 
1 
==24843== 
==24843== HEAP SUMMARY: 
==24843==  in use at exit: 11,688 bytes in 37 blocks 
==24843== total heap usage: 54 allocs, 17 frees, 12,868 bytes allocated 
==24843== 
==24843== 256 bytes in 1 blocks are definitely lost in loss record 14 of 18 
==24843== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) 
==24843== by 0x543549A: regcomp (regcomp.c:487) 
==24843== by 0x400ED0: match_regex(std::string, char const*) (in <path>) 
==24843== by 0x4010CA: main (in <path>) 
==24843== 
==24843== 11,432 (224 direct, 11,208 indirect) bytes in 1 blocks are definitely lost in  loss record 18 of 18 
==24843== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) 
==24843== by 0x4C2CF1F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) 
==24843== by 0x5434BAF: re_compile_internal (regcomp.c:760) 
==24843== by 0x54354FF: regcomp (regcomp.c:506) 
==24843== by 0x400ED0: match_regex(std::string, char const*) (in <path>) 
==24843== by 0x4010CA: main (in <path>) 
==24843== 
==24843== LEAK SUMMARY: 
==24843== definitely lost: 480 bytes in 2 blocks 
==24843== indirectly lost: 11,208 bytes in 35 blocks 
==24843==  possibly lost: 0 bytes in 0 blocks 
==24843== still reachable: 0 bytes in 0 blocks 
==24843==   suppressed: 0 bytes in 0 blocks 
==24843== 
==24843== For counts of detected and suppressed errors, rerun with: -v 
==24843== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) 

Is my code wrong, or is there really a bug in those files?

Answers

4

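Your regex_t does not need to be allocated dynamically. That is not directly related to your problem, but it is a bit odd. The real problem is that you never call regfree() on the compiled expression once regcomp() has succeeded (and you should check that it did). You should set up your regex like this: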

regex_t regex; 
int res = regcomp(&regex, regex_string.c_str(), REG_EXTENDED); 
if (res == 0) 
{ 
    // use your expression via &regex 
    .... 

    // and eventually free it when done. 
    regfree(&regex); 
} 
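
For reference, here is how the question's function might look with that change applied. This is a sketch rather than a definitive rewrite: the parameter names, the `cursor` variable, and the guard against zero-length matches are my own additions, and it sticks to C++03. Like the original, it records every reported sub-match, not only the whole match.

```cpp
#include <cstddef>
#include <regex.h>
#include <string>
#include <vector>

// Same shape as the question's result class: start offsets and lengths.
struct regex_result {
    std::vector<int> positions;
    std::vector<int> lengths;
};

regex_result match_regex(const std::string& pattern, const char* text) {
    regex_result result;
    regex_t regex;  // automatic storage, so no new/delete is needed

    // Check the return value: regfree() is only needed after a successful compile.
    if (regcomp(&regex, pattern.c_str(), REG_EXTENDED) != 0) {
        return result;
    }

    const std::size_t n_matches = 10;  // max (sub)matches reported per regexec() call
    regmatch_t matches[n_matches];
    const char* cursor = text;         // end of the previous match

    while (regexec(&regex, cursor, n_matches, matches, 0) == 0) {
        for (std::size_t i = 0; i < n_matches && matches[i].rm_so != -1; ++i) {
            const int start = static_cast<int>(matches[i].rm_so + (cursor - text));
            const int finish = static_cast<int>(matches[i].rm_eo + (cursor - text));
            result.positions.push_back(start);
            result.lengths.push_back(finish - start);
        }
        // Added guard (not in the question): a zero-length match at the cursor
        // would never advance it, so stop instead of looping forever.
        if (matches[0].rm_eo == 0) {
            break;
        }
        cursor += matches[0].rm_eo;
    }

    regfree(&regex);  // releases everything regcomp() allocated
    return result;
}
```

Rerunning valgrind --leak-check=full on a build that uses this version is the way to confirm the two regcomp records are gone. Note that, as in the question, each regexec() call treats the cursor as the start of a line; passing REG_NOTBOL on later iterations may be what you want if the pattern uses ^.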

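If your implementation supports it, I strongly recommend using the <regex> library that C++11 provides, since it gives you a nice RAII solution to much of this.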
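
A rough sketch of what that could look like, assuming a toolchain with a working C++11 <regex>. The function name `find_matches` and the pair-based return type are made up for illustration. Unlike the question's code, it records only whole matches, not sub-groups, and an invalid pattern throws std::regex_error instead of returning an error code.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (position, length) for every whole match.
std::vector<std::pair<int, int> > find_matches(const std::string& pattern,
                                               const std::string& text) {
    std::vector<std::pair<int, int> > found;

    // std::regex owns its compiled state and frees it in its destructor,
    // so there is no regfree() equivalent to forget.
    const std::regex re(pattern, std::regex::extended);

    // A default-constructed sregex_iterator is the end-of-sequence marker.
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), re); it != end; ++it) {
        found.push_back(std::make_pair(static_cast<int>(it->position(0)),
                                       static_cast<int>(it->length(0))));
    }
    return found;
}
```

The std::regex::extended flag selects POSIX extended syntax, which is the closest counterpart to REG_EXTENDED; without it the default grammar is ECMAScript.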

+0

Ah, thank you. I accepted your answer, even though it came in slightly later, because you provided the extra information. – kunterbunt 2014-09-03 15:35:05

+0

C++11 is not an option at the moment, which is why I am doing it this way. – kunterbunt 2014-09-03 15:45:52

2

You must call regfree() to release the memory allocated by regcomp().
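
If remembering that call on every exit path is a concern and C++11 is unavailable, one option is a small C++03 wrapper whose destructor does it for you. The class `compiled_regex` and the helper `contains_match` below are hypothetical names, not part of regex.h; treat this as a sketch under those assumptions.

```cpp
#include <cstddef>
#include <regex.h>
#include <string>

// Hypothetical RAII wrapper: pairs regcomp() with regfree() automatically.
class compiled_regex {
public:
    compiled_regex(const std::string& pattern, int cflags)
        : ok_(regcomp(&regex_, pattern.c_str(), cflags) == 0) {}

    // Free only if compilation succeeded.
    ~compiled_regex() {
        if (ok_) {
            regfree(&regex_);
        }
    }

    bool ok() const { return ok_; }
    const regex_t* get() const { return &regex_; }

private:
    // Non-copyable (C++03 style): a copy would lead to a double regfree().
    compiled_regex(const compiled_regex&);
    compiled_regex& operator=(const compiled_regex&);

    regex_t regex_;  // declared before ok_, so it exists when ok_ is initialised
    bool ok_;
};

// Example use: does the pattern match anywhere in text?
bool contains_match(const std::string& pattern, const char* text) {
    compiled_regex re(pattern, REG_EXTENDED | REG_NOSUB);
    return re.ok() && regexec(re.get(), text, 0, NULL, 0) == 0;
}
```

The regex is released when `re` goes out of scope, including on early returns and exceptions.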