2017-08-21 48 views
3

Matching a varying number of lines with C++ std::regex_replace

I can extract a four-line string with this C++ std::regex snippet:

std::regex table("(<table id.*\n.*\n.*\n.*>)");
const std::string format = "$&";
std::cout << std::regex_replace(tidy_string(/* */),
                                table,
                                format,
                                std::regex_constants::format_no_copy
                                    | std::regex_constants::format_first_only)
          << '\n';

tidy_string() returns a std::string, and the code produces the following output:

<table id="creditPolicyTable" class= 
           "table table-striped table-condensed datatable top-bold-border bottom-border" 
           summary= 
           "This table of Credit Policy gives credit information (column headings) for list of exams (row headings)."> 

How can I match text that spans a different number of lines instead of exactly four? For example:

<table id="creditPolicyTable" summary= 
           "This table of Credit Policy gives credit information (column headings) for list of exams (row headings)."> 

Or:

<table id="creditPolicyTable" 
    class="table table-striped table-condensed datatable top-bold-border bottom-border" 
    summary="This table of Credit Policy gives credit information (column headings) for list of exams (row headings)." 
more="x" 
even_more="y"> 
+0

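You can use '(<table[^>]*>)'. This matches everything up to the first '>', so it gives you the content of the '<table>' tag (assuming there are no escaped '>' characters inside it). In general, I don't think regular expressions are the best way to parse XML/HTML; have you considered using an XML parser (e.g. libxml2)? – ThePhysicist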

+0

Those later <table> tags: did you mean to write something like "…"? – AndyG

+0

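By the way, the '.*' operators used above are "greedy", i.e. they try to match as many characters as possible, which can be a problem if you have a very long document with many "<table>" tags in it. – ThePhysicist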

Answers

0

You should use std::regex_search to lazily search for anything but the '>' character. Like this:

#include <iostream>
#include <regex>
#include <string>

int main() {
    // Two sample tags, each spanning a different number of lines.
    const std::string lines[] = {
        "<table id=\"creditPolicyTable\" class=\n"
        "\"table table-striped table-condensed datatable top-bold-border bottom-border\"\n"
        "summary=\n"
        "\"This table of Credit Policy gives credit information (column headings) for list of exams (row headings).\">",
        "<table id=\"creditPolicyTable\" summary=\n"
        "\"This table of Credit Policy gives credit information (column headings) for list of exams (row headings).\"\n"
        "more=\"x\"\n"
        "even_more=\"y\">"};
    std::smatch table_match;

    // [^>] also matches '\n', so the tag may span any number of lines.
    // The lazy '+?' is harmless here: the class can never consume the closing '>'.
    const std::regex table_regex("<table\\sid=[^>]+?>");

    for (const auto& line : lines) {
        if (std::regex_search(line, table_match, table_regex)) {
            // The pattern has no capture groups, so index 0 (the whole match) is the only entry.
            for (std::size_t i = 0; i < table_match.size(); ++i)
                std::cout << "Match found " << table_match[i] << '\n';
        }
    }
}