2012-01-13 94 views

I have a few questions about boost::regex. I tried the example below. The regular expression does not return any results.

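1) What is the 4th argument to sregex_token_iterator? It sounds like a "default match", but why would you want that instead of just returning nothing? I tried it without the 4th argument, but it does not compile.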

2) I get the output: (1,0) (0,0) (3,0) (0,0) (5,0)

Can anyone explain what is going wrong?

#include <iostream> 
#include <sstream> 
#include <vector> 
#include <boost/regex.hpp> 

// This example extracts X and Y from (X , Y), (X,Y), (X, Y), etc. 


struct Point 
{ 
    int X; 
    int Y; 
    Point(int x, int y): X(x), Y(y){} 
}; 

typedef std::vector<Point> Polygon; 

int main() 
{ 
    Polygon poly; 
    std::string s = "Polygon: (1.1,2.2), (3, 4), (5,6)"; 

    std::string floatRegEx = "[0-9]*\\.?[0-9]*"; // zero or more digits, then an optional '.', then zero or more digits
    // "\\." yields the regex \. : the first backslash is the C++ escape character, the second is the regex escape character
    //const boost::regex r("(\\d+),(\\d+)"); 
    const boost::regex r("(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)"); 
    // \s is white space. We want this to allow (2,3) as well as (2, 3) or (2 , 3) etc. 

    const boost::sregex_token_iterator end; 
    std::vector<int> v; // This type has nothing to do with the type of objects you will be extracting 
    v.push_back(1); 
    v.push_back(2); 

    for (boost::sregex_token_iterator i(s.begin(), s.end(), r, v); i != end;) 
    { 
        std::stringstream ssX;
        ssX << (*i).str();
        float x;
        ssX >> x;
        ++i;

        std::stringstream ssY;
        ssY << (*i).str();
        float y;
        ssY >> y;
        ++i;

        poly.push_back(Point(x, y));
    } 

    for(size_t i = 0; i < poly.size(); ++i) 
    { 
        std::cout << "(" << poly[i].X << ", " << poly[i].Y << ")" << std::endl;
    } 
    std::cout << std::endl; 

    return 0; 
} 

Have you tried libpcre? – 2012-01-13 17:37:53

I don't want to introduce more dependencies. – 2012-01-13 20:50:07

Answer

Every part of your regular expression is optional:

"[0-9]*\\.?[0-9]*" 

so it also matches the empty string. As a result, "(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)" also matches a lone comma.
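A minimal sketch for checking this on the question's own input. It uses `std::regex` from C++11 as a stand-in for `boost::regex` (an assumption made here so that no extra library is needed; the two interfaces are close for this purpose), and the helper names `allMatches` and `questionPattern` are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects the text of every non-overlapping match of `pattern` in `text`.
std::vector<std::string> allMatches(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    std::vector<std::string> found;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
    {
        found.push_back(it->str());
    }
    return found;
}

// The pattern from the question, assembled the same way.
std::string questionPattern()
{
    const std::string floatRegEx = "[0-9]*\\.?[0-9]*";
    return "(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)";
}
```

Calling `allMatches(questionPattern(), "Polygon: (1.1,2.2), (3, 4), (5,6)")` should yield five matches rather than three: the three coordinate pairs plus the two ", " separators between the parenthesised groups. That would line up with the five points printed in the question.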

You should make at least something mandatory:

"(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)" 

This allows 1, 1.1, 1. and .1 but not a lone dot (.):

(?:   # Either match... 
[0-9]+  # one or more digits, then 
(?:   # try to match... 
    \.   # a dot 
    [0-9]*  # and optional digits 
)?   # optionally. 
|   # Or match... 
\.[0-9]+ # a dot and one or more digits. 
)   # End of alternation 
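
The difference between the two number patterns can be checked directly. The sketch below again assumes `std::regex` in place of `boost::regex`; `matchesWhole`, `kLooseFloat` and `kStrictFloat` are hypothetical names introduced for illustration.

```cpp
#include <regex>
#include <string>

// Pattern from the question: every part is optional.
const std::string kLooseFloat = "[0-9]*\\.?[0-9]*";
// Pattern from the answer: at least one digit is required.
const std::string kStrictFloat = "(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)";

// Returns true when the whole of `text` matches `pattern`.
bool matchesWhole(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    return std::regex_match(text, re);
}
```

For example, `matchesWhole(kLooseFloat, "")` and `matchesWhole(kLooseFloat, ".")` both succeed, while the strict pattern rejects those two inputs and still accepts "1", "1.1", "1." and ".1".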

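Tim, I thought the ? only made the dot optional? I do want to allow 1. as well as .1, which is why I used [0-9]* on both sides of the decimal point. How do I make the . optional? Also, I updated the question with a more appropriate parsing problem. However, I get 5 outputs instead of the 3 I expected? – 2012-01-13 17:25:07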

OK, I've edited my regex. I'll add an explanation later. The '*' also makes the digits optional. – 2012-01-13 17:29:05

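Thanks Tim. That is certainly better than making the whole thing optional. But with the hard-coded input, shouldn't my non-robust expression still work? I think I'm now doing something wrong on the C++ side, given the output I show in #2 of the original post? – 2012-01-13 17:33:38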
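
On the C++ side, one plausible reading of the output in #2 is the following. The fourth argument of the token iterator lists which sub-matches it yields for each match (0 is the whole match, n is the n-th capture group, -1 is the text between matches). In the question's pattern the outer parentheses are not escaped, so they form capture group 1 instead of matching literal parentheses, and there is no group 2 to extract. A hedged sketch of one way to address this is shown here, assuming the parentheses in the input are meant literally and using `std::regex` as a stand-in for `boost::regex`. `PointF` and `parsePoints` are hypothetical names.

```cpp
#include <regex>
#include <string>
#include <vector>

struct PointF
{
    double x;
    double y;
};

// Extracts every "(X, Y)" pair from `text`.
std::vector<PointF> parsePoints(const std::string& text)
{
    // Number pattern from the answer: at least one digit is required.
    const std::string num = "(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)";
    // \( and \) match literal parentheses; the two unescaped groups capture X and Y.
    const std::regex re("\\(\\s*(" + num + ")\\s*,\\s*(" + num + ")\\s*\\)");
    // Ask the iterator for capture group 1, then capture group 2, of each match.
    const std::vector<int> groups = {1, 2};

    std::vector<PointF> points;
    const std::sregex_token_iterator end;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, groups); it != end;)
    {
        const double x = std::stod(it->str());
        ++it; // advance from group 1 to group 2 of the same match
        const double y = std::stod(it->str());
        ++it; // advance to group 1 of the next match, or to end
        points.push_back(PointF{x, y});
    }
    return points;
}
```

With the input from the question, `parsePoints("Polygon: (1.1,2.2), (3, 4), (5,6)")` should return three points. Storing the coordinates as `double` rather than `int` also keeps the fractional part of values such as 1.1.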
