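I want to extract sentences from a text. The text is full of paragraphs and of delimiters such as "!", ",", "." or other line separators. I can do this with regular expressions, but I would like to avoid a regex library. Is there a C++ class that splits text into sentences, so that each sentence is pulled out of the text and stored separately in some data structure?
Otherwise, the alternative is to compare each character against the separator characters, but I don't know how to do that with a vector. Any help is appreciated.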
Here is how it goes with regular expressions (using Boost.Regex):
#include <string>
#include <vector>
#include <iostream>
#include <iterator>
#include <boost/regex.hpp>
int main()
{
    /* Input. */
    std::string input = "Here is a short sentence. Here is another one. And we say \"this is the final one.\", which is another example.";
    /* Define sentence boundaries (mod_x lets the pattern contain insignificant whitespace). */
    boost::regex re("(?: [\\.\\!\\?]\\s+" // case 1: punctuation followed by whitespace
                    "| \\.\\\",?\\s+"     // case 2: end of quotation (period, closing quote, optional comma)
                    "| \\s+\\\")",        // case 3: start of quotation (whitespace, opening quote)
                    boost::regex::perl | boost::regex::mod_x);
    /* Iterate through sentences; submatch -1 selects the text between boundary matches. */
    boost::sregex_token_iterator it(begin(input), end(input), re, -1);
    boost::sregex_token_iterator endit;
    /* Copy them onto a vector. */
    std::vector<std::string> vec;
    std::copy(it, endit, std::back_inserter(vec));
    /* Output the vector, so we can check. */
    std::copy(begin(vec), end(vec),
              std::ostream_iterator<std::string>(std::cout, "\n"));
    return 0;
}