Splitting a paragraph from a separate text file into separate strings

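I would like some advice/help on splitting a paragraph from a separate text file into its own strings. The code I have so far just counts the total number of words in the paragraph, but I would like to split it so that each line is one sentence, then count how many words are in that sentence/line, and then put that into its own array so that I can do other things with a specific sentence/line. Here is my code: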

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    std::ifstream inFile;
    inFile.open("Rhymes.txt", std::ios::in);
    if (inFile.is_open())
    {
        string word;
        unsigned long wordCount = 0;

        // Loop on the extraction itself; testing eof() before reading
        // can count the last word twice.
        while (inFile >> word)
        {
            wordCount++;
        }

        cout << "The file had " << wordCount << " word(s) in it." << endl;
    }

    system("PAUSE");
    return 0;
}

The separate text file is called "Rhymes.txt" and contains:

Today you are You, that is truer than true. There is no one alive who is Youer than You. 
The more that you read, the more things you will know. The more that you learn, the more places you'll go. 
How did it get so late so soon? Its night before its afternoon. 
Today was good. Today was fun. Tomorrow is another one. 
And will you succeed? Yes indeed, yes indeed! Ninety-eight and three-quarters percent guaranteed! 
Think left and think right and think low and think high. Oh, the things you can think up if only you try! 
Unless someone like you cares a whole awful lot, nothing is going to get better. It's not. 
I'm sorry to say so but, sadly it's true that bang-ups and hang-ups can happen to you.

So the first line is its own sentence, and when the code runs it should say:

The line has 19 words in it

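I'm also a bit confused about how I would go about doing this. I have seen examples of splitting a sentence into words, but I couldn't find anything I could really understand that relates to what I'm asking.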

Source

2014-10-05 Mathmath

It's "separate", not "seperate". – 2014-10-05 09:20:13

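Assuming that words are separated by exactly one whitespace character and that there is no plenking/klemping, you can count the words with std::count. Reading line by line can be done with std::getline.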

#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

int main()
{
    // Simulating the file:
    std::istringstream inFile(
R"(Today you are You, that is truer than true. There is no one alive who is Youer than You.
The more that you read, the more things you will know. The more that you learn, the more places you'll go.
How did it get so late so soon? Its night before its afternoon.
Today was good. Today was fun. Tomorrow is another one.
And will you succeed? Yes indeed, yes indeed! Ninety-eight and three-quarters percent guaranteed!
Think left and think right and think low and think high. Oh, the things you can think up if only you try!
Unless someone like you cares a whole awful lot, nothing is going to get better. It's not.
I'm sorry to say so but, sadly it's true that bang-ups and hang-ups can happen to you.)");

    std::vector<std::string> lines; // This vector will contain all lines.

    for (std::string str; std::getline(inFile, str, '\n');)
    {
        // Number of words = number of single spaces + 1.
        std::cout << "The line has " << std::count(str.begin(), str.end(), ' ') + 1 << " words in it\n";
        lines.push_back(std::move(str)); // Avoid the copy.
    }

    for (auto const& s : lines)
        std::cout << s << '\n';
}

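If you need each sentence's word count later, store a std::pair<std::string, std::size_t> that holds both the line and its word count. Declare lines as std::vector<std::pair<std::string, std::size_t>> (the final loop then prints s.first) and change the loop body to this: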

    std::size_t count = std::count(str.begin(), str.end(), ' ') + 1;
    std::cout << "The line has " << count << " words in it\n";
    lines.emplace_back(std::move(str), count);

Source

2014-10-05 09:06:41 Columbo

I would write something like:

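#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

// Reads one line from standard input and splits it into
// whitespace-separated words.
vector<string> read_line()
{
    string line, w;
    vector<string> words;

    getline(cin, line);
    stringstream ss(line);

    while (ss >> w)
        words.push_back(w);

    return words;
}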

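The returned vector contains the information you need: the number of words and the words themselves (with their punctuation, which you can easily remove).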

vector<string> words = read_line(); 
cout << "This line has " << words.size() << " words in it" << endl;
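
Removing the punctuation is left open above. One possible way, shown here only as a sketch, is to trim punctuation from both ends of each word so that "You," becomes "You" while "you'll" and "three-quarters" stay intact. The name trim_punctuation is hypothetical, and std::ispunct classifies characters according to the current C locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: strips leading and trailing punctuation from a word,
// leaving punctuation inside the word (apostrophes, hyphens) untouched.
std::string trim_punctuation(const std::string& word)
{
    std::size_t first = 0;
    std::size_t last = word.size();

    // The cast to unsigned char avoids undefined behaviour for negative char values.
    while (first < last && std::ispunct(static_cast<unsigned char>(word[first])))
        ++first;
    while (last > first && std::ispunct(static_cast<unsigned char>(word[last - 1])))
        --last;

    return word.substr(first, last - first);
}
```

A word made up entirely of punctuation comes back as an empty string, so the caller may want to skip empty results.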

To read all the lines, you do:

while (true)
{
    vector<string> words = read_line();
    if (words.empty()) break; // end of input, but also the first blank line

    // process line
}
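
Because the loop above stops at the first line without words, a blank line in the middle of the input ends it early. A variant that tests the result of std::getline instead might look like the following sketch; read_all_lines is a hypothetical name, and the function reads from any std::istream (for example an std::ifstream opened on Rhymes.txt) rather than from cin.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns one vector of words per input line.
// Blank lines yield an empty vector instead of ending the loop.
std::vector<std::vector<std::string>> read_all_lines(std::istream& in)
{
    std::vector<std::vector<std::string>> result;
    std::string line;

    // std::getline fails only at end of input (or on a stream error),
    // so the loop does not stop at blank lines.
    while (std::getline(in, line))
    {
        std::istringstream ss(line);
        std::vector<std::string> words;
        for (std::string w; ss >> w;)
            words.push_back(w);
        result.push_back(std::move(words));
    }
    return result;
}
```

Since operator>> skips any run of whitespace, result[i].size() is the word count of line i even when words are separated by several spaces or the line ends with one.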

Source

2014-10-05 12:50:14 saadtaame
