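I'm working on a multi-file program that takes a text file, removes the punctuation, and then builds an index of each word along with the lines it appears on. The code compiles and runs, but the output is not what I want. I'm fairly sure the problem is in how the punctuation is handled. Whenever a word is followed by a period, that word gets recorded twice, even though I strip out the punctuation. The final word is then output several more times, claiming it appears on lines that don't exist in the file. Any help would be appreciated! Why do I get extra index values when mapping text to line numbers from a text file with a C++ multimap?
Input file: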
dogs run fast.
dogs bark loud.
cats sleep hard.
cats are not dogs.
Thank you.
#
C++ code:
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <map>
using namespace std;
int main(){
    ifstream input;
    input.open("NewFile.txt");
    if (!input)
    {
        cout << "Error opening file." << endl;
        return 0;
    }
    multimap< string, int, less<string> > words;
    int line; //int variable line
    string word;//string variable word
    // For each line of text, the length of input, increment line
    for (line = 1; input; line++)
    {
        char buf[ 255 ];//create a character with space of 255
        input.getline(buf, 128);//buf is pointer to array of chars where
        //extracted, 128 is maximum num of chars to write to s.
        // Discard all punctuation characters, leaving only words
        for (char *p = buf;
             *p != '\0';
             p++)
        {
            if (ispunct(*p))
                *p = ' ';
        }
        //
        istringstream i(buf);
        while (i)
        {
            i >> word;
            if (word != "")
            {
                words.insert(pair<const string,int>(word, line));
            }
        }
    }
    input.close();
    // Output results
    multimap< string, int, less<string> >::iterator it1;
    multimap< string, int, less<string> >::iterator it2;
    for (it1 = words.begin(); it1 != words.end();)
    {
        it2 = words.upper_bound((*it1).first);
        cout << (*it1).first << " : ";
        for (; it1 != it2; it1++)
        {
            cout << (*it1).second << " ";
        }
        cout << endl;
    }
    return 0;
}
Output:
Thank : 5
are : 4
bark : 2
cats : 3 4
dogs : 1 2 4 4
fast : 1 1
hard : 3 3
loud : 2 2
not : 4
run : 1
sleep : 3
you : 5 5 6 7
Desired output:
Thank : 5
are : 4
bark : 2
cats : 3 4
dogs : 1 2 4
fast : 1
hard : 3
loud : 2
not : 4
run : 1
sleep : 3
you : 5
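Thanks in advance for your help!
And when you stepped through this in a debugger, what did you see? –
@RichardCritten Ah! For some reason it adds an extra count at the end of each sentence. It executes line 44, 'words.insert(pair(word,line));', one extra time. Why does it do that? Shouldn't it stop, since the punctuation has been removed? –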
cparks10
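One likely explanation for the extra insert: `while (i)` checks the stream state before the extraction rather than after it. Once the period has been replaced by a space, the stream is still in a good state after the last word has been read, so the loop body runs once more; `i >> word` then fails, `word` still holds its previous value, and that value is inserted a second time. The outer `for (line = 1; input; line++)` loop has the same shape, which would account for the last word showing up on lines 6 and 7. Below is a minimal sketch of one way to restructure the loops so that the read itself is the loop condition. The helper name `build_index` is hypothetical (it is not in the original post), and the sketch swaps the fixed `char` buffer for `std::getline` into a `std::string`.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): builds a
// word -> line-number index from any input stream.
std::multimap<std::string, int> build_index(std::istream& input)
{
    std::multimap<std::string, int> words;
    std::string text;
    int line = 0;

    // The read is the loop condition, so the body only runs
    // when a line was actually extracted.
    while (std::getline(input, text))
    {
        ++line;

        // Replace punctuation with spaces, leaving only words.
        for (char& c : text)
        {
            if (std::ispunct(static_cast<unsigned char>(c)))
                c = ' ';
        }

        std::istringstream tokens(text);
        std::string word;

        // The extraction is the loop condition, so a failed >>
        // never reaches the insert.
        while (tokens >> word)
            words.emplace(word, line);
    }
    return words;
}
```

To use it with the file from the question, open `NewFile.txt` with a `std::ifstream`, pass it to `build_index`, and print the result with the existing `upper_bound` loop. With the sample input this should give each word only the lines it actually appears on, and the `#` line contributes nothing because it contains no words once the punctuation is removed.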