2017-07-31 77 views
-1

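I need help writing code that splits lines of text into words so that each word can then be spell-checked. How do I split a line read from a BufferedReader into words?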

public static void main(String [] args) throws IOException { 
    Stem myStem = new Stem(); 

    BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new FileInputStream("C:\\Users\\lamrh\\IdeaProjects\\untitled1\\src\\bigON\\data.txt"))); 

    //String currentWord = String.valueOf(bufferedReader.readLine()); 
    Scanner scanner = new Scanner(bufferedReader.readLine()); 
    //byte[] data = new byte [currentWord.length()]; 
    String[] splitLines; 
    //splitLines = splitLines.split(" "); 


    String line; 
    while((line = bufferedReader.readLine()) !=null ){ 
     //splitLines = line.split(" "); 
     String currentWord1 = formatWordGhizou (line); 
     System.out.println(""+ line+""+ ":"+ currentWord1); 

    } 
    bufferedReader.close(); 


} 

The result it gives me is:

سْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم 

سْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم ِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ:سماللهالرحمنالرحيم

It should process the text word by word, not line by line. Any help is appreciated, thanks.

+0

Can you provide the source of the function `formatWordGhizou()`? –

+0

[Why is "Can someone help me?" not an actual question?](http://meta.stackoverflow.com/q/284236) – EJoshuaS

+0

The question is whether there is any way to split a line that has already been read by the BufferedReader into words –

Answers

-1
// format the word by removing any punctuation, diacritics and non-letter characters
private static String formatWordGhizou (String currentWord) 
{ 
    StringBuffer modifiedWord = new StringBuffer (); 


    // remove any diacritics (short vowels) 
    if (removeDiacritics(currentWord, modifiedWord)) 
    { 
     currentWord = modifiedWord.toString (); 
    } 

    // remove any punctuation from the word 
    if (removePunctuation(currentWord, modifiedWord)) 
    { 
     currentWord = modifiedWord.toString () ; 
    } 

    // there could also be characters that aren't letters which should be removed 
    if (removeNonLetter (currentWord, modifiedWord)) 
    { 
     currentWord = modifiedWord.toString (); 
    } 

    // stem the word only if it is neither a strange word nor a stopword
    if (!checkStrangeWords (currentWord))
    {
     if (!checkStopwords (currentWord))
     {
      currentWord = stemWord (currentWord);
     }
    }

    return currentWord; 
} 
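
The helpers called above (`removeDiacritics`, `removePunctuation`, and so on) are not shown in the post. As a rough illustration of the first step only, here is a minimal sketch of stripping Arabic short-vowel marks with a regular expression. The class name `DiacriticsSketch` and the method `stripDiacritics` are hypothetical and are not the poster's implementation; the sketch assumes that the diacritics to drop are exactly the Unicode non-spacing marks (category `Mn`).

```java
class DiacriticsSketch {

    // Hypothetical stand-in for the removeDiacritics step, not the code from the post.
    // Assumption: every diacritic to remove is a Unicode non-spacing mark (category Mn),
    // which covers the Arabic short vowels, sukun and shadda.
    static String stripDiacritics(String word) {
        return word.replaceAll("\\p{Mn}+", "");
    }

    public static void main(String[] args) {
        // seen + sukun + meem + kasra, written with escapes to avoid encoding issues
        String vocalized = "\u0633\u0652\u0645\u0650";
        String bare = stripDiacritics(vocalized);
        System.out.println(bare.length()); // prints 2: only seen and meem remain
    }
}
```

Note that this step only removes marks inside a word; it has no effect on where one word ends and the next begins, so the spaces between words have to be handled separately.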

//----------------- 
0

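In the while loop, try concatenating the lines into a single string. After the loop, use a regular expression to populate the string array splitLines, then iterate over splitLines and send each element to standard output, as follows (adapted from a helpful tutorial):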

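StringBuilder lines = new StringBuilder();

// line is the String declared before the loop in the question.
// Append a space after each line so that the last word of one line
// does not merge with the first word of the next.
while ((line = bufferedReader.readLine()) != null) {
    lines.append(line).append(' ');
}

// Split on runs of whitespace; trim() avoids an empty first element.
String[] splitLines = lines.toString().trim().split("\\s+");

for (String word : splitLines) {
    System.out.println(word);
}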
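
An alternative that avoids holding the whole file in one string is to split each line as soon as it is read. The following is only a sketch under stated assumptions: the class `LineWordSplitter` and its methods `splitLine` and `readWords` are names made up for illustration, and a `StringReader` stands in for the data file so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LineWordSplitter {

    // Splits one line into words on runs of whitespace.
    // Blank lines yield an empty list, because "".split(...) would
    // otherwise return a single empty string.
    static List<String> splitLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Reads every line from the source and collects the words in reading order.
    static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                words.addAll(splitLine(line));
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: sample text replaces the real file for demonstration.
        Reader sample = new StringReader("first line here\n\n  second   line ");
        for (String word : readWords(sample)) {
            System.out.println(word);
        }
    }
}
```

In the question's program, each word returned by `splitLine` could be passed to `formatWordGhizou` instead of passing the whole line, which is what makes the spaces disappear in the output shown there.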