2014-04-04 49 views
0

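nextLine() of Java Scanner not working properly (possibly because of Unicode)

I have the following string, which I read from a file using the nextLine() method of the Scanner class: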

some_string = "All the staff in the operating room has been specifically trained with a theoretical and practical 20-hour course.\xe2\x80\xa9Results: The overall average incidence of adverse events reported was determined by 4.8%, is consistent with the expectations of the study protocol, and is at a lower level than the average median rate of international studies (8.9%).\n" 

I create the Scanner object as follows:

Scanner br = new Scanner(new File("location of my file"), "UTF-8"); 

Then I read the lines like this:

while (br.hasNextLine()) { 
     System.out.println(br.nextLine()); 
} 

I get:

>All the staff in the operating room has been specifically trained with a theoretical and practical 20-hour course. 
>Results: The overall average incidence of adverse events reported was determined by 4.8%, is consistent with the expectations of the study protocol, and is at a lower level than the average median rate of international studies (8.9%). 

It looks like nextLine() fails when there are non-ASCII characters: the single line above is printed as two. Any idea why this happens?

Are you sure the file is encoded as UTF-8? –

@DavidWallace Yes. On further thought, I noticed that the sequence '\xe2\x80\xa9' is some kind of paragraph separator, according to http://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128&utf8=string-literal – kolonel

@DavidWallace Any idea how to avoid splitting on anything that is not a newline character? – kolonel
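
The observation in the comments can be reproduced without a file. The following is a minimal sketch, assuming the behavior seen in the question: Scanner.nextLine() treats U+2029 (PARAGRAPH SEPARATOR, encoded in UTF-8 as E2 80 A9) as the end of a line, in addition to \n. The class name ParagraphSeparatorDemo and the helper scanLines are made up for illustration, and the sample text is shortened from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ParagraphSeparatorDemo {
    // Collects every line that Scanner.nextLine() returns for the given text.
    static List<String> scanLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                lines.add(sc.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // The text contains U+2029 in the middle and a single '\n' at the end.
        String text = "20-hour course.\u2029Results: The overall average\n";
        for (String line : scanLines(text)) {
            System.out.println(">" + line);
        }
    }
}
```

With this input the loop yields two lines rather than one, which matches the output shown in the question. So the split appears to be caused by this particular character, not by non-ASCII characters in general.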

Answers

1

Try this:

Scanner scanner = new Scanner(new File("the file"), "UTF-8").useDelimiter("\n");

while (scanner.hasNext())
    System.out.println(scanner.next());

Thanks, but the next() method throws a NoSuchElementException. Any idea? – kolonel

On the same line? Or is this from a different part of the file? – Scott

Throughout the whole file. – kolonel
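
For reference, here is a self-contained sketch of the useDelimiter("\n") suggestion, run against an in-memory string instead of a file. The class name NewlineOnlyDemo and the helper splitOnNewline are made up for illustration. It assumes Unix line endings. Every call to next() is guarded by hasNext(), which is the documented way to avoid a NoSuchElementException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class NewlineOnlyDemo {
    // Splits the text on '\n' only, so U+2029 stays inside the returned token.
    static List<String> splitOnNewline(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(text).useDelimiter("\n")) {
            while (sc.hasNext()) {   // guard each next() call
                tokens.add(sc.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "20-hour course.\u2029Results: The overall average\nSecond line\n";
        for (String token : splitOnNewline(text)) {
            System.out.println(">" + token);
        }
    }
}
```

Note that useDelimiter takes a regular expression, and that a file with Windows line endings would leave a trailing \r on each token with this delimiter. If hasNext() returns false earlier than expected while reading a file, Scanner.ioException() returns the last IOException thrown by the underlying source, or null if there was none, which may help with diagnosis.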

0

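I just ran into this problem. Unfortunately, in my case Scanner did not cope with the non-ASCII characters: when it reached one in the file, it behaved as if it had hit the end of the file. That is why hasNext or hasNextLine returned false! You can change your approach and use BufferedReader to read the file.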

BufferedReader br = new BufferedReader(new FileReader(file)); 
String line; 
while ((line = br.readLine()) != null) { 
    System.out.println(line); 
}
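
A slightly fuller sketch of the BufferedReader approach follows, with the charset stated explicitly, since FileReader(file) on its own decodes with the platform default charset on older Java versions. The class name ReadLinesUtf8 and the helper readLines are made up for illustration. The helper takes any InputStream, so for a real file you would pass a FileInputStream. BufferedReader.readLine() ends a line only at \n, \r, or \r\n, so U+2029 remains part of the line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReadLinesUtf8 {
    // Decodes the stream as UTF-8 and returns its lines as split by readLine().
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the file; for a real file use new FileInputStream(file).
        byte[] bytes = "20-hour course.\u2029Results: The overall average\n"
                .getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(bytes))) {
            System.out.println(">" + line);
        }
    }
}
```

Here the sample text comes back as a single line with the paragraph separator still inside it, which is what the question was after.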