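nextLine() of Java's Scanner not working properly (possibly because of Unicode)
I am using the nextLine() method of the Scanner class to read a file that contains the following string: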
some_string = "All the staff in the operating room has been specifically trained with a theoretical and practical 20-hour course.\xe2\x80\xa9Results: The overall average incidence of adverse events reported was determined by 4.8%, is consistent with the expectations of the study protocol, and is at a lower level than the average median rate of international studies (8.9%).\n"
I create the Scanner object like this:
Scanner br = new Scanner(new File("location of my file"), "UTF-8");
Then I read it line by line:
while (br.hasNextLine()) {
    System.out.println(br.nextLine());
}
I get two lines instead of one:
>All the staff in the operating room has been specifically trained with a theoretical and practical 20-hour course.
>Results: The overall average incidence of adverse events reported was determined by 4.8%, is consistent with the expectations of the study protocol, and is at a lower level than the average median rate of international studies (8.9%).
It looks like nextLine() fails when the text contains non-ASCII characters. Any idea why this happens?
Are you sure the file is encoded as UTF-8? –
@DavidWallace Yes. On further thought, I noticed that the sequence '\xe2\x80\xa9' is some kind of paragraph separator, according to http://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128&utf8=string-literal – kolonel
@DavidWallace Any ideas on how to avoid splitting on anything that is not a newline character? – kolonel
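For what it's worth, here is a minimal sketch of why this happens and two possible ways around it. The bytes '\xe2\x80\xa9' are the UTF-8 encoding of U+2029 (PARAGRAPH SEPARATOR), and Scanner.nextLine() treats that character as a line break alongside \n, \r, \r\n, U+2028 and U+0085. BufferedReader.readLine() only ends a line at \n, \r or \r\n, and a Scanner can be given an explicit delimiter and read with next() instead of nextLine(). The class name LineSplitDemo, the method names and the shortened sample text are made up for illustration, and the input is an in-memory string rather than the file from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; names are illustrative only.
public class LineSplitDemo {

    // Reproduces the question: nextLine() also breaks at U+2029.
    static List<String> withScannerNextLine(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                lines.add(sc.nextLine());
            }
        }
        return lines;
    }

    // readLine() only ends a line at LF, CR, or CR followed by LF.
    static List<String> withBufferedReader(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Keeps Scanner, but splits on an explicit newline-only delimiter
    // and reads tokens with next() instead of nextLine().
    static List<String> withScannerDelimiter(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner sc = new Scanner(text).useDelimiter("\r\n|\n|\r")) {
            while (sc.hasNext()) {
                lines.add(sc.next());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the string in the question.
        String text = "20-hour course.\u2029Results: The overall average incidence.\n";
        System.out.println("Scanner.nextLine():  " + withScannerNextLine(text).size() + " line(s)");
        System.out.println("BufferedReader:      " + withBufferedReader(text).size() + " line(s)");
        System.out.println("Scanner + delimiter: " + withScannerDelimiter(text).size() + " line(s)");
    }
}
```

For a real file, the same loop should work with a reader obtained from Files.newBufferedReader(Paths.get("location of my file"), StandardCharsets.UTF_8). Note that the paragraph separator stays inside the returned line, so it may still need to be replaced or stripped before printing.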