Java Scanner hasNext(String) method sometimes does not match

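I am trying to use the Java Scanner hasNext method, but I am getting strange results. Maybe my problem is obvious, but why does the simple regular expression "[a-zA-Z']+" not work for words like these: "points, anything, supervisor,"? I also tried "[\\w']+".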

public HashMap<String, Integer> getDocumentWordStructureFromPath(File file) {
    HashMap<String, Integer> dictionary = new HashMap<>();
    try {
        Scanner lineScanner = new Scanner(file);
        while (lineScanner.hasNextLine()) {
            Scanner scanner = new Scanner(lineScanner.nextLine());
            while (scanner.hasNext("[\\w']+")) {
                String word = scanner.next().toLowerCase();
                if (word.length() > 2) {
                    int count = dictionary.containsKey(word) ? dictionary.get(word).intValue() + 1 : 1;
                    dictionary.put(word, new Integer(count));
                }
            }
            scanner.close();
        }
        //scanner.useDelimiter(DELIMITER);
        lineScanner.close();

        return dictionary;

    } catch (FileNotFoundException e) {
        e.printStackTrace();
        return null;
    }
}

2013-04-07 flatronka

Your regular expression should be [^a-zA-Z]+ and it should be used as the delimiter, because you need to split on everything that is not a letter:

// previous code...
    Scanner scanner = new Scanner(lineScanner.nextLine()).useDelimiter("[^a-zA-Z]+");
    while (scanner.hasNext()) {
        String word = scanner.next().toLowerCase();
        // ...your other code
    }
}
// ... after code
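
For reference, here is a minimal self-contained sketch of the delimiter approach applied to a plain string rather than a file. The class name `DelimiterWordCount` and the method `countWords` are made up for illustration. Adding the apostrophe to the character class is an assumption based on the pattern in the question, so that words like "don't" stay in one piece.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper class, not part of the original question or answer.
class DelimiterWordCount {

    // Counts words longer than two characters, ignoring case.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Assumption: any run of characters that are neither letters nor
        // apostrophes separates two words.
        try (Scanner scanner = new Scanner(text).useDelimiter("[^a-zA-Z']+")) {
            while (scanner.hasNext()) {
                String word = scanner.next().toLowerCase();
                if (word.length() > 2) {
                    // merge() inserts 1 or adds 1 to the existing count.
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Hello World. A test, ok. Hello again!"));
    }
}
```

With this input, "hello" is counted twice, "world", "test" and "again" once each, and "A" and "ok" are dropped by the length check.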

EDIT: Why doesn't it work with the hasNext(String) method?

This line:

Scanner scanner = new Scanner(lineScanner.nextLine());

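creates a scanner that uses the default whitespace pattern as its delimiter, so if you have, for example, the test line "Hello World. A test, ok." it will give you these tokens:

Hello
World.
A
test,
ok.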
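
The token list above can be checked with a small sketch like the following. The class name `DefaultTokens` and the method `tokens` are hypothetical; the only thing it relies on is that a `Scanner` created without `useDelimiter` splits on whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class: lists the tokens a default Scanner produces.
class DefaultTokens {

    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        // No useDelimiter call, so tokens are separated by whitespace only.
        try (Scanner scanner = new Scanner(line)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Punctuation stays attached to the preceding word.
        for (String token : tokens("Hello World. A test, ok.")) {
            System.out.println(token);
        }
    }
}
```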

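Then, if you call scanner.hasNext("[a-zA-Z]+"), you are asking the scanner whether the next token matches your pattern. In this example it returns true for the first token:

Hello (because it is the first token and it matches the given pattern)

The next token (World.) doesn't match the pattern, so scanner.hasNext("[a-zA-Z]+") returns false. Since hasNext does not consume the token, the while loop ends there and the rest of the line is never read. So it will never work for any word that has a non-letter character attached to it. Got it?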
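
A short sketch that reproduces this behavior is shown below. The class `HasNextPatternDemo` and its method `matchedBeforeStop` are invented for the demonstration; the loop is the same shape as the one in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class: shows where a hasNext(pattern) loop stops.
class HasNextPatternDemo {

    // Collects tokens until the first whitespace-delimited token
    // that does not fully match the pattern.
    static List<String> matchedBeforeStop(String line, String pattern) {
        List<String> matched = new ArrayList<>();
        try (Scanner scanner = new Scanner(line)) {
            // hasNext(pattern) tests the whole next token and does not skip it,
            // so the loop ends at the first token that fails to match.
            while (scanner.hasNext(pattern)) {
                matched.add(scanner.next());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // prints [Hello], because "World." contains a trailing period
        System.out.println(matchedBeforeStop("Hello World. A test, ok.", "[a-zA-Z]+"));
    }
}
```

Every token after "World." is left unread, which is why words later in the line never reach the dictionary.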

Hope this helps.

2013-04-07 17:48:24

Thanks a lot @Angel Rodriguez, this is a good solution, but I don't understand why it doesn't work with the hasNext(String) method. – flatronka 2013-04-07 18:02:34

OK, I see what you mean. I have edited my answer and explained why it doesn't work. Hope it helps. – 2013-04-07 18:35:39

Thank you very much, I get it now. Thanks for your help. +1 for the detailed explanation. – flatronka 2013-04-07 23:00:24
