Stanford tagger NullPointerException

I am getting an odd exception from the entrails of StanfordNLP when trying to tokenize:

    java.lang.NullPointerException
        at edu.stanford.nlp.process.PTBLexer.zzRefill(PTBLexer.java:24511)
        at edu.stanford.nlp.process.PTBLexer.next(PTBLexer.java:24718)
        at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:276)
        at edu.stanford.nlp.process.PTBTokenizer.getNext(PTBTokenizer.java:163)
        at edu.stanford.nlp.process.AbstractTokenizer.hasNext(AbstractTokenizer.java:55)
        at edu.stanford.nlp.process.DocumentPreprocessor$PlainTextIterator.primeNext(DocumentPreprocessor.java:270)
        at edu.stanford.nlp.process.DocumentPreprocessor$PlainTextIterator.hasNext(DocumentPreprocessor.java:334)

The code that causes it looks like this:

    DocumentPreprocessor dp = new DocumentPreprocessor(new StringReader(tweet));

    // unigrams
    for (List<HasWord> sentence : dp) {
        for (HasWord word : sentence) {
            // do stuff
        }
    }

    // bigrams
    for (List<HasWord> sentence : dp) { // << exception is thrown here
        Iterator<HasWord> it = sentence.iterator();
        String st1 = it.next().word();
        while (it.hasNext()) {
            String st2 = it.next().word();
            String bigram = st1 + " " + st2;
            // do stuff
            st1 = st2;
        }
    }

What is going on? Does it have to do with me looping over the tokens twice?


2014-12-18 kutschkem

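This is certainly an ugly stack trace, which can and should be improved. (I'm about to check in a fix for that.) But the reason this doesn't work is that a DocumentPreprocessor acts like a Reader: it only lets you make a single pass through the sentences of a document. So after the first for-loop, the document is exhausted and the underlying Reader has been closed. Hence the second for-loop fails, and here it crashes deep inside the lexer. I'm going to change it so that it will just give you nothing. But to get what you want, you should either (most efficiently) collect both the unigrams and the bigrams in one for-loop pass through the document, or create a second DocumentPreprocessor for the second pass.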
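The single-pass option can be sketched roughly as follows. To keep the sketch runnable without CoreNLP on the classpath, each sentence is represented as a plain List<String> instead of the List<HasWord> that DocumentPreprocessor yields; that substitution, and the names SinglePassNgrams, Ngrams and collect, are assumptions made for illustration and are not part of the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SinglePassNgrams {

    /** Unigrams and bigrams gathered during one traversal. */
    static final class Ngrams {
        final List<String> unigrams = new ArrayList<>();
        final List<String> bigrams = new ArrayList<>();
    }

    /**
     * Walks the sentences exactly once, so it also works when the
     * Iterable can only be traversed a single time.
     * Assumption: tokens are plain strings standing in for HasWord.word().
     */
    static Ngrams collect(Iterable<List<String>> sentences) {
        Ngrams out = new Ngrams();
        for (List<String> sentence : sentences) {   // the only pass over the document
            String previous = null;                 // bigrams do not cross sentence boundaries
            for (String word : sentence) {
                out.unigrams.add(word);
                if (previous != null) {
                    out.bigrams.add(previous + " " + word);
                }
                previous = word;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
                Arrays.asList("I", "like", "tea", "."),
                Arrays.asList("Bye"));
        Ngrams n = collect(sentences);
        System.out.println(n.unigrams);
        System.out.println(n.bigrams);
    }
}
```

With the real library, the outer loop would presumably iterate over the DocumentPreprocessor itself and read each token with word(), as in the question's code.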


2014-12-19 05:42:12

I think it.next().word() is causing it.

Change your code so that you check it.hasNext() first, and only then call it.next().word().
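A minimal sketch of that guard is shown below, again using plain strings in place of HasWord so that it runs without CoreNLP; the class and method names are hypothetical. Note that an unguarded it.next() on an empty sentence would throw NoSuchElementException, so this check protects against empty sentences and is a separate concern from iterating over the document twice.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class GuardedBigrams {

    /** Builds the bigrams of one sentence, tolerating empty sentences. */
    static List<String> bigrams(List<String> sentence) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = sentence.iterator();
        if (!it.hasNext()) {
            return out;              // nothing to read, so no call to next()
        }
        String st1 = it.next();      // safe: hasNext() was checked above
        while (it.hasNext()) {
            String st2 = it.next();
            out.add(st1 + " " + st2);
            st1 = st2;
        }
        return out;
    }
}
```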


2014-12-18 22:16:21
