Error: annotator "sentiment" requires annotator "binarized_trees"

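Can anyone help me understand when this error might occur? Any ideas are really appreciated. Do I need to add anything, such as another annotator? Is this a problem with the data, or with the models I am passing in place of the default ones?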

I am using Stanford NLP 3.4.1 to compute sentiment for social media data. When I run it through a Spark/Scala job, I get the error below for some of the data.

java.lang.IllegalArgumentException: annotator "sentiment" requires annotator "binarized_trees" 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:300) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125) 
    at com.pipeline.sentiment.NonTwitterSentimentAndThemeProcessorAction$.create(NonTwitterTextEnrichmentComponent.scala:142) 
    at com.pipeline.sentiment.NonTwitterTextEnrichmentInitialized.action$lzycompute(NonTwitterTextEnrichmentComponent.scala:52) 
    at com.pipeline.sentiment.NonTwitterTextEnrichmentInitialized.action(NonTwitterTextEnrichmentComponent.scala:50) 
    at com.pipeline.sentiment.NonTwitterTextEnrichmentInitialized.action(NonTwitterTextEnrichmentComponent.scala:49)

Here is the Scala code I have:

def create(features: Seq[String] = Seq("tokenize", "ssplit", "pos","parse","sentiment")): TwitterSentimentAndThemeAction = { 
     println("comes inside the TwitterSentimentAndThemeProcessorAction create method") 
     val props = new Properties() 
     props.put("annotators", features.mkString(", ")) 
     props.put("pos.model", "tagger/gate-EN-twitter.model"); 
     props.put("parse.model", "tagger/englishSR.ser.gz"); 
     val pipeline = new StanfordCoreNLP(props)

Any help is much appreciated. Thanks for your help.

Source

2015-05-26 user2052854

Are you running this code on one machine with one thread? –

No, I am running it on Hadoop/Spark with 200 partitions – user2052854

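Huh; I thought sentiment only required the parse annotator. What happens if you explicitly add the BinarizerAnnotator? That is, add 'binarizer' to the annotators, along with the following property: 'props.setProperty("customAnnotatorClass.binarizer", "edu.stanford.nlp.pipeline.BinarizerAnnotator")' –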
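
The suggestion in the comment above could be tried with something like the following minimal sketch. It only builds the Properties object, so it runs without the CoreNLP jars; passing the result to `new StanfordCoreNLP(props)` is left to the caller. `BinarizerProps` and `build` are hypothetical names, and placing `binarizer` between `parse` and `sentiment` is an assumption about where it belongs in the chain. Whether this clears the error on the cluster is untested here.

```scala
import java.util.Properties

object BinarizerProps {
  // Hypothetical helper: builds pipeline properties that register the
  // BinarizerAnnotator under the custom annotator name "binarizer".
  def build(annotators: Seq[String]): Properties = {
    val props = new Properties()
    props.setProperty("annotators", annotators.mkString(", "))
    // Property name and class are taken from the comment above.
    props.setProperty(
      "customAnnotatorClass.binarizer",
      "edu.stanford.nlp.pipeline.BinarizerAnnotator")
    props
  }
}

// Assumed ordering: binarizer runs after parse and before sentiment.
val props = BinarizerProps.build(
  Seq("tokenize", "ssplit", "pos", "parse", "binarizer", "sentiment"))
```

The model properties from the question (`pos.model`, `parse.model`) would still need to be set on the same object before the pipeline is constructed.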

...Are you sure this is the error you are getting? With your code, I get this error:

Loading parser from serialized file tagger/englishSR.ser.gz ...edu.stanford.nlp.io.RuntimeIOException: java.io.IOException: Unable to resolve "tagger/englishSR.ser.gz" as either class path, filename or URL

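This makes more sense. The shift-reduce parser model lives at edu/stanford/nlp/models/srparser/englishSR.ser.gz. If I don't use the shift-reduce model, the code as written works fine for me; likewise, if I use the model path above, it works fine.

I think the exact code is: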

#!/bin/bash 
exec scala -J-mx4g "$0" "$@" 
!# 

import scala.collection.JavaConversions._ 
import edu.stanford.nlp.pipeline._ 
import java.util._ 

val props = new Properties() 
props.put("annotators", Seq("tokenize", "ssplit", "pos","parse","sentiment").mkString(", ")) 
props.put("parse.model", "edu/stanford/nlp/models/srparser/englishSR.ser.gz"); 
val pipeline = new StanfordCoreNLP(props)
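
Since the two model paths in this thread differ only in where they are expected to be found, a quick check of whether a given path is visible on the classpath or on the local filesystem can help narrow things down, especially when run on each worker. The following is a minimal sketch under that assumption; `ModelPathCheck` and `locate` are hypothetical names, and the lookup only approximates the "class path, filename or URL" resolution named in the error message (URLs are not checked).

```scala
import java.io.File

object ModelPathCheck {
  // Hypothetical helper: reports where a model path can be found, if anywhere.
  def locate(path: String): Option[String] = {
    // The context class loader may be null, so wrap it in Option.
    val onClasspath = Option(Thread.currentThread().getContextClassLoader)
      .flatMap(cl => Option(cl.getResource(path)))
    if (onClasspath.isDefined) Some("classpath")
    else if (new File(path).isFile) Some("file")
    else None
  }
}

// Paths taken from the question and the answer above.
Seq("tagger/englishSR.ser.gz",
    "edu/stanford/nlp/models/srparser/englishSR.ser.gz").foreach { p =>
  println(s"$p -> ${ModelPathCheck.locate(p).getOrElse("not found")}")
}
```

If a path resolves on the driver but not inside a task, that would point toward the model file not being shipped to the workers rather than toward the annotator list.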

Source

2015-05-27 19:51:21

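Gabor, thank you for looking into this. I have downloaded the 3.4.1-compatible model englishSR.ser.gz and placed it in the tagger directory. In the normal case I don't see the error. I get this error when I run with Spark/Hadoop over 200 partitions – user2052854

I suspect this is a misconfiguration of the Spark cluster. I'm not sure I'm qualified to help there, though... –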
