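CRFClassifier does not recognize the sentence splitter options

I am using CoreNLP to annotate named entities (NEs) in multi-line English text. When I do the following: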
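import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

// Pipeline with a sentence break forced at every newline
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner");
props.put("ssplit.newlineIsSentenceBreak", "always");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
String contentStr = "John speaks with Martin\n\nJeremy talks to him too.";
Annotation document = new Annotation(contentStr);
pipeline.annotate(document);
// Print each sentence found by the splitter
List<CoreMap> sents = document.get(SentencesAnnotation.class);
for (int i = 0; i < sents.size(); i++) {
    System.out.println("sentence " + i + " " + sents.get(i));
}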
Sentence splitting works correctly and recognizes two sentences. However, when I use the NER classifier as follows:
CRFClassifier classifier = CRFClassifier.getClassifier("edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz", props);
String classifiedStr = classifier.classifyWithInlineXML(contentStr);
I get the following error messages:
Unknown property: |ssplit.newlineIsSentenceBreak|
Unknown property: |annotators|
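and the classifier seems to treat the whole text as a single sentence, producing the misrecognized entity "Martin Jeremy" instead of two separate entities.
Any idea what is wrong?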
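One workaround I can think of, which I have not verified against CoreNLP: the "Unknown property" lines suggest that CRFClassifier ignores the pipeline's ssplit options, so the text could be split at line breaks by hand and each segment classified on its own. Below is a minimal sketch of that idea. The names LineWiseNer, splitOnLineBreaks and tagPerSegment are hypothetical, and the Function<String, String> argument is only a stand-in for the classifier so that the snippet runs without CoreNLP on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of CoreNLP.
final class LineWiseNer {

    // Split the text at runs of line breaks and drop empty segments.
    public static List<String> splitOnLineBreaks(String text) {
        List<String> segments = new ArrayList<>();
        for (String piece : text.split("\\R+")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                segments.add(trimmed);
            }
        }
        return segments;
    }

    // Run the tagger on each segment separately and rejoin with newlines.
    // In real use the tagger would presumably be something like
    // classifier::classifyWithInlineXML (an assumption, untested here).
    public static String tagPerSegment(String text, Function<String, String> tagger) {
        List<String> tagged = new ArrayList<>();
        for (String segment : splitOnLineBreaks(text)) {
            tagged.add(tagger.apply(segment));
        }
        return String.join("\n", tagged);
    }

    public static void main(String[] args) {
        String contentStr = "John speaks with Martin\n\nJeremy talks to him too.";
        // Dummy tagger that only brackets its input, for demonstration.
        Function<String, String> dummyTagger = s -> "[" + s + "]";
        System.out.println(tagPerSegment(contentStr, dummyTagger));
    }
}
```

Because "Martin" and "Jeremy" then reach the tagger in separate calls, they can no longer be merged into one entity. The cost is that blank lines in the original text are not preserved in the output.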
Thanks @Mohamed Selim. That is exactly the answer! – Bahaa