Stanford CoreNLP: missing roots

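Running the example sentence "A minimal software item that can be tested in isolation" through the Stanford CoreNLP online demo gives the following CC-processed collapsed dependencies: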

root (ROOT-0 , item-4) 
det (item-4 , A-1) 
amod (item-4 , minimal-2) 
nn (item-4 , software-3) 
nsubjpass (tested-8 , that-5) 
aux (tested-8 , can-6) 
auxpass (tested-8 , be-7) 
rcmod (item-4 , tested-8) 
prep_in (tested-8 , isolation-10)

From my Java class, I get the same output except for root(...). The code I am running is as follows:

public static void main(String[] args)
{
    // Configure the pipeline; "parse" is the annotator that produces the dependencies.
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // The text to analyse is taken from the first command-line argument.
    Annotation document = new Annotation(args[0]);

    pipeline.annotate(document);

    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);

    for (CoreMap sentence : sentences) {
        SemanticGraph dependencies = sentence.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
        // Prints the edges stored in the graph.
        System.out.println(dependencies.toList());
    }
}

So the question is: why doesn't my Java code output the root? Am I missing something?

2013-04-30 werd

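This is a good question, in that it exposes a wart in the current code. At present, a root node and the edge from it are not stored in the graph.* Instead, they have to be accessed separately as the root (or list of roots) of the graph, which is kept in a separate list. Here are two things that will work: (1) Add this code above the System.out.println: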

IndexedWord root = dependencies.getFirstRoot(); 
System.out.printf("ROOT(root-0, %s-%d)%n", root.word(), root.index());

(2) Use this instead of your current println line:

System.out.println(dependencies.toString("readable"));

Unlike the other toList() and toString() methods, this one does print the root(s).

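* There are historical reasons for this: we used to have no explicit root at all. At this point, though, the behavior is awkward and dysfunctional and should be changed. That will probably happen in a future release.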
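To make the storage model described in this answer concrete, here is a small self-contained sketch. It is a toy stand-in, not CoreNLP's SemanticGraph: the class ToyDependencyGraph and its methods are hypothetical names chosen for illustration. It only shows why a listing built from the stored edges has no root line, and how a printer that also consults the separate root list can add one.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model for illustration only; this is NOT CoreNLP's SemanticGraph. */
public class ToyDependencyGraph {
    // Ordinary governor -> dependent edges, e.g. det(item-4, A-1).
    private final List<String> edges = new ArrayList<>();
    // Root words are kept in their own list, outside the edge list.
    private final List<String> roots = new ArrayList<>();

    public void addEdge(String relation, String governor, String dependent) {
        edges.add(relation + "(" + governor + ", " + dependent + ")");
    }

    public void addRoot(String word) {
        roots.add(word);
    }

    /** Lists only the stored edges, so no root line can appear. */
    public List<String> edgesOnly() {
        return new ArrayList<>(edges);
    }

    /** Synthesises one root(ROOT-0, word) line per root, then appends the edges. */
    public List<String> withRoots() {
        List<String> out = new ArrayList<>();
        for (String root : roots) {
            out.add("root(ROOT-0, " + root + ")");
        }
        out.addAll(edges);
        return out;
    }

    public static void main(String[] args) {
        ToyDependencyGraph g = new ToyDependencyGraph();
        g.addRoot("item-4");
        g.addEdge("det", "item-4", "A-1");
        g.addEdge("amod", "item-4", "minimal-2");
        System.out.println(g.edgesOnly());
        System.out.println(g.withRoots());
    }
}
```

In this toy model the first listing contains only the det and amod edges, while the second starts with the synthesised root line. That is the same gap the two workarounds in the answer close for the real graph.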

2013-05-01 05:17:30

I managed to find another solution for my case: 'GrammaticalStructure gs = gsf.newGrammaticalStructure(tree);' 'Collection tdl = gs.typedDependenciesCCprocessed();' – werd 2013-05-01 22:06:54

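Yes, that works, because ROOT really is in that collection of dependencies. The minor cost is that you are paying to have the dependencies generated a second time from the parse tree. – 2013-05-02 22:43:32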
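The comment above does not show where gsf and tree come from. A minimal sketch of one way to obtain them is given below, assuming an English pipeline whose parse annotator has already run on the sentence. The class name TreeDependencies and the method fromParseTree are hypothetical, and package locations can differ between CoreNLP releases, so treat the imports as assumptions to check against your version.

```java
import java.util.Collection;

import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.PennTreebankLanguagePack;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.trees.TreebankLanguagePack;
import edu.stanford.nlp.trees.TypedDependency;
import edu.stanford.nlp.util.CoreMap;

/** Hypothetical helper; the class and method names are for illustration only. */
public class TreeDependencies {
    /** Regenerates CC-processed dependencies from the sentence's parse tree. */
    public static Collection<TypedDependency> fromParseTree(CoreMap sentence) {
        // The parse tree that the "parse" annotator attached to the sentence.
        Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        // Assumes English text, hence the Penn Treebank language pack.
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(tree);
        // This second pass over the tree is the extra cost mentioned above.
        return gs.typedDependenciesCCprocessed();
    }
}
```

Calling TreeDependencies.fromParseTree(sentence) inside the loop from the question and printing the result should then include the root relation, per the comment above.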
