Finding head words in Python

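I need to extract the head word of a sentence (more specifically, the head word of the highest noun phrase in the sentence). I am annotating my sentences with the Stanford CoreNLP server via py-corenlp. The suite includes a modified version of Michael Collins' head-finding algorithm, but I have not found any way to use it through the server. I would like to avoid reinventing the wheel, so is there a way to do this with existing tools in Python?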

Example:

The number of elementary entities in 1 mole of a substance is known as what?

(ROOT
  (S
    (NP
      (NP (DT The) (NN number))
      (PP (IN of)
        (NP
          (NP (JJ elementary) (NNS entities))
          (PP (IN in)
            (NP
              (NP (CD 1) (NN mole))
              (PP (IN of)
                (NP (DT a) (NN substance))))))))
    (VP (VBZ is)
      (VP (VBN known)
        (PP (IN as)
          (NP (WP what)))))
    (. ?)))

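"The number of elementary entities in 1 mole of a substance" is the highest noun phrase.

"number" is the head word of that phrase, and it is what I want to extract.

Edit: added an example.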

2017-02-10 Yurgen Schembri

Please add an example of what you want ;) – Ika8

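It looks like this might be easier with typed dependencies than with the constituency parse. The sentence will be rooted at its main verb, so start from the root relation and then find that verb's nsubj or nsubjpass dependent. For example: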

root (ROOT-0 , known-13) <- Start with this one 
det (number-2 , The-1) 
nsubjpass (known-13 , number-2) <- Then this one 
case (entities-5 , of-3) 
amod (entities-5 , elementary-4) 
nmod (number-2 , entities-5) 
case (mole-8 , in-6) 
nummod (mole-8 , 1-7) 
nmod (entities-5 , mole-8) 
case (substance-11 , of-9) 
det (substance-11 , a-10) 
nmod (mole-8 , substance-11) 
auxpass (known-13 , is-12) 
case (what-15 , as-14) 
nmod (known-13 , what-15)
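
The two steps marked above can be sketched in Python roughly as follows. This is a minimal sketch rather than a definitive implementation: `find_subject_head` is a hypothetical helper, and it assumes each dependency is a dict with the keys `dep`, `governor`, `governorGloss`, `dependent` and `dependentGloss`, which is the shape I would expect in the dependency lists of the CoreNLP server's JSON output (please check this against what your server actually returns). The example data is typed in by hand from the listing above, so no server is needed to run it.

```python
# Relations that mark the subject of the main predicate.
SUBJECT_RELATIONS = {"nsubj", "nsubjpass"}

# Assumed key names for one dependency entry (verify against your server output).
KEYS = ("dep", "governor", "governorGloss", "dependent", "dependentGloss")


def find_subject_head(dependencies):
    """Return (token_index, word) for the subject of the root predicate, or None."""
    # Step 1: the root relation points from the artificial ROOT node to the main predicate.
    root = next((d for d in dependencies if d["dep"].lower() == "root"), None)
    if root is None:
        return None
    predicate = root["dependent"]

    # Step 2: look for a subject relation governed by that predicate.
    # Splitting on ":" also accepts subtyped labels such as "nsubj:pass".
    for d in dependencies:
        relation = d["dep"].split(":")[0]
        if d["governor"] == predicate and relation in SUBJECT_RELATIONS:
            return d["dependent"], d["dependentGloss"]
    return None


# The dependencies from the listing above, entered by hand.
rows = [
    ("root", 0, "ROOT", 13, "known"),
    ("det", 2, "number", 1, "The"),
    ("nsubjpass", 13, "known", 2, "number"),
    ("case", 5, "entities", 3, "of"),
    ("amod", 5, "entities", 4, "elementary"),
    ("nmod", 2, "number", 5, "entities"),
    ("case", 8, "mole", 6, "in"),
    ("nummod", 8, "mole", 7, "1"),
    ("nmod", 5, "entities", 8, "mole"),
    ("case", 11, "substance", 9, "of"),
    ("det", 11, "substance", 10, "a"),
    ("nmod", 8, "mole", 11, "substance"),
    ("auxpass", 13, "known", 12, "is"),
    ("case", 15, "what", 14, "as"),
    ("nmod", 13, "known", 15, "what"),
]
dependencies = [dict(zip(KEYS, row)) for row in rows]

print(find_subject_head(dependencies))  # (2, 'number')
```

Note that this returns the head of the subject, which matches the "highest noun phrase" in this example; a sentence without an explicit subject would give `None`.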

2017-02-13 16:57:04
