Extracting relevant words with NLP

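Using Core NLP, I can easily extract all adjectives and nouns from a given sentence. What I am struggling with, however, is extracting phrases from a sentence.

For example, I have the following sentences: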

This person is trust worthy.
This person is non judgemental.
This person is well spoken.

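For all of these sentences, I want to use NLP to extract "trust worthy", "non judgemental", "well spoken", and so on. I want to extract all such related words.

How do I do that?

Thanks,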

2017-05-17 Sidhant

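For your specific use case, Open Information Extraction (OpenIE) seems to be a suitable solution. It extracts triples consisting of a subject, a relation, and an object. Your relation always seems to be "be" (the infinitive of "is") and your subject always seems to be "person", so we are only interested in the object.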

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.util.CoreMap;
import java.util.Collection;
import java.util.Properties;

public class OpenIE {

    public static void main(String[] args) {
        // Create the Stanford CoreNLP pipeline
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse,natlog,openie");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // Annotate your sentences
        Annotation doc = new Annotation("This person is trust worthy. This person is non judgemental. This person is well spoken.");
        pipeline.annotate(doc);

        // Loop over sentences in the document
        for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
            // Get the OpenIE triples for the sentence
            Collection<RelationTriple> triples = sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
            // Print the object of each triple, one triple per line
            for (RelationTriple triple : triples) {
                triple.object.forEach(object -> System.out.print(object.get(TextAnnotation.class) + " "));
                System.out.println();
            }
        }
    }
}

The output will be the following:

trust 
worthy 
non judgemental 
judgemental 
well spoken 
spoken

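The OpenIE algorithm may extract multiple triples per sentence. For your use case, a solution might be to take the triple whose object contains the most words.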
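The "most words in the object" rule can be sketched in plain Java as follows. This is only an illustration: `LongestObject`, `wordCount`, and `pick` are hypothetical names rather than CoreNLP API, and the object phrases are hard-coded from the output above instead of being read from a pipeline.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of CoreNLP): keeps the object phrase with the most words.
final class LongestObject {

    // Counts whitespace-separated words in a phrase.
    static int wordCount(String phrase) {
        String trimmed = phrase.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Returns the phrase with the most words; on a tie the first one wins.
    // Returns an empty string when the list is empty.
    static String pick(List<String> objects) {
        String best = "";
        for (String candidate : objects) {
            if (wordCount(candidate) > wordCount(best)) {
                best = candidate.trim();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Object phrases per sentence, copied by hand from the output listed above.
        List<List<String>> perSentence = Arrays.asList(
                Arrays.asList("trust", "worthy"),
                Arrays.asList("non judgemental", "judgemental"),
                Arrays.asList("well spoken", "spoken"));
        for (List<String> objects : perSentence) {
            System.out.println(pick(objects));
        }
    }
}
```

In a real pipeline the strings would presumably come from each `RelationTriple` (for instance by joining the text of the tokens in `triple.object`). Note that the rule cannot help when every candidate has the same length, as with "trust" and "worthy".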

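Another thing to mention is that the object of your first sentence is not extracted "correctly", at least not in the way you want. This is because "trust" is a noun and "worthy" is an adjective. The simplest solution is to write it with a hyphen (trust-worthy). Another possible solution is to check the part-of-speech tags and perform some additional steps when you encounter a noun followed by an adjective.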
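One possible reading of those "additional steps" is to join a noun and a directly following adjective with a hyphen before the phrase is used. The sketch below assumes Penn Treebank style tags (`NN*` for nouns, `JJ*` for adjectives) and hand-written tags for the example sentence; `NounAdjectiveMerger` and `merge` are hypothetical names, and in practice the tags would come from the `pos` annotator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: hyphenates a noun that is directly followed by an adjective.
final class NounAdjectiveMerger {

    // words[i] and tags[i] describe the same token; tags are assumed to be Penn Treebank tags.
    static List<String> merge(String[] words, String[] tags) {
        if (words.length != tags.length) {
            throw new IllegalArgumentException("words and tags must have the same length");
        }
        List<String> merged = new ArrayList<>();
        int i = 0;
        while (i < words.length) {
            boolean nounThenAdjective = i + 1 < words.length
                    && tags[i].startsWith("NN")
                    && tags[i + 1].startsWith("JJ");
            if (nounThenAdjective) {
                // Treat the pair as one token, e.g. "trust" + "worthy" -> "trust-worthy".
                merged.add(words[i] + "-" + words[i + 1]);
                i += 2;
            } else {
                merged.add(words[i]);
                i += 1;
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Tags written by hand for illustration; a tagger may label the tokens differently.
        String[] words = {"This", "person", "is", "trust", "worthy"};
        String[] tags = {"DT", "NN", "VBZ", "NN", "JJ"};
        System.out.println(String.join(" ", merge(words, tags)));
    }
}
```

The merged text could then be run through the pipeline again so that the hyphenated word is handled as a single token.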

2017-06-13 11:29:23

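To check the similarity between similar phrases, you can use word embeddings such as GloVe. Some NLP libraries, such as spaCy, ship with embeddings. https://spacy.io/usage/vectors-similarity

Note: spaCy uses cosine similarity at both the token level and the phrase level, and it also provides a convenient similarity function for larger phrases/sentences.

For example (using spaCy and Python):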

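import spacy
# Assumption: a spaCy model that includes word vectors is installed, e.g. en_core_web_md
nlp = spacy.load("en_core_web_md")
doc1 = nlp("The person is trustworthy.")
doc2 = nlp("The person is non judgemental.")
cosine_similarity = doc1.similarity(doc2)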

cosine_similarity can then be used to show how similar two phrases/words/sentences are. It typically ranges from 0 to 1, where 1 means very similar.
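To make the score less of a black box, here is a minimal Java sketch of cosine similarity itself. It is not spaCy's implementation; `CosineSimilarity` and `cosine` are hypothetical names, and the three-dimensional vectors are invented toy values rather than real embeddings.

```java
// Hypothetical helper: cosine similarity between two vectors of equal length.
final class CosineSimilarity {

    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            // Convention chosen for this sketch: a zero vector is similar to nothing.
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors made up for illustration only.
        double[] phraseA = {0.9, 0.1, 0.3};
        double[] phraseB = {0.8, 0.2, 0.4};
        System.out.println(cosine(phraseA, phraseB));
    }
}
```

Mathematically the value lies between -1 and 1: vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1.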

2018-03-05 21:25:24
