Using NLTK chinking in Python

I have been trying out some of the examples in the Python NLTK Book. For instance, chapter 7 explains chinking with this example:

import nltk

grammar = r"""
    NP:
    {<.*>+}      # Chunk everything
    }<VBD|IN>+{  # Chink sequences of VBD and IN
    """
sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"),
            ("dog", "NN"), ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
cp = nltk.RegexpParser(grammar)
result = cp.parse(sentence)
print(result)

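As far as I understand, this should remove "barked" from the result, but it does not. I am new to Python and NLTK, so what am I missing here? Is there something obvious that needs updating? Thanks.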


2012-11-30 Alps

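Chunking creates chunks, while chinking breaks those chunks apart.

That is how Jacob Perkins puts it in "Python Text Processing with NLTK 2.0 Cookbook" (I recommend the book, since you are new to NLTK).

In other words, {} creates chunks and }{ splits those chunks into smaller ones (that is, it separates them), but nothing is deleted. The chinked tokens "barked" and "at" are still in the result; they just no longer belong to any NP chunk.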
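To make the point concrete, here is a minimal sketch that parses the sentence from the question and then separates the tokens inside NP chunks from the chinked ones. It assumes NLTK 3, where a tree's label is read with `label()` (NLTK 2 used the `node` attribute instead); the variable names `all_words`, `np_chunks` and `chinked` are only illustrative.

```python
import nltk

grammar = r"""
    NP:
    {<.*>+}      # Chunk everything
    }<VBD|IN>+{  # Chink sequences of VBD and IN
    """
sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"),
            ("dog", "NN"), ("barked", "VBD"), ("at", "IN"),
            ("the", "DT"), ("cat", "NN")]

result = nltk.RegexpParser(grammar).parse(sentence)

# Every token is still a leaf of the tree: chinking deletes nothing.
all_words = [word for word, tag in result.leaves()]

# Words grouped by NP chunk (assumes the NLTK 3 label() accessor).
np_chunks = [[word for word, tag in subtree.leaves()]
             for subtree in result.subtrees(lambda t: t.label() == "NP")]

# Chinked tokens sit directly under the root as plain (word, tag) tuples.
chinked = [child[0] for child in result if not isinstance(child, nltk.Tree)]

print(all_words)
print(np_chunks)  # [['the', 'little', 'yellow', 'dog'], ['the', 'cat']]
print(chinked)    # ['barked', 'at']
```

If the goal is to drop "barked" and "at" altogether, keeping only `np_chunks` is probably what you want.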

For your example, look at what this displays:

result.draw()

or, alternatively, run:

from nltk.tree import Tree

Tree('S', [Tree('NP', [('the', 'DT'), ('little', 'JJ'), ('yellow', 'JJ'), ('dog', 'NN')]), ('barked', 'VBD'), ('at', 'IN'), Tree('NP', [('the', 'DT'), ('cat', 'NN')])]).draw()

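(Both code samples above display the same tree. The difference is that the first requires you to run your own code first, while the second needs nothing beforehand.)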
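If no graphical display is available, `draw()` may not work, since it opens a window. A text rendering shows the same structure. The sketch below rebuilds the tree by hand, so it does not depend on any earlier code; the variable names `tree` and `text` are only illustrative.

```python
from nltk.tree import Tree

# The same tree that the chunker produces for the example sentence.
tree = Tree('S', [
    Tree('NP', [('the', 'DT'), ('little', 'JJ'), ('yellow', 'JJ'), ('dog', 'NN')]),
    ('barked', 'VBD'),
    ('at', 'IN'),
    Tree('NP', [('the', 'DT'), ('cat', 'NN')]),
])

# str() gives a bracketed rendering; (word, tag) leaves appear as word/tag.
text = str(tree)
print(text)
```

In the printed tree, barked/VBD and at/IN appear at the top level, between the two NP groups.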


2012-12-21 07:29:13 Max
