2017-07-13

How to get BILUO tags for tokens in spaCy?

Currently, in spaCy, a token's IOB tag can be obtained with the ent_iob_ attribute (docs). For example:

>>> import spacy 
>>> nlp = spacy.load('en') 
>>> doc = nlp(u'My name is George Washington Singer, and I am an Englishman') 
>>> [i.ent_iob_ for i in doc] 
[u'O', u'O', u'O', u'B', u'I', u'I', u'O', u'O', u'O', u'O', u'O', u'B'] 

However, when training a model, spaCy requires BILUO tags (docs). Is there a way to convert the existing IOB tags to BILUO, or to get the BILUO tags directly?

spaCy version 1.8

Answer

To convert IOB to BILUO, spacy.gold has the _iob_to_biluo function.

>>> import spacy 
>>> from spacy.gold import _iob_to_biluo 
>>> nlp = spacy.load('en') 
>>> doc = nlp(u'My name is George Washington Singer, and I am an Englishman') 
>>> iobs = [i.ent_iob_ for i in doc] 
>>> _iob_to_biluo(iobs) 
[u'O', u'O', u'O', u'B-', u'I-', u'L-', u'O', u'O', u'O', u'O', u'O', u'U-'] 

The source can be found here.
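
Since _iob_to_biluo is a private helper and may change between releases, it can help to see what the conversion does. Below is a minimal plain-Python sketch, not spaCy's own implementation. The name iob_to_biluo_sketch is hypothetical, and the function assumes tags are either bare ('B', 'I', 'O') or labelled ('B-PERSON'). Bare tags are emitted with a trailing hyphen to mirror the output shown above.

```python
def split_tag(tag):
    """Split 'B-PERSON' into ('B', 'PERSON'); a bare 'B' gives ('B', '')."""
    prefix, _, label = tag.partition('-')
    return prefix, label


def iob_to_biluo_sketch(tags):
    """Convert a sequence of IOB tags to BILUO (illustrative sketch only)."""
    tags = list(tags)
    out = []
    for i, tag in enumerate(tags):
        prefix, label = split_tag(tag)
        if prefix not in ('B', 'I'):
            # 'O' (or an unset/empty tag) passes through unchanged.
            out.append(tag)
            continue

        # Look at the neighbours, treating the sequence edges as 'O'.
        prev_prefix, prev_label = split_tag(tags[i - 1]) if i > 0 else ('O', '')
        next_prefix, next_label = (
            split_tag(tags[i + 1]) if i + 1 < len(tags) else ('O', '')
        )

        # The entity starts here on an explicit 'B', or on an 'I' that
        # does not continue a preceding entity with the same label.
        starts = (
            prefix == 'B'
            or prev_prefix not in ('B', 'I')
            or prev_label != label
        )
        # The entity continues if the next tag is 'I' with the same label.
        continues = next_prefix == 'I' and next_label == label

        if starts and continues:
            new_prefix = 'B'   # Beginning of a multi-token entity
        elif starts:
            new_prefix = 'U'   # Unit: single-token entity
        elif continues:
            new_prefix = 'I'   # Inside a multi-token entity
        else:
            new_prefix = 'L'   # Last token of a multi-token entity
        out.append(new_prefix + '-' + label)
    return out
```

If you also need the entity labels, one option is to build labelled IOB tags first (for example by joining each token's ent_iob_ and ent_type_ with a hyphen) and pass those in; the sketch carries the label through unchanged.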