I am building a 15k-line training data file named en-ner-person.train for OpenNLP Name Finder training, following the online manual (http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html).
My question is: should my training file contain the entire report, or only the lines that contain a tagged name such as <START:person> John Smith <END>?
So, for example, do I use the full text of this report in my training data:
<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
A nonexecutive director has many similar responsibilities as an executive director.
However, there are no voting rights with this position.
Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .
Or do I include only these two lines in my training file:
<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .
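
To make the two options concrete, here is a rough sketch of how I could produce either variant from the same annotated report. It is plain Python and does not use OpenNLP itself. The helper name `split_training_lines` and the regular expression are my own assumptions for illustration, and it only handles the `person` tag used above.

```python
import re

# Matches one name span in the format shown above,
# e.g. "<START:person> Pierre Vinken <END>".
# Assumption: only the "person" type is used, with spaces around the name.
NAME_SPAN = re.compile(r"<START:person>\s+.+?\s+<END>")


def split_training_lines(lines):
    """Return (all_lines, annotated_only) for the non-empty input lines.

    all_lines      -> option 1: the whole report, one sentence per line
    annotated_only -> option 2: only the lines containing a tagged name
    """
    all_lines = [line.strip() for line in lines if line.strip()]
    annotated_only = [line for line in all_lines if NAME_SPAN.search(line)]
    return all_lines, annotated_only


report = [
    "<START:person> Pierre Vinken <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .",
    "A nonexecutive director has many similar responsibilities as an executive director.",
    "However, there are no voting rights with this position.",
    "Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch publishing group .",
]

full, names_only = split_training_lines(report)
print(len(full), "lines in the full-report variant")
print(len(names_only), "lines in the names-only variant")
```

Either list could then be written out one sentence per line as en-ner-person.train. What I don't know is which of the two the trainer expects.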