How do I train a custom model in OpenNLP?

I want to train my own custom model in OpenNLP. Where should I start?

I am using this sample data to train the model:

<START:meaningless>Took connection and<END> selected the Text in the Letter Template and cleared the Formatting of Text to Normal.

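Basically, I want to pick out meaningless stretches of text in a given input.

I tried the following sample code from the OpenNLP developer documentation, but I get this error: Model is incompatible with name finder!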

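import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.util.Collections;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

Charset charset = Charset.forName("UTF-8");
ObjectStream<String> lineStream =
    new PlainTextByLineStream(new FileInputStream("mynewmodel.train"), charset);
ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);

TokenNameFinderModel model;
try {
    // Train a name finder for the "meaningless" type: 100 iterations, cutoff 5.
    model = NameFinderME.train("en", "meaningless", sampleStream,
        Collections.<String, Object>emptyMap(), 100, 5);
} finally {
    sampleStream.close();
}

// The output file name is an assumption; the original snippet did not declare modelFile.
File modelFile = new File("mynewmodel.bin");
OutputStream modelOut = null;
try {
    modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
    model.serialize(modelOut);
} finally {
    if (modelOut != null) {
        modelOut.close();
    }
}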


2013-10-25 Karan Dubal

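One question: what kind of file is "mynewmodel.train"?

Possible problem: you have not given the trainer clearly tagged text. If I understand the documentation correctly, the training data read through PlainTextByLineStream needs whitespace-separated tokens, and that includes the tags themselves. So use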

<START:meaningless> Took connection and <END>

instead of

<START:meaningless>Took connection and<END>
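If the training file already exists with tags glued to the neighbouring words, the spacing could be fixed in a preprocessing pass before training. The following is only a minimal sketch using plain Java regular expressions, not part of OpenNLP; the class name TagSpacing and the method normalize are made up for illustration, and it assumes tags always look like <START:type>, <START> or <END>.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not an OpenNLP class): puts whitespace around
// <START:type> and <END> tags so that each tag becomes its own token.
public class TagSpacing {
    // Matches an opening tag such as <START:meaningless> (or bare <START>) or a closing <END>.
    private static final Pattern TAG =
        Pattern.compile("<START(?::[^>\\s]+)?>|<END>");

    // Surrounds every tag with spaces, then trims and collapses repeated whitespace.
    public static String normalize(String line) {
        String spaced = TAG.matcher(line).replaceAll(" $0 ");
        return spaced.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String raw = "<START:meaningless>Took connection and<END> selected the Text";
        // prints: <START:meaningless> Took connection and <END> selected the Text
        System.out.println(normalize(raw));
    }
}
```

Each line of the training file would be passed through normalize before being written back out. Lines that are already correctly spaced come out unchanged.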


2013-11-14 17:23:44
