Training an NLP log-linear model for NER with scikit-learn

I would like to know how to use sklearn.linear_model.LogisticRegression to train an NLP log-linear model for named-entity recognition (NER).

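A typical log-linear model defines a conditional probability as follows:

$$p(y \mid x; v) = \frac{\exp\left(v \cdot f(x, y)\right)}{\sum_{y'} \exp\left(v \cdot f(x, y')\right)}$$

with:

x: the current word
y: the class of the word being considered
f: the feature vector function, which maps a word x and a class y to a vector of scalars
v: the feature weight vector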
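
To make the definition concrete, here is a minimal pure-Python sketch of that conditional probability, assuming the usual softmax form: exp(v · f(x, y)) normalized over all classes. The label set, the feature function f, and the weights v are made up for illustration and are not part of the question.

```python
import math

CLASSES = ["PER", "LOC", "O"]  # hypothetical label set


def f(x, y):
    """Hypothetical feature function: maps (word, class) to a list of scalars."""
    return [
        1.0 if x[0].isupper() and y == "PER" else 0.0,
        1.0 if x[0].isupper() and y == "LOC" else 0.0,
        1.0 if x.islower() and y == "O" else 0.0,
    ]


def prob(y, x, v):
    """p(y | x; v) = exp(v . f(x, y)) / sum over y' of exp(v . f(x, y'))."""
    scores = {c: sum(vi * fi for vi, fi in zip(v, f(x, c))) for c in CLASSES}
    m = max(scores.values())  # subtract the max score for numerical stability
    z = sum(math.exp(s - m) for s in scores.values())
    return math.exp(scores[y] - m) / z


v = [2.0, 1.0, 3.0]  # made-up feature weights
dist = {c: prob(c, "Paris", v) for c in CLASSES}
print(dist)
```

Note that every feature here fires only for one specific class, which is exactly the class dependence raised in the question.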

Can sklearn.linear_model.LogisticRegression train such a model?

The issue is that the features depend on the class.

2015-10-20 Franck Dernoncourt

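In scikit-learn 0.16 and later, you can use the multinomial option of sklearn.linear_model.LogisticRegression to train a log-linear model (a.k.a. MaxEnt classifier, multiclass logistic regression). Currently, the multinomial option is supported only by the 'lbfgs' and 'newton-cg' solvers.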
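
On the concern that the features depend on the class, one common way to see the connection (a sketch under stated assumptions, not something the scikit-learn documentation spells out in these terms): if f(x, y) copies a class-independent vector phi(x) into the block reserved for class y and leaves zeros elsewhere, then v · f(x, y) equals w_y · phi(x), where w_y is one row of a per-class weight matrix. That is the one-weight-row-per-class layout that a fitted multiclass model exposes in coef_. The names block_features, phi_x, and W are hypothetical.

```python
import numpy as np

n_classes, n_features = 3, 4


def block_features(phi_x, y):
    """Hypothetical f(x, y): phi(x) placed in the block for class y, zeros elsewhere."""
    out = np.zeros(n_classes * n_features)
    out[y * n_features:(y + 1) * n_features] = phi_x
    return out


rng = np.random.default_rng(0)
W = rng.normal(size=(n_classes, n_features))  # per-class weight rows (random, for illustration)
v = W.ravel()                                 # the same weights as one flat vector
phi_x = rng.normal(size=n_features)           # class-independent features of one input

# Scores computed from class-dependent features f(x, y) ...
scores_joint = np.array([v @ block_features(phi_x, y) for y in range(n_classes)])
# ... match scores computed with one weight row per class.
scores_per_class = W @ phi_x
print(np.allclose(scores_joint, scores_per_class))
```

Feature functions that do not have this block structure (for example, features shared across several classes) are not covered by this equivalence.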

Example with the Iris data set (4 features, 3 classes, 150 samples):

#!/usr/bin/python 
# -*- coding: utf-8 -*- 

from __future__ import print_function 
from __future__ import division 

from sklearn import linear_model, datasets
from sklearn.metrics import confusion_matrix 
from sklearn.metrics import classification_report 

# Import data 
iris = datasets.load_iris() 
X = iris.data # features 
y_true = iris.target # labels 

# Look at the size of the feature matrix and the label vector: 
print('iris.data.shape: {0}'.format(iris.data.shape)) 
print('iris.target.shape: {0}\n'.format(iris.target.shape)) 

# Instantiate a MaxEnt model 
logreg = linear_model.LogisticRegression(C=1e5, multi_class='multinomial', solver='lbfgs') 

# Train the model 
logreg.fit(X, y_true) 
print('logreg.coef_: \n{0}\n'.format(logreg.coef_)) 
print('logreg.intercept_: \n{0}'.format(logreg.intercept_)) 

# Use the model to make predictions 
y_pred = logreg.predict(X) 
print('\ny_pred: \n{0}'.format(y_pred)) 

# Assess the quality of the predictions 
print('\nconfusion_matrix(y_true, y_pred):\n{0}\n'.format(confusion_matrix(y_true, y_pred))) 
print('classification_report(y_true, y_pred): \n{0}'.format(classification_report(y_true, y_pred)))
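
The same estimator can be pointed at token-level features to get closer to the NER setting in the question. The following is a minimal sketch, not a complete NER system: the sentences, labels, and the token_features helper are made up, and each token is classified independently of its neighbours' labels. It omits the multi_class argument on the assumption that a recent scikit-learn release is used, where the lbfgs solver fits a multinomial model for multiclass problems by default; on the 0.16-era versions discussed here you would pass multi_class='multinomial' as in the Iris example.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression


def token_features(tokens, i):
    """Hypothetical feature extractor for the token at position i."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_capitalized": 1.0 if word[0].isupper() else 0.0,
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


# Made-up training sentences with one label per token.
train = [
    (["Alice", "lives", "in", "Paris"], ["PER", "O", "O", "LOC"]),
    (["Bob", "works", "in", "London"], ["PER", "O", "O", "LOC"]),
    (["Carol", "moved", "to", "Berlin"], ["PER", "O", "O", "LOC"]),
]

X_dicts, y = [], []
for tokens, labels in train:
    for i, label in enumerate(labels):
        X_dicts.append(token_features(tokens, i))
        y.append(label)

# One-hot encode string-valued features; numeric features pass through unchanged.
vectorizer = DictVectorizer()
X = vectorizer.fit_transform(X_dicts)

clf = LogisticRegression(solver="lbfgs", C=10.0, max_iter=1000)
clf.fit(X, y)

test_tokens = ["Dave", "lives", "in", "Madrid"]
X_test = vectorizer.transform(
    [token_features(test_tokens, i) for i in range(len(test_tokens))]
)
pred = clf.predict(X_test)
proba = clf.predict_proba(X_test)  # one row per token, one column per class
print(list(zip(test_tokens, pred)))
```

With only three training sentences the predictions are not meaningful; the point is the shape of the pipeline, from feature dictionaries to a sparse matrix to per-class probabilities.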

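The multinomial option of sklearn.linear_model.LogisticRegression was introduced in version 0.16:

Added the multi_class="multinomial" option in linear_model.LogisticRegression to implement a logistic regression solver that minimizes the cross-entropy or multinomial loss instead of the default One-vs-Rest setting. Supports the lbfgs and newton-cg solvers. By Lars Buitinck and Manoj Kumar. Solver option by Simon Wu.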

2015-10-28 16:37:38
