2015-05-21 29 views
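"Unwrapping" a SklearnClassifier object - NLTK Python

I have used the SklearnClassifier() wrapper from the NLTK Python package to train a few sklearn classifiers (LogisticRegression() and RandomForest()) on a binary classification problem where the features come from text. Is there any function that "unwraps" this object, so that one can access things like the parameter estimates (for logistic regression), the variable importance list of the random forest, or anything else from the original sklearn object? The NLTK classifier object can score new instances, so the underlying information must be stored somewhere inside it. Thanks for any ideas.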

Welcome to Stack Overflow! You may want to read [How to Ask](http://stackoverflow.com/help/how-to-ask). Formatting your question properly will help you get the answers you need.

Answer

Your classifier is stored in the wrapper's _clf attribute. I found this in the source at http://www.nltk.org/_modules/nltk/classify/scikitlearn.html

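from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.neural_network import MLPClassifier

classifier = SklearnClassifier(MLPClassifier())
mlp = classifier._clf  # the underlying scikit-learn estimator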

The constructor and its docstring, from the NLTK source:

def __init__(self, estimator, dtype=float, sparse=True): 
    """ 
    :param estimator: scikit-learn classifier object. 

    :param dtype: data type used when building feature array. 
     scikit-learn estimators work exclusively on numeric data. The 
     default value should be fine for almost all situations. 

    :param sparse: Whether to use sparse matrices internally. 
     The estimator must support these; not all scikit-learn classifiers 
     do (see their respective documentation and look for "sparse 
     matrix"). The default value is True, since most NLP problems 
     involve sparse feature sets. Setting this to False may take a 
     great amount of memory. 
    :type sparse: boolean. 
    """ 
    self._clf = estimator 
    self._encoder = LabelEncoder() 
    self._vectorizer = DictVectorizer(dtype=dtype, sparse=sparse)
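
For the two models in the question, here is a minimal sketch of pulling the fitted values out through those three attributes. The toy training data and the unwrap() helper are made up for illustration. _clf, _vectorizer and _encoder are private attributes, so this assumes an NLTK version whose constructor matches the one quoted above.

```python
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Toy labeled featuresets (hypothetical data, for illustration only).
train_set = [
    ({"great": 1, "fun": 1}, "pos"),
    ({"great": 1, "loved": 1}, "pos"),
    ({"boring": 1, "slow": 1}, "neg"),
    ({"boring": 1, "awful": 1}, "neg"),
]


def unwrap(wrapper):
    """Hypothetical helper: return (estimator, feature names, class labels).

    Relies on the private attributes set in SklearnClassifier.__init__.
    """
    estimator = wrapper._clf
    # DictVectorizer stores the column order in feature_names_ after fitting.
    feature_names = list(wrapper._vectorizer.feature_names_)
    # LabelEncoder stores the sorted original labels in classes_.
    labels = list(wrapper._encoder.classes_)
    return estimator, feature_names, labels


# Logistic regression: one coefficient per feature.
logit_wrapper = SklearnClassifier(LogisticRegression()).train(train_set)
logit, feature_names, labels = unwrap(logit_wrapper)
# In the binary case coef_ has shape (1, n_features);
# positive weights push towards labels[1].
coefficients = dict(zip(feature_names, logit.coef_[0]))

# Random forest: one importance score per feature.
forest_wrapper = SklearnClassifier(
    RandomForestClassifier(n_estimators=50, random_state=0)
).train(train_set)
forest, forest_features, _ = unwrap(forest_wrapper)
importances = dict(zip(forest_features, forest.feature_importances_))

print("labels:", labels)
for name in feature_names:
    print(name, coefficients[name], importances[name])
```

The feature names come from the wrapper's DictVectorizer rather than from the estimator, because the estimator only ever sees a numeric matrix; zipping the two is what ties each coefficient or importance back to a text feature.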