Print decision tree and feature_importance when using BaggingClassifier

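Obtaining the decision tree and the important features is easy when using DecisionTreeClassifier in scikit-learn. However, if I wrap it in a bagging estimator such as BaggingClassifier, I am not able to obtain them.

Since we need to fit the model using the BaggingClassifier, none of the results associated with the DecisionTreeClassifier (printing the tree (graph), feature_importances_, ...) can be returned.

Here is my script: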

seed = 7 
n_iterations = 199 
DTC = DecisionTreeClassifier(random_state=seed, 
               max_depth=None, 
               min_impurity_split= 0.2, 
               min_samples_leaf=6, 
               max_features=None, #If None, then max_features=n_features. 
               max_leaf_nodes=20, 
               criterion='gini', 
               splitter='best', 
               ) 

#parametersDTC = {'max_depth':range(3,10), 'max_leaf_nodes':range(10, 30)} 
parameters = {'max_features':range(1,200)} 
dt = RandomizedSearchCV(BaggingClassifier(base_estimator=DTC, 
           #max_samples=1, 
           n_estimators=100, 
           #max_features=1, 
           bootstrap = False, 
           bootstrap_features = True, random_state=seed), 
         parameters, n_iter=n_iterations, n_jobs=14, cv=kfold, 
         error_score='raise', random_state=seed, refit=True) #min_samples_leaf=10 

# Fit the model 

fit_dt= dt.fit(X_train, Y_train) 
print(dir(fit_dt)) 
tree_model = dt.best_estimator_ 

# Print the important features (NOT WORKING) 

features = tree_model.feature_importances_ 
print(features) 

rank = np.argsort(features)[::-1] 
print(rank[:12]) 
print(sorted(list(zip(features)))) 

# Importing the image (NOT WORKING) 
from sklearn.externals.six import StringIO 

tree.export_graphviz(dt.best_estimator_, out_file='tree.dot') # necessary to plot the graph 

dot_data = StringIO() # need to understand but it probably relates to read of strings 
tree.export_graphviz(dt.best_estimator_, out_file=dot_data, filled=True, class_names= target_names, rounded=True, special_characters=True) 
graph = pydotplus.graph_from_dot_data(dot_data.getvalue()) 

img = Image(graph.create_png()) 
print(dir(img)) # with dir we can check what are the possibilities in graph.create_png 

with open("my_tree.png", "wb") as png: 
    png.write(img.data)

I get errors like: 'BaggingClassifier' object has no attribute 'tree_' and 'BaggingClassifier' object has no attribute 'feature_importances'. Does anyone know how I can obtain them? Thanks.

Source

2017-07-25 Mauro Nogueira

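Possible duplicate of [Feature importances - Bagging, scikit-learn](https://stackoverflow.com/questions/44333573/feature-importances-bagging-scikit-learn)

@MikhailKorobov This is not a duplicate of the linked question. The linked question only discusses the feature_importances_ attribute, whereas the OP is also interested in accessing the trees themselves.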

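Based on the documentation, the BaggingClassifier object indeed does not have the attribute 'feature_importances'. You can still compute it yourself, as described in the answer to this question: Feature importances - Bagging, scikit-learn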
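A minimal sketch of computing the importances yourself, assuming the base estimators are decision trees (the default for BaggingClassifier). It averages each tree's feature_importances_ over the ensemble. Because the question uses bootstrap_features=True, each tree is trained on its own draw of columns, so the sketch maps every tree's values back to the original columns through estimators_features_. The iris data and the variable names importances and ranking are illustrative assumptions, not part of the original script.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier

iris = load_iris()

# Default base estimator is a decision tree; bootstrap_features=True
# mirrors the setting used in the question.
clf = BaggingClassifier(n_estimators=10, bootstrap_features=True,
                        random_state=0)
clf.fit(iris.data, iris.target)

n_features = iris.data.shape[1]
importances = np.zeros(n_features)

for tree, feat_idx in zip(clf.estimators_, clf.estimators_features_):
    # feat_idx lists the original columns this tree was trained on.
    # It may contain repeats, so accumulate with np.add.at rather
    # than with fancy-index assignment.
    np.add.at(importances, feat_idx, tree.feature_importances_)

# Average over the trees in the ensemble.
importances /= len(clf.estimators_)

# Column indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]
print(importances)
print(ranking)
```

This is a plain average of per-tree impurity-based importances; other aggregation schemes are possible, and the linked answer may differ in detail.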

You can access the trees produced while fitting a BaggingClassifier through its estimators_ attribute, as in the following example:

from sklearn import datasets
from sklearn.ensemble import BaggingClassifier

# Fit a small bagging ensemble; the default base estimator is a decision tree
iris = datasets.load_iris()
clf = BaggingClassifier(n_estimators=3)
clf.fit(iris.data, iris.target)
clf.estimators_  # list of the fitted trees

clf.estimators_ is a list of the 3 fitted decision trees:

[DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None, 
      max_features=None, max_leaf_nodes=None, 
      min_impurity_split=1e-07, min_samples_leaf=1, 
      min_samples_split=2, min_weight_fraction_leaf=0.0, 
      presort=False, random_state=1422640898, splitter='best'), 
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None, 
      max_features=None, max_leaf_nodes=None, 
      min_impurity_split=1e-07, min_samples_leaf=1, 
      min_samples_split=2, min_weight_fraction_leaf=0.0, 
      presort=False, random_state=1968165419, splitter='best'), 
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None, 
      max_features=None, max_leaf_nodes=None, 
      min_impurity_split=1e-07, min_samples_leaf=1, 
      min_samples_split=2, min_weight_fraction_leaf=0.0, 
      presort=False, random_state=2103976874, splitter='best')]

So you can iterate over the list and access each of the trees.
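As a hedged sketch of that iteration, the loop below exports every fitted tree to Graphviz DOT text with sklearn.tree.export_graphviz. Note that the function receives one tree at a time, never the BaggingClassifier or the whole list. The iris data and the dot_sources list are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import export_graphviz

iris = load_iris()
clf = BaggingClassifier(n_estimators=3, random_state=0)
clf.fit(iris.data, iris.target)

class_names = [str(name) for name in iris.target_names]

dot_sources = []
for tree in clf.estimators_:
    # With out_file=None, export_graphviz returns the DOT source
    # as a string instead of writing it to a file.
    dot = export_graphviz(tree, out_file=None, class_names=class_names,
                          filled=True, rounded=True,
                          special_characters=True)
    dot_sources.append(dot)

print(len(dot_sources))
```

Each string in dot_sources could then be passed to pydotplus.graph_from_dot_data, as in the question's script, to render one PNG per tree. In the question's setup the fitted bagging model is dt.best_estimator_, so the loop would run over dt.best_estimator_.estimators_.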

Source

2017-07-25 17:30:02

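Thanks @Miriam Farber. Now that there is a list of decision trees, if I want to print the trees (import the image) with the script above, should I run each decision tree individually using the parameters returned in the list?

@MauroNogueira Yes. You don't need to copy the parameters; just write "for t in clf.estimators_:" and then, inside the loop, run the code you previously used for a single tree. Each "t" in the loop is a fitted decision tree.

I have tried this, but I get an error from export_graphviz such as 'list' object has no attribute 'tree_': for t in dt.estimators_: export_graphviz(dt.estimators_, out_file='tree.dot') dot_data = StringIO() export_graphviz(dt.estimators_, out_file=dot_data, filled=True, class_names=target_names, rounded=True, special_characters=True) graph = pydotplus.graph_from_dot_data(dot_data.getvalue()) img = Image(graph.create_png()) print(dir(img)) with open("HDAC8_tree.png", "wb") as png: png.write(img.data)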
