Answer

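These attributes are all arrays that cannot be reassigned (children_left, children_right and feature hold ints, threshold holds doubles). You can still modify the elements of these arrays in place. Note that this does not reduce the amount of data stored in the tree.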

children_left : array of int, shape [node_count] 
    children_left[i] holds the node id of the left child of node i. 
    For leaves, children_left[i] == TREE_LEAF. Otherwise, 
    children_left[i] > i. This child handles the case where 
    X[:, feature[i]] <= threshold[i]. 

children_right : array of int, shape [node_count] 
    children_right[i] holds the node id of the right child of node i. 
    For leaves, children_right[i] == TREE_LEAF. Otherwise, 
    children_right[i] > i. This child handles the case where 
    X[:, feature[i]] > threshold[i]. 

feature : array of int, shape [node_count] 
    feature[i] holds the feature to split on, for the internal node i. 

threshold : array of double, shape [node_count] 
    threshold[i] holds the threshold for the internal node i. 
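To make the layout described above concrete, here is a minimal sketch that walks these four arrays by hand. The 5-node tree is a hypothetical toy example written as plain Python lists, not output from a fitted model, and `find_leaf` is an illustrative helper name, not part of scikit-learn. The `-2` entries are just placeholders for leaves, where feature and threshold are not used.

```python
TREE_LEAF = -1  # value stored in children_left/children_right for leaves

# Hypothetical toy tree (assumption, for illustration only):
#   node 0: split on feature 0 at 0.5 -> left child 1, right child 2
#   node 1: leaf
#   node 2: split on feature 1 at 2.0 -> left child 3, right child 4
#   nodes 3 and 4: leaves
children_left = [1, TREE_LEAF, 3, TREE_LEAF, TREE_LEAF]
children_right = [2, TREE_LEAF, 4, TREE_LEAF, TREE_LEAF]
feature = [0, -2, 1, -2, -2]
threshold = [0.5, -2.0, 2.0, -2.0, -2.0]


def find_leaf(x, node=0):
    """Follow the split rules from `node` down to a leaf and return its id."""
    while children_left[node] != TREE_LEAF:
        if x[feature[node]] <= threshold[node]:
            node = children_left[node]   # case x[feature] <= threshold
        else:
            node = children_right[node]  # case x[feature] > threshold
    return node


print(find_leaf([0.2, 9.9]))  # goes left at the root
print(find_leaf([0.9, 3.0]))  # goes right at the root, then right again
```

Because a node is recognised as a leaf only through `children_left[i] == TREE_LEAF`, overwriting that entry is enough to stop the traversal at node i.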

To prune a DecisionTree by the number of observations in a node, I use this function. You need to know that the TREE_LEAF constant is equal to -1.

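def prune(decisiontree, min_samples_leaf=1):
    """Turn every node with at most min_samples_leaf samples into a leaf, in place."""
    TREE_LEAF = -1  # value stored in children_left/children_right for leaves
    if decisiontree.min_samples_leaf >= min_samples_leaf:
        raise Exception('Tree already more pruned')
    decisiontree.min_samples_leaf = min_samples_leaf
    tree = decisiontree.tree_
    for i in range(tree.node_count):
        n_samples = tree.n_node_samples[i]
        if n_samples <= min_samples_leaf:
            # Cut both links; the descendants stay in the arrays but become unreachable.
            tree.children_left[i] = TREE_LEAF
            tree.children_right[i] = TREE_LEAF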
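The effect of that loop can be checked on plain lists without scikit-learn. The sketch below is only an illustration under assumptions: the 5-node arrays and sample counts are made up, and `reachable_nodes` is a hypothetical helper, not a library function. It applies the same rule as `prune` and shows that the arrays keep their length while fewer nodes remain reachable from the root.

```python
TREE_LEAF = -1


def reachable_nodes(children_left, children_right, root=0):
    """Return the ids of all nodes reachable from `root`, depth-first, left before right."""
    seen, stack = [], [root]
    while stack:
        node = stack.pop()
        seen.append(node)
        if children_left[node] != TREE_LEAF:
            # Push right first so that the left child is visited first.
            stack.append(children_right[node])
            stack.append(children_left[node])
    return seen


# Hypothetical toy tree (assumption): node 0 has children 1 and 2, node 2 has children 3 and 4.
children_left = [1, TREE_LEAF, 3, TREE_LEAF, TREE_LEAF]
children_right = [2, TREE_LEAF, 4, TREE_LEAF, TREE_LEAF]
n_node_samples = [10, 4, 6, 3, 3]

before = reachable_nodes(children_left, children_right)

# Same rule as prune(), here with min_samples_leaf = 6.
min_samples_leaf = 6
for i in range(len(children_left)):
    if n_node_samples[i] <= min_samples_leaf:
        children_left[i] = TREE_LEAF
        children_right[i] = TREE_LEAF

after = reachable_nodes(children_left, children_right)

print(before, after, len(children_left))
```

Nodes 3 and 4 are still stored after pruning, they are simply never visited any more, which is why this approach does not make the tree object smaller.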

Here is an example that produces graphviz output before and after pruning:

from sklearn.tree import DecisionTreeRegressor as DTR
from sklearn.datasets import load_diabetes
from sklearn.tree import export_graphviz as export

bunch = load_diabetes()
data = bunch.data
target = bunch.target

dtr = DTR(max_depth=4)
dtr.fit(data, target)

# Pass the fitted estimator: recent scikit-learn releases expect it rather than dtr.tree_.
export(decision_tree=dtr, out_file='before.dot')
prune(dtr, min_samples_leaf=100)
export(decision_tree=dtr, out_file='after.dot')