2017-02-08 50 views
2

Extracting the matrix from a C5.0 model

After running the C5.0 algorithm on my data:

a <- C5.0(FACTOR~.,data = i_data,trials=10,costs = matrix(c(0,1,4,0), nrow = 2)) 

When I look at the summary of the model using

summary(a) 

I get something like this:

. 
. 
. 
. 

SubTree [S1] 

Col_L > 89: N (195.6/6.5)
Col_L <= 89:
:...Col_Q > 4657: Y (66.6/34)
    Col_Q <= 4657:
    :...Col_F > 15: Y (117.6/75)
        Col_F <= 15:
        :...Col_C <= 5.6926: N (2040.5/266.7)
            Col_C > 5.6926: Y (148.7/104.4)

SubTree [S2] 

Col_E > 14: N (2523.3/176.8)
Col_E <= 14:
:...Col_G > 5: N (83.4/1.4)
    Col_G <= 5:
    :...Col_O > 6880: Y (41.8/22)
        Col_O <= 6880:
        :...Col_G <= 3: N (1939.9/230.1)
            Col_G > 3: Y (92.7/64.5)


Evaluation on training data (53392 cases): 

Trial       Decision Tree
-----  -----------------------
       Size        Errors  Cost

   0     87  16173(30.3%)  0.35
   1     25  14071(26.4%)  0.43
   2     48  15295(28.6%)  0.74
   3     50  14672(27.5%)  0.48
   4     43  16765(31.4%)  0.55
   5     52  16346(30.6%)  0.98
   6     58  18277(34.2%)  0.52
   7     65  13940(26.1%)  0.64
   8     63  14020(26.3%)  0.42
   9     57  13517(25.3%)  0.45
boost        13284(24.9%)  0.39   <<


   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y


Attribute usage: 

100.00% Col_A 
100.00% Col_B 
100.00% Col_C 
100.00% Col_D 
100.00% Col_E 
99.79% Col_F 
99.63% Col_G 
76.66% Col_H 
76.55% Col_I 
75.64% Col_J 
70.22% Col_K 
65.15% Col_L 
59.01% Col_M 
58.94% Col_N 
42.54% Col_O 
33.01% Col_P 
21.73% Col_Q 
16.58% Col_R 
12.69% Col_S 
8.43% Col_T

Is there any way to extract this

   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y

from the summary above, so that I can load it into another instance of R?

Answer

1

C5.0 stores this as text, but you can extract it like this:

library(C50)

# example from ?C5.0
data(churn)
treeModel <- C5.0(x = churnTrain[, -20], y = churnTrain$churn)
treeModel
# save the summary in b;
# b$output holds the printed text as a single string
b <- summary(treeModel)

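# position of the first '(a)', i.e. the start of the matrix header
pos1 <- gregexpr(pattern = '\\(a\\)', b$output)[[1]][1]
# position of 'class no' (the last row label); in your case this should be 'class Y'
last_label <- 'class no'
pos2 <- gregexpr(pattern = last_label, b$output)[[1]][1]
# substring from the header to the end of the last row label
# (pos2 is where the label starts, so add its length to keep the whole label)
text <- substr(b$output, pos1, pos2 + nchar(last_label) - 1)

# print
cat(text)

Output:

(a) (b) <-classified as
---- ----
365 118 (a): class yes
18 2832 (b): class no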
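
To load the matrix in another R session, the extracted text still has to be turned into numbers. Below is a minimal sketch of one way to do that, using the matrix text from the question as input. The helper name `parse_confusion` and the file name `confusion_matrix.rds` are made up for illustration, and the parsing assumes every data row has the form `<counts> (x): class <label>` with no blank cells (C5.0 may leave zero cells empty, which this sketch does not handle).

```r
# Hypothetical helper: convert the printed confusion matrix into a numeric matrix.
# Assumes each data row looks like "<counts...> (x): class <label>".
parse_confusion <- function(txt) {
  lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  # keep only the data rows; the header has "(a)" without the ": class" part
  rows <- grep("\\([a-z]\\): class ", lines, value = TRUE)
  # class labels are whatever follows "(x): class "
  labels <- trimws(sub(".*\\([a-z]\\): class[[:space:]]+", "", rows))
  # counts are whatever precedes "(x): class"
  count_txt <- trimws(sub("\\([a-z]\\): class.*", "", rows))
  counts <- lapply(strsplit(count_txt, "[[:space:]]+"), as.numeric)
  m <- do.call(rbind, counts)
  dimnames(m) <- list(actual = labels, predicted = labels)
  m
}

# Matrix text copied from the summary in the question
txt <- "   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y"

cm <- parse_confusion(txt)

# Sanity check: the off-diagonal cells are the misclassified cases,
# which should equal the "boost" error count in the summary (13284).
stopifnot(sum(cm) - sum(diag(cm)) == 13284)

# Save in one session ...
saveRDS(cm, "confusion_matrix.rds")   # file name is an assumption
# ... and reload in another
cm2 <- readRDS("confusion_matrix.rds")
```

With a real model, `txt` would be the substring extracted from `summary(model)$output` instead of a pasted string.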

It worked, thanks a lot, bud – Raymond


You're very welcome! Happy to help :) – LyzandeR