2017-08-02 55 views

Answers

0

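There is no way to do this directly: CatBoost does not currently support model serialization.

However, CatBoost can already convert a model to CoreML, and there is a CoreML tool that can serialize the model to JSON-like text. Here is a minimal example: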

import catboost
import coremltools
from sklearn import datasets

iris = datasets.load_iris()

# the shortest possible model specification: one tree with a single split
cls = catboost.CatBoostClassifier(loss_function='MultiClass', iterations=1, depth=1)
cls.fit(iris.data, iris.target)

# save model to CoreML format
cls.save_model(
    "iris.mlmodel",
    format="coreml",
    export_parameters={
        'prediction_type': 'probability'
    }
)

# there is a CoreML tool for model serialization;
# print() makes the spec visible when this runs as a script, not only in a REPL
model = coremltools.models.model.MLModel("iris.mlmodel")
print(model.get_spec())

You may need to read the coremltools documentation to fully understand what this code prints, but the output can be read like this: "There is an ensemble of a single tree with 2 leaves. In leaf 0, class 0 dominates; in leaf 1, classes 1 and 2 do. Go to leaf 1 if feature 3 is larger than 0.8, otherwise go to leaf 0."

specificationVersion: 1
description {
  input {
    name: "feature_3"
    type {
      doubleType {
      }
    }
  }
  output {
    name: "prediction"
    type {
      multiArrayType {
        shape: 3
        dataType: DOUBLE
      }
    }
  }
  predictedFeatureName: "prediction"
  predictedProbabilitiesName: "prediction"
  metadata {
    shortDescription: "Catboost model"
    versionString: "1.0.0"
    author: "Mr. Catboost Dumper"
  }
}
treeEnsembleRegressor {
  treeEnsemble {
    nodes {
      nodeBehavior: LeafNode
      evaluationInfo {
        evaluationValue: 0.05084745649058943
      }
      evaluationInfo {
        evaluationIndex: 1
        evaluationValue: -0.025423728245294732
      }
      evaluationInfo {
        evaluationIndex: 2
        evaluationValue: -0.025423728245294732
      }
    }
    nodes {
      nodeId: 1
      nodeBehavior: LeafNode
      evaluationInfo {
        evaluationValue: -0.02752293516463098
      }
      evaluationInfo {
        evaluationIndex: 1
        evaluationValue: 0.01376146758231549
      }
      evaluationInfo {
        evaluationIndex: 2
        evaluationValue: 0.013761467582315471
      }
    }
    nodes {
      nodeId: 2
      nodeBehavior: BranchOnValueGreaterThan
      branchFeatureIndex: 3
      branchFeatureValue: 0.800000011920929
      trueChildNodeId: 1
    }
    numPredictionDimensions: 3
    basePredictionValue: 0.0
    basePredictionValue: 0.0
    basePredictionValue: 0.0
  }
  postEvaluationTransform: Classification_SoftMax
}
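
To make that reading of the dump concrete, here is a small sketch that evaluates the same tree by hand, without CatBoost or coremltools. The leaf values, split feature, and threshold are copied from the dump above. Two things are my assumptions rather than anything stated in the answer: that the omitted falseChildNodeId means node 0, and that the class scores are the base prediction plus the leaf values, passed through a softmax. The names `predict_proba` and `softmax` are just illustrative helpers.

```python
import math

# Leaf values copied from the dump above: one score per class (indices 0, 1, 2).
LEAVES = {
    0: [0.05084745649058943, -0.025423728245294732, -0.025423728245294732],
    1: [-0.02752293516463098, 0.01376146758231549, 0.013761467582315471],
}
SPLIT_FEATURE = 3                  # branchFeatureIndex
SPLIT_VALUE = 0.800000011920929    # branchFeatureValue
BASE_PREDICTION = [0.0, 0.0, 0.0]  # basePredictionValue, one per class


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(features):
    """Evaluate the single-split tree for one row of iris features."""
    # BranchOnValueGreaterThan: go to trueChildNodeId (1) when the test holds.
    # Assumption: the omitted falseChildNodeId means node 0.
    leaf_id = 1 if features[SPLIT_FEATURE] > SPLIT_VALUE else 0
    scores = [b + v for b, v in zip(BASE_PREDICTION, LEAVES[leaf_id])]
    return softmax(scores)


if __name__ == "__main__":
    print(predict_proba([5.1, 3.5, 1.4, 0.2]))  # petal width 0.2: lands in leaf 0
    print(predict_proba([6.3, 3.3, 6.0, 2.5]))  # petal width 2.5: lands in leaf 1
```

With only one iteration the leaf values are tiny, so the resulting probabilities stay close to 1/3 each; the point is only to show which leaf a row lands in and which class that leaf favours.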

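This approach has one drawback: CoreML does not support the way CatBoost uses categorical features. So if you want to serialize a model that has categorical features, you need to one-hot encode them before training.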
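
As a rough illustration of what one-hot encoding before training means, here is a dependency-free sketch. The helper `one_hot_encode` and the toy rows are hypothetical and not part of CatBoost; in practice something like `pandas.get_dummies` or scikit-learn's `OneHotEncoder` would probably be more convenient.

```python
def one_hot_encode(rows, column):
    """Replace the categorical value at index `column` with 0/1 indicator columns.

    Hypothetical helper for illustration; returns (encoded_rows, categories).
    """
    # Sort the distinct values so the indicator columns have a stable order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        indicators = [1.0 if row[column] == c else 0.0 for c in categories]
        encoded.append(list(row[:column]) + indicators + list(row[column + 1:]))
    return encoded, categories


# Toy data: the middle column is categorical.
rows = [
    [5.1, "red", 0.2],
    [6.3, "blue", 2.5],
    [5.8, "red", 1.2],
]
encoded, categories = one_hot_encode(rows, column=1)
print(categories)  # order of the new indicator columns
print(encoded[0])  # first row with "red" expanded into indicators
```

The encoded rows contain only numbers, so they can be passed to `fit` without declaring any categorical features. The same category order has to be reused when encoding rows for prediction.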

0

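If you switch to the command-line program, you can use the --print-trees option. It only shows the trees of the model while it is being trained, so you cannot get the trees for an existing model this way.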