2017-08-28

Using Spark's RDD-based API (mllib package) 1.5.2, I have trained a machine learning model, say "Mymodel123". How do I convert an ML model to an MLlib model?

org.apache.spark.mllib.tree.model.RandomForestModel Mymodel123 = ....;
// save() takes the SparkContext object itself, followed by the target path
Mymodel123.save(sparkContext, "path");

Now I am using Spark's Dataset-based API (ml package) 2.2.0. Is there a way to load the model (Mymodel123) using the Dataset-based API?

org.apache.spark.ml.classification.RandomForestClassificationModel newModel = org.apache.spark.ml.classification.RandomForestClassificationModel.load("sparkcontext","path"); 

Answer

There is no public API that can do this, but the new RandomForest models wrap the old mllib API and provide private methods that can be used to convert mllib models to ml models:

/** Convert a model from the old API */
private[ml] def fromOld(
    oldModel: OldRandomForestModel,
    parent: RandomForestClassifier,
    categoricalFeatures: Map[Int, Int],
    numClasses: Int,
    numFeatures: Int = -1): RandomForestClassificationModel = {
  require(oldModel.algo == OldAlgo.Classification, "Cannot convert RandomForestModel" +
    s" with algo=${oldModel.algo} (old API) to RandomForestClassificationModel (new API).")
  val newTrees = oldModel.trees.map { tree =>
    // parent for each tree is null since there is no good way to set this.
    DecisionTreeClassificationModel.fromOld(tree, null, categoricalFeatures)
  }
  val uid = if (parent != null) parent.uid else Identifiable.randomUID("rfc")
  new RandomForestClassificationModel(uid, newTrees, numFeatures, numClasses)
}

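So it is not impossible. In Java you can call it directly (Scala's package-private modifier, private[ml], is not enforced for Java callers). In Scala you have to put the adapter code in the org.apache.spark.ml package.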
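
For the Scala route, a minimal sketch of such an adapter might look like the following. The object name `MllibRandomForestAdapter` and its `load` helper are hypothetical names chosen for illustration; only `RandomForestModel.load` and the private `fromOld` shown above come from Spark. The sketch assumes the model saved under 1.5.2 can still be read by the 2.2.0 mllib loader, and that you know the number of classes and features used at training time. Because it relies on a private method, it may break in other Spark versions.

```scala
// Declared inside the org.apache.spark.ml package tree so that
// private[ml] members such as fromOld are visible.
package org.apache.spark.ml.classification

import org.apache.spark.SparkContext
import org.apache.spark.mllib.tree.model.{RandomForestModel => OldRandomForestModel}

// Hypothetical adapter; not part of Spark.
object MllibRandomForestAdapter {

  /** Load a model saved with the mllib API and convert it to an ml model. */
  def load(
      sc: SparkContext,
      path: String,
      numClasses: Int,
      numFeatures: Int,
      categoricalFeatures: Map[Int, Int] = Map.empty): RandomForestClassificationModel = {
    // Read the model with the old RDD-based loader.
    val oldModel = OldRandomForestModel.load(sc, path)
    // parent = null: there is no estimator to attach, so fromOld generates a random uid.
    RandomForestClassificationModel.fromOld(
      oldModel, null, categoricalFeatures, numClasses, numFeatures)
  }
}
```

It could then be called from application code as, for example, `MllibRandomForestAdapter.load(spark.sparkContext, "path", numClasses = 2, numFeatures = 10)`, where the class and feature counts are placeholder values.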