LogisticRegressionWithLBFGS throws an error about not supporting multinomial classification
I am trying to implement logistic regression with pySpark. Here is my code:
from pyspark import SparkContext
from pyspark.mllib.classification import LogisticRegressionWithLBFGS
from time import time
from pyspark.mllib.regression import LabeledPoint
from numpy import array
RES_DIR="/home/shaahmed115/Pet_Projects/DA/TwitterStream_US_Elections/Features/"
sc= SparkContext('local','pyspark')
data_file = RES_DIR + "training.txt"
raw_data = sc.textFile(data_file)
print("Train data size is {}".format(raw_data.count()))
test_data_file = RES_DIR + "testing.txt"
test_raw_data = sc.textFile(test_data_file)
print("Test data size is {}".format(test_raw_data.count()))
def parse_interaction(line):
    line_split = line.split(",")
    # Column 0 is the class label; only the remaining columns are features
    return LabeledPoint(float(line_split[0]), array([float(x) for x in line_split[1:]]))
training_data = raw_data.map(parse_interaction)
logit_model = LogisticRegressionWithLBFGS.train(training_data,iterations=10, numClasses=3)
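This throws an error:
Currently, LogisticRegression with ElasticNet in ML package only supports binary classification. Found 3 in the input dataset
Below is a sample of my dataset:
2,1.0,1.0,1.0
0,1.0,1.0,1.0
1,0.0,0.0,0.0
The first element is the class and the rest is the feature vector. As you can see, there are three classes. Is there a workaround that makes multinomial classification work with this data?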
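To make the data layout concrete, here is a minimal sketch of how the sample rows split into a label and a feature vector. It assumes column 0 is the class and every other column is a feature, and it uses plain Python tuples in place of LabeledPoint so it runs without Spark. The names parse_line and sample_rows are hypothetical and not part of my actual job.

```python
# Hypothetical helper: split one CSV row into (label, features).
# Assumes column 0 is the class label and the remaining columns are features.
def parse_line(line):
    parts = [float(x) for x in line.strip().split(",")]
    return parts[0], parts[1:]

# The three sample rows from the dataset excerpt above.
sample_rows = ["2,1.0,1.0,1.0", "0,1.0,1.0,1.0", "1,0.0,0.0,0.0"]
parsed = [parse_line(row) for row in sample_rows]

# Distinct labels found in the sample; their count is what numClasses should match.
labels = sorted({label for label, _ in parsed})
print(labels)
```

In the Spark job, each tuple would be wrapped as LabeledPoint(label, features) before training.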