2015-10-07

AttributeError: 'MatrixFactorizationModel' object has no attribute 'save'

I am trying to run the example from the Apache Spark MLlib website. Here is my code:

import sys 
import os 

os.environ['SPARK_HOME'] = "/usr/local/Cellar/apache-spark/1.2.1" 
sys.path.append("/usr/local/Cellar/apache-spark/1.2.1/libexec/python") 
sys.path.append("/usr/local/Cellar/apache-spark/1.2.1/libexec/python/build") 

try: 
    from pyspark import SparkContext, SparkConf 
    from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating 
    print ("Apache-Spark v1.2.1 >>> All modules found and imported successfully.") 

except ImportError as e: 
    print ("Couldn't import Spark Modules", e) 
    sys.exit(1) 

# SETTING CONFIGURATION PARAMETERS 
config = (SparkConf() 
     .setMaster("local") 
     .setAppName("Music Recommender") 
     .set("spark.executor.memory", "16G") 
     .set("spark.driver.memory", "16G") 
     .set("spark.executor.cores", "8")) 
sc = SparkContext(conf=config) 

# Load and parse the data 
data = sc.textFile("data/1aa") 
ratings = data.map(lambda l: l.split('\t')).map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2]))) 

# Build the recommendation model using Alternating Least Squares 
rank = 10 
numIterations = 10 
model = ALS.train(ratings, rank, numIterations) 

# Evaluate the model on training data 
testdata = ratings.map(lambda p: (p[0], p[1])) 
predictions = model.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2])) 
ratesAndPreds = ratings.map(lambda r: ((r[0], r[1]), r[2])).join(predictions) 
MSE = ratesAndPreds.map(lambda r: (r[1][0] - r[1][1])**2).mean() 
print("Mean Squared Error = " + str(MSE)) 

# Save and load model (load takes the same path that was passed to save)
model.save(sc, "/Users/kunal/Developer/MusicRecommender")
sameModel = MatrixFactorizationModel.load(sc, "/Users/kunal/Developer/MusicRecommender")

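The code runs fine up to the point where the MSE is printed. The last step is to save the model to a directory, and there I get the error 'MatrixFactorizationModel' object has no attribute 'save' (I have pasted the last few lines of the log):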

15/10/06 21:00:16 INFO DAGScheduler: Stage 200 (mean at /Users/kunal/Developer/MusicRecommender/collabfiltering.py:41) finished in 12.875 s 
15/10/06 21:00:16 INFO DAGScheduler: Job 8 finished: mean at /Users/kunal/Developer/MusicRecommender/collabfiltering.py:41, took 53.290203 s 
Mean Squared Error = 405.148403002 
Traceback (most recent call last): 
    File "/Users/kunal/Developer/MusicRecommender/collabfiltering.py", line 47, in <module> 
    model.save(sc, path) 
AttributeError: 'MatrixFactorizationModel' object has no attribute 'save' 

Process finished with exit code 1 

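I reinstalled and made sure I have the latest version of Spark, but that did not help. I am running this on a 10 MB file, which is only a small part of a larger file.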

Operating system: OS X 10.11.1 Beta (15B22c)

Answers


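This happens because you are using Spark 1.2.1, and the MatrixFactorizationModel.save method was only introduced in Spark 1.3.0. In addition, the documentation you are following covers the current version (1.5.1), not the one you have installed.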
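As a rough illustration of the version mismatch, the check below compares a Spark version string against the release that added save. This is only a sketch: the helper names `parse_version` and `supports_model_save` are hypothetical, and the 1.3.0 threshold is taken from the statement above rather than verified independently.

```python
import re


def parse_version(version):
    """Return the leading numeric parts of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("Unrecognized version string: %r" % (version,))
    # A missing patch component (e.g. "1.3") is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def supports_model_save(spark_version, minimum="1.3.0"):
    """True if spark_version is at least `minimum` (assumed here to be 1.3.0)."""
    return parse_version(spark_version) >= parse_version(minimum)


print(supports_model_save("1.2.1"))  # False
print(supports_model_save("1.5.1"))  # True
```

With a running SparkContext, the version string could presumably come from `sc.version`. A more direct guard that does not depend on version numbers at all is `hasattr(model, "save")` right before the call.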

Spark documentation URLs follow this pattern:

http://spark.apache.org/docs/SPARK_VERSION/some_topic.html 

So in your case you should use:

http://spark.apache.org/docs/1.2.1/mllib-collaborative-filtering.html 
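To avoid picking up the wrong documentation again, the URL can be built from the installed version. The snippet below is a minimal sketch of that idea. The function name `spark_docs_url` is hypothetical, and it simply fills in the URL pattern shown above.

```python
def spark_docs_url(spark_version, topic):
    """Build a documentation URL from the pattern docs/SPARK_VERSION/some_topic.html."""
    return "http://spark.apache.org/docs/{0}/{1}".format(spark_version, topic)


# Version and topic page taken from this question.
print(spark_docs_url("1.2.1", "mllib-collaborative-filtering.html"))
```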

Yep, that did it! I overlooked this basic mistake.