2014-07-10

Apache Spark (GraphX) may not be utilizing all cores and memory

I am trying to run GraphX on the LiveJournal social network data from https://snap.stanford.edu/data/soc-LiveJournal1.html.

I have a cluster of 10 compute nodes. Each compute node has 64 GB of RAM and 32 cores.

When I run the PageRank algorithm with 9 worker nodes, it is slower than running it with 1 worker node. I suspect I am not utilizing all of the memory and/or cores because of some configuration problem.

I have gone through Spark's configuration, tuning, and programming guides.

I run it from spark-shell, launched via

./spark-shell --executor-memory 50g 

I have the startup scripts running on the workers and on the master. When I start spark-shell I get the following log:

14/07/09 17:26:10 INFO Slf4jLogger: Slf4jLogger started 
14/07/09 17:26:10 INFO Remoting: Starting remoting 
14/07/09 17:26:10 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:60035] 
14/07/09 17:26:10 INFO Remoting: Remoting now listens on addresses: [akka.tcp://[email protected]:60035] 
14/07/09 17:26:10 INFO SparkEnv: Registering MapOutputTracker 
14/07/09 17:26:10 INFO SparkEnv: Registering BlockManagerMaster 
14/07/09 17:26:10 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140709172610-7f5e 
14/07/09 17:26:10 INFO MemoryStore: MemoryStore started with capacity 294.4 MB. 
14/07/09 17:26:10 INFO ConnectionManager: Bound socket to port 45700 with id = ConnectionManagerId(node0472.local,45700) 
14/07/09 17:26:10 INFO BlockManagerMaster: Trying to register BlockManager 
14/07/09 17:26:10 INFO BlockManagerInfo: Registering block manager node0472.local:45700 with 294.4 MB RAM 
14/07/09 17:26:10 INFO BlockManagerMaster: Registered BlockManager 
14/07/09 17:26:10 INFO HttpServer: Starting HTTP Server 
14/07/09 17:26:10 INFO HttpBroadcast: Broadcast server started at http://172.16.104.72:48116 
14/07/09 17:26:10 INFO HttpFileServer: HTTP File server directory is /tmp/spark-7b4a7c3c-9fc9-4a64-b2ac-5f328abe9265 
14/07/09 17:26:10 INFO HttpServer: Starting HTTP Server 
14/07/09 17:26:11 INFO SparkUI: Started SparkUI at http://node0472.local:4040 
14/07/09 17:26:12 INFO AppClient$ClientActor: Connecting to master spark://node0472.local:7077... 
14/07/09 17:26:12 INFO SparkILoop: Created spark context.. 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140709172612-0007 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/0 on worker-20140709162149-node0476.local-53728 (node0476.local:53728) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/0 on hostPort node0476.local:53728 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/1 on worker-20140709162145-node0475.local-56009 (node0475.local:56009) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/1 on hostPort node0475.local:56009 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/2 on worker-20140709162141-node0474.local-58108 (node0474.local:58108) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/2 on hostPort node0474.local:58108 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/3 on worker-20140709170011-node0480.local-49021 (node0480.local:49021) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/3 on hostPort node0480.local:49021 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/4 on worker-20140709165929-node0479.local-53886 (node0479.local:53886) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/4 on hostPort node0479.local:53886 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/5 on worker-20140709170036-node0481.local-60958 (node0481.local:60958) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/5 on hostPort node0481.local:60958 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/6 on worker-20140709162151-node0477.local-44550 (node0477.local:44550) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/6 on hostPort node0477.local:44550 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/7 on worker-20140709162138-node0473.local-42025 (node0473.local:42025) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/7 on hostPort node0473.local:42025 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor added: app-20140709172612-0007/8 on worker-20140709162156-node0478.local-52943 (node0478.local:52943) with 32 cores 
14/07/09 17:26:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140709172612-0007/8 on hostPort node0478.local:52943 with 32 cores, 50.0 GB RAM 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/1 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/0 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/2 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/3 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/6 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/4 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/5 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/8 is now RUNNING 
14/07/09 17:26:12 INFO AppClient$ClientActor: Executor updated: app-20140709172612-0007/7 is now RUNNING 
Spark context available as sc. 

scala> 14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:47343/user/Executor#1253632521] with ID 4 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:39431/user/Executor#1607018658] with ID 2 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:53722/user/Executor#-1846270627] with ID 5 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:40185/user/Executor#-111495591] with ID 6 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:36426/user/Executor#652192289] with ID 7 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:37230/user/Executor#-1581927012] with ID 3 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:46363/user/Executor#-182973444] with ID 1 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:58053/user/Executor#609775393] with ID 0 
14/07/09 17:26:18 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:55152/user/Executor#-2126598605] with ID 8 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0474.local:60025 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0473.local:33992 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0481.local:46513 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0477.local:37455 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0475.local:33829 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0479.local:56433 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0480.local:38134 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0476.local:46284 with 28.8 GB RAM 
14/07/09 17:26:19 INFO BlockManagerInfo: Registering block manager node0478.local:43187 with 28.8 GB RAM 

Based on this log, I believe my application was registered with the workers and each executor got 50 GB of RAM. I then run the following Scala code in the shell to load the data and compute PageRank, and try to watch the memory usage on each node:

import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Load the edge list and cache the graph
val startgraphloading = System.currentTimeMillis
val graph = GraphLoader.edgeListFile(sc, "filepath").cache()
val endgraphloading = System.currentTimeMillis

// One PageRank iteration (also materializes the cached graph)
val startpr1 = System.currentTimeMillis
val prGraph1 = graph.staticPageRank(1)
val endpr1 = System.currentTimeMillis

// Five PageRank iterations
val startpr2 = System.currentTimeMillis
val prGraph2 = graph.staticPageRank(5)
val endpr2 = System.currentTimeMillis

// Elapsed times in milliseconds
val loadingt = endgraphloading - startgraphloading
val firstt = endpr1 - startpr1
val secondt = endpr2 - startpr2

println(loadingt)
println(firstt)
println(secondt)

While this runs, only 2-3 of the compute nodes' RAM is actually used. Is that right? And it runs faster with only 1 worker than with 9 workers.
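For reference, the per-node memory usage can also be checked from the shell via `SparkContext.getExecutorMemoryStatus`; the formatting below is just illustrative:

```scala
// Print used vs. total block-manager memory for each registered executor
sc.getExecutorMemoryStatus.foreach { case (executor, (total, free)) =>
  val usedMb = (total - free) / (1024 * 1024)
  val totalMb = total / (1024 * 1024)
  println(s"$executor: $usedMb MB used of $totalMb MB")
}
```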

I am using Spark standalone cluster mode. Is there a problem with my configuration?
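(For context: in standalone mode each worker advertises its resources through `conf/spark-env.sh`. A setup like this would typically carry entries along these lines; the values below are illustrative, not my actual files:)

```sh
# conf/spark-env.sh on each worker node (illustrative values)
SPARK_WORKER_CORES=32    # cores this worker offers to applications
SPARK_WORKER_MEMORY=60g  # memory this worker offers, leaving headroom for the OS
```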

Thanks in advance :)

Answer


After looking into the Spark source code I found the problem. It was an issue in my script that uses GraphX.

val graph = GraphLoader.edgeListFile(sc, "filepath").cache(); 

When I looked at the signature of edgeListFile, it has a parameter minEdgePartitions = 1. I assumed this was just a lower bound on the partition count, but in practice it determines how many partitions you get. I set it to the number of partitions I actually wanted (the number of worker nodes) and that solved it. One more thing to watch for, as mentioned in the GraphX programming guide: if you have not built Spark 1.0 from the master branch, you should call partitionBy yourself. If the graph is not partitioned properly, it causes problems.
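A sketch of the corrected load: the partition count of 9 matches my number of workers, and RandomVertexCut is one of the built-in PartitionStrategy options (any of the shipped strategies could be substituted):

```scala
// Ask for one edge partition per worker node, repartition explicitly, then cache
val graph = GraphLoader.edgeListFile(sc, "filepath", minEdgePartitions = 9)
  .partitionBy(PartitionStrategy.RandomVertexCut)
  .cache()
```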

It took me a while to figure this out; I hope this information saves someone some time :)
