2015-07-04 43 views
2

Why do tasks in the same task set have very different execution times?

I am doing some string processing on Spark. My code snippet:

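// Load the (String, String) pairs; the second argument is the minimum number of partitions.
val rdd = sc.objectFile[(String, String)]("some hdfs url", 1)
rdd.cache().count() // force the RDD to be materialized in the cache

// `finder` is defined elsewhere in the asker's code (not shown).
// Merge two partial results by feeding every entry back through finder.
val combOp = (f: List[String], g: List[String]) => {
  for (x <- f) {
    finder.processEntry(x)
  }
  for (x <- g) {
    finder.processEntry(x)
  }
  finder.result
}


// Process each partition on its own, then combine the per-partition results.
val res = rdd.mapPartitions(x => {
  for (e <- x) {
    finder.processEntry(e)
  }
  Iterator(finder.result)
}, true).reduce(combOp)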

My data set is about 10 GB. I am running Spark on a 24-core machine with 48 GB of memory. Configuration file:

spark.driver.memory 1g 
spark.executor.memory 30g 
spark.executor.extraJavaOptions -Xloggc:/var/log/gcmemory.log -XX:+PrintGCDetails 
spark.executor.cores 4 

Excerpt from the execution log:

INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, 10.60.1.143, ANY, 1642 bytes) 
INFO BlockManagerMasterEndpoint: Registering block manager 10.60.1.143:42850 with 15.5 GB RAM, BlockManagerId(0, 10.60.1.143, 42850) 
INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.60.1.143:42850 (size: 1766.0 B, free: 15.5 GB) 
INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.60.1.143:42850 (size: 16.8 KB, free: 15.5 GB) 
INFO BlockManagerInfo: Added rdd_1_3 in memory on 10.60.1.143:42850 (size: 219.7 MB, free: 15.3 GB) 
INFO BlockManagerInfo: Added rdd_1_1 in memory on 10.60.1.143:42850 (size: 229.7 MB, free: 15.1 GB) 
INFO BlockManagerInfo: Added rdd_1_2 in memory on 10.60.1.143:42850 (size: 221.5 MB, free: 14.9 GB) 
INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 6345 ms on 10.60.1.143 (1/34) 
INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 6351 ms on 10.60.1.143 (2/34) 
INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 6354 ms on 10.60.1.143 (3/34) 
INFO BlockManagerInfo: Added rdd_1_0 in memory on 10.60.1.143:42850 (size: 220.6 MB, free: 14.7 GB) 
INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 6454 ms on 10.60.1.143 (4/34) 
INFO BlockManagerInfo: Added rdd_1_5 in memory on 10.60.1.143:42850 (size: 219.9 MB, free: 14.4 GB) 
INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 2287 ms on 10.60.1.143 (5/34) 
INFO BlockManagerInfo: Added rdd_1_4 in memory on 10.60.1.143:42850 (size: 222.7 MB, free: 14.2 GB) 
INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, 10.60.1.143, ANY, 1642 bytes) 
INFO BlockManagerInfo: Added rdd_1_6 in memory on 10.60.1.143:42850 (size: 210.7 MB, free: 14.0 GB) 
INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 2350 ms on 10.60.1.143 (6/34) 
INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 2356 ms on 10.60.1.143 (7/34) 
INFO BlockManagerInfo: Added rdd_1_7 in memory on 10.60.1.143:42850 (size: 214.6 MB, free: 13.8 GB) 
INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 2289 ms on 10.60.1.143 (8/34) 
INFO BlockManagerInfo: Added rdd_1_8 in memory on 10.60.1.143:42850 (size: 216.3 MB, free: 13.6 GB) 
INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 2430 ms on 10.60.1.143 (9/34) 
INFO BlockManagerInfo: Added rdd_1_11 in memory on 10.60.1.143:42850 (size: 216.5 MB, free: 13.4 GB) 
INFO BlockManagerInfo: Added rdd_1_10 in memory on 10.60.1.143:42850 (size: 216.5 MB, free: 13.2 GB) 
INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 11) in 2416 ms on 10.60.1.143 (10/34) 
INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 10) in 2445 ms on 10.60.1.143 (11/34) 
INFO BlockManagerInfo: Added rdd_1_9 in memory on 10.60.1.143:42850 (size: 231.4 MB, free: 12.9 GB) 
INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 2528 ms on 10.60.1.143 (12/34) 
INFO BlockManagerInfo: Added rdd_1_12 in memory on 10.60.1.143:42850 (size: 217.3 MB, free: 12.7 GB) 
INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 12) in 1797 ms on 10.60.1.143 (13/34) 
INFO BlockManagerInfo: Added rdd_1_14 in memory on 10.60.1.143:42850 (size: 215.8 MB, free: 12.5 GB) 
INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 14) in 1748 ms on 10.60.1.143 (14/34) 
INFO BlockManagerInfo: Added rdd_1_13 in memory on 10.60.1.143:42850 (size: 220.9 MB, free: 12.3 GB) 
INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 13) in 1812 ms on 10.60.1.143 (15/34) 
INFO BlockManagerInfo: Added rdd_1_15 in memory on 10.60.1.143:42850 (size: 233.8 MB, free: 12.1 GB) 
INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 15) in 1756 ms on 10.60.1.143 (16/34) 
INFO BlockManagerInfo: Added rdd_1_16 in memory on 10.60.1.143:42850 (size: 221.6 MB, free: 11.9 GB) 
INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, 10.60.1.143, ANY, 1642 bytes) 
INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 16) in 2600 ms on 10.60.1.143 (17/34) 

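Why do the first tasks in the task set (tasks 0 to 3, about 6.3 s each) take so much longer than the later tasks in the same task set (about 1.7 to 2.6 s each)? Any help is much appreciated.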
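To put numbers on the gap described in the question, the "Finished task" lines of the log can be summarized with a few lines of plain Scala. This is only a sketch: the `TaskDurations` object and its `parse` and `meanMs` helpers are hypothetical names introduced here, not part of the original post or of Spark, and the regular expression assumes the log format shown in the excerpt.

```scala
// Hypothetical helper (not from the original post): summarize task durations
// from TaskSetManager "Finished task" log lines.
object TaskDurations {
  // Matches e.g. "Finished task 3.0 in stage 0.0 (TID 3) in 6345 ms".
  private val Finished =
    """Finished task (\d+)\.\d+ in stage \S+ \(TID \d+\) in (\d+) ms""".r.unanchored

  /** Returns (taskIndex, durationMs) for every "Finished task" line; other lines are skipped. */
  def parse(lines: Seq[String]): Seq[(Int, Long)] =
    lines.collect { case Finished(idx, ms) => (idx.toInt, ms.toLong) }

  /** Mean duration in ms of the tasks whose index satisfies `p`, or NaN if there are none. */
  def meanMs(durations: Seq[(Int, Long)])(p: Int => Boolean): Double = {
    val selected = durations.collect { case (i, ms) if p(i) => ms.toDouble }
    if (selected.isEmpty) Double.NaN else selected.sum / selected.size
  }

  def main(args: Array[String]): Unit = {
    // A few lines copied from the log excerpt in the question.
    val sample = Seq(
      "INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 6345 ms on 10.60.1.143 (1/34)",
      "INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 6351 ms on 10.60.1.143 (2/34)",
      "INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 2287 ms on 10.60.1.143 (5/34)",
      "INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 2350 ms on 10.60.1.143 (6/34)"
    )
    val durations = parse(sample)
    // With spark.executor.cores set to 4, tasks 0 to 3 form the first wave.
    val firstWave = meanMs(durations)(_ < 4)
    val laterTasks = meanMs(durations)(_ >= 4)
    println(s"first wave mean: $firstWave ms, later tasks mean: $laterTasks ms")
  }
}
```

Feeding the full log through `parse` instead of the four sample lines gives the same comparison across all finished tasks.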

Answer

0

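A common reason for some partitions (or executors) taking longer than others is that the data is unevenly partitioned. I suggest trying to repartition your data. The Spark UI may also have useful information (you can look at the input size per task, for example). Some machines are also slower for essentially random reasons; this is especially common in virtualized environments, where a machine may have noisy neighbors. In that case you can try enabling speculative execution by setting the spark.speculation flag (see https://spark.apache.org/docs/latest/configuration.html), so that Spark retries a task on another executor if it happens to be running slowly on one machine.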
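As a sketch of the speculation suggestion above, the lines below could be added to the same configuration file as in the question. `spark.speculation`, `spark.speculation.multiplier` and `spark.speculation.quantile` are properties listed on the Spark configuration page; the values shown are illustrative assumptions, not tuned recommendations. Repartitioning, the other suggestion, is done in code with `rdd.repartition(n)`, where `n` is a partition count chosen for the workload.

```text
spark.speculation            true
spark.speculation.multiplier 1.5
spark.speculation.quantile   0.75
```

With these settings, a task becomes a candidate for a speculative copy once 75% of the tasks in its stage have finished and it has been running more than 1.5 times longer than the median successful task.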

+0

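I am sure the data is evenly partitioned, and it runs on a single SMP machine, so there are no noisy neighbors. I wonder whether it could be JVM class-loader overhead. – Amos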
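The JVM warm-up hypothesis in the comment above can be probed outside Spark with a small timing loop. This is a minimal sketch, not a diagnosis of the original job: `WarmUpDemo`, `timeBatches` and the stand-in `processEntry` are hypothetical names, since the real `finder.processEntry` is not shown in the post. On a typical JVM the first batch often takes longer than the later ones because classes are loaded and hot code is JIT-compiled during it, but the exact timings vary from run to run.

```scala
// Hypothetical micro-benchmark (not from the original post): time identical
// batches of string work inside one JVM to see whether the first one is slower.
object WarmUpDemo {
  // Sink that keeps the JIT from discarding the computed results.
  @volatile private var sink: Int = 0

  // Stand-in for the per-record work; the real finder.processEntry is not shown in the post.
  def processEntry(s: String): Int =
    s.split(' ').map(_.toLowerCase.hashCode).sum

  /** Runs `rounds` identical batches and returns each batch's wall time in milliseconds. */
  def timeBatches(rounds: Int, batchSize: Int): Seq[Double] = {
    val data = Vector.tabulate(batchSize)(i => s"record $i with some text to process")
    (1 to rounds).map { _ =>
      val start = System.nanoTime()
      sink = data.iterator.map(processEntry).sum
      (System.nanoTime() - start) / 1e6
    }
  }

  def main(args: Array[String]): Unit = {
    val timings = timeBatches(rounds = 8, batchSize = 200000)
    timings.zipWithIndex.foreach { case (ms, i) =>
      println(f"batch $i%d: $ms%.1f ms")
    }
  }
}
```

If the first batch is consistently slower here, the same effect may explain part of the gap between the first wave of tasks and the later ones, since all tasks in the log ran in the same executor JVM.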
