2016-01-20 92 views
14

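An RDD has 512 equally sized partitions and is 100% cached in memory across 512 executors. Why do Spark tasks take so long to find a block locally?

I have a filter-map-collect job with 512 tasks. Sometimes this job completes in under a second. At other times, 50% of the tasks complete in under a second, 45% take 10 seconds, and 5% take 20 seconds.

Here is the log from an executor where a task took 20 seconds: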

15/12/16 09:44:37 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 5312 
15/12/16 09:44:37 INFO executor.Executor: Running task 215.0 in stage 17.0 (TID 5312) 
15/12/16 09:44:37 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 10 
15/12/16 09:44:37 INFO storage.MemoryStore: ensureFreeSpace(1777) called with curMem=908793307, maxMem=5927684014 
15/12/16 09:44:37 INFO storage.MemoryStore: Block broadcast_10_piece0 stored as bytes in memory (estimated size 1777.0 B, free 4.7 GB) 
15/12/16 09:44:37 INFO broadcast.TorrentBroadcast: Reading broadcast variable 10 took 186 ms 
15/12/16 09:44:37 INFO storage.MemoryStore: ensureFreeSpace(3272) called with curMem=908795084, maxMem=5927684014 
15/12/16 09:44:37 INFO storage.MemoryStore: Block broadcast_10 stored as values in memory (estimated size 3.2 KB, free 4.7 GB) 
15/12/16 09:44:57 INFO storage.BlockManager: Found block rdd_5_215 locally 
15/12/16 09:44:57 INFO executor.Executor: Finished task 215.0 in stage 17.0 (TID 5312). 2074 bytes result sent to driver 

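So it appears that the 20 seconds are spent finding the local block. The logs of the other slow tasks show that they are all delayed for the same reason. My understanding is that a local block lives within the same JVM instance, so I don't understand why it takes so long to locate it.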

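Since the lag is always either exactly 10 seconds or exactly 20 seconds, I suspect it is caused by a 10-second timeout on some listener, or something similar. If that is true, then I think my options are either to find out why it is timing out and fix it, or to shorten the timeout so that it retries more frequently.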
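One way to confirm where the time goes is to measure the pause between consecutive log entries. The following is a minimal sketch, not part of the original post: `find_gaps` is a hypothetical helper, and it assumes every entry starts with a `yy/MM/dd HH:mm:ss` timestamp, as in the excerpts shown here.

```python
from datetime import datetime


def find_gaps(log_lines, min_gap_seconds=1.0):
    """Return (gap_seconds, previous_line, current_line) for each long pause."""
    gaps = []
    prev_time, prev_line = None, None
    for line in log_lines:
        line = line.strip()
        try:
            # Assumes the line starts with a "yy/MM/dd HH:mm:ss" timestamp (17 characters).
            stamp = datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # skip blank lines and lines without a leading timestamp
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap >= min_gap_seconds:
                gaps.append((gap, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


# Three entries copied from the executor log in the question.
sample = [
    "15/12/16 09:44:37 INFO storage.MemoryStore: Block broadcast_10 stored as values in memory (estimated size 3.2 KB, free 4.7 GB)",
    "15/12/16 09:44:57 INFO storage.BlockManager: Found block rdd_5_215 locally",
    "15/12/16 09:44:57 INFO executor.Executor: Finished task 215.0 in stage 17.0 (TID 5312). 2074 bytes result sent to driver",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap:.0f}s pause before: {after}")
```

Run over a whole executor log, this would list every pause of a second or more together with the entry that follows it, which makes it easier to check whether the slow tasks always stall at the same step.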

Why does the task take so long to find a local block, and how can I fix this?

Update: added DEBUG logging for org.apache.spark.storage

16/02/01 12:14:07 INFO CoarseGrainedExecutorBackend: Got assigned task 3029 
16/02/01 12:14:07 INFO Executor: Running task 115.0 in stage 9.0 (TID 3029) 
16/02/01 12:14:07 DEBUG Executor: Task 3029's epoch is 1 
16/02/01 12:14:07 DEBUG BlockManager: Getting local block broadcast_6 
16/02/01 12:14:07 DEBUG BlockManager: Block broadcast_6 not registered locally 
16/02/01 12:14:07 INFO TorrentBroadcast: Started reading broadcast variable 6 
16/02/01 12:14:07 DEBUG TorrentBroadcast: Reading piece broadcast_6_piece0 of broadcast_6 
16/02/01 12:14:07 DEBUG BlockManager: Getting local block broadcast_6_piece0 as bytes 
16/02/01 12:14:07 DEBUG BlockManager: Block broadcast_6_piece0 not registered locally 
16/02/01 12:14:07 DEBUG BlockManager: Getting remote block broadcast_6_piece0 as bytes 
16/02/01 12:14:07 DEBUG BlockManager: Getting remote block broadcast_6_piece0 from BlockManagerId(385, node1._.com, 54162) 
16/02/01 12:14:07 DEBUG TransportClient: Sending fetch chunk request 0 to node1._.com:54162 
16/02/01 12:14:07 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 2017.0 B, free 807.3 MB) 
16/02/01 12:14:07 DEBUG BlockManagerMaster: Updated info of block broadcast_6_piece0 
16/02/01 12:14:07 DEBUG BlockManager: Told master about block broadcast_6_piece0 
16/02/01 12:14:07 DEBUG BlockManager: Put block broadcast_6_piece0 locally took 2 ms 
16/02/01 12:14:07 DEBUG BlockManager: Putting block broadcast_6_piece0 without replication took 2 ms 
16/02/01 12:14:07 INFO TorrentBroadcast: Reading broadcast variable 6 took 87 ms 
16/02/01 12:14:07 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 3.6 KB, free 807.3 MB) 
16/02/01 12:14:07 DEBUG BlockManager: Put block broadcast_6 locally took 1 ms 
16/02/01 12:14:07 DEBUG BlockManager: Putting block broadcast_6 without replication took 1 ms 
16/02/01 12:14:17 DEBUG CacheManager: Looking for partition rdd_5_115 
16/02/01 12:14:17 DEBUG BlockManager: Getting local block rdd_5_115 
16/02/01 12:14:17 DEBUG BlockManager: Level for block rdd_5_115 is StorageLevel(false, true, false, true, 1) 
16/02/01 12:14:17 DEBUG BlockManagerSlaveEndpoint: removing broadcast 4 
16/02/01 12:14:17 DEBUG BlockManager: Getting block rdd_5_115 from memory 
16/02/01 12:14:17 DEBUG BlockManager: Removing broadcast 4 
16/02/01 12:14:17 INFO BlockManager: Found block rdd_5_115 locally 
16/02/01 12:14:17 DEBUG BlockManager: Removing block broadcast_4 
16/02/01 12:14:17 DEBUG MemoryStore: Block broadcast_4 of size 3680 dropped from memory (free 5092230668) 
16/02/01 12:14:17 DEBUG BlockManager: Removing block broadcast_4_piece0 
16/02/01 12:14:17 DEBUG MemoryStore: Block broadcast_4_piece0 of size 2017 dropped from memory (free 5092232685) 
16/02/01 12:14:17 DEBUG BlockManagerMaster: Updated info of block broadcast_4_piece0 
16/02/01 12:14:17 DEBUG BlockManager: Told master about block broadcast_4_piece0 
16/02/01 12:14:17 DEBUG BlockManagerSlaveEndpoint: Done removing broadcast 4, response is 2 
16/02/01 12:14:17 DEBUG BlockManagerSlaveEndpoint: Sent response: 2 to node2._.com:45115 
16/02/01 12:14:17 INFO Executor: Finished task 115.0 in stage 9.0 (TID 3029). 2164 bytes result sent to driver 
16/02/01 12:14:17 DEBUG BlockManagerSlaveEndpoint: removing broadcast 5 
16/02/01 12:14:17 DEBUG BlockManager: Removing broadcast 5 
16/02/01 12:14:17 DEBUG BlockManager: Removing block broadcast_5_piece0 
16/02/01 12:14:17 DEBUG MemoryStore: Block broadcast_5_piece0 of size 2017 dropped from memory (free 5092234702) 
16/02/01 12:14:17 DEBUG BlockManagerMaster: Updated info of block broadcast_5_piece0 
16/02/01 12:14:17 DEBUG BlockManager: Told master about block broadcast_5_piece0 
16/02/01 12:14:17 DEBUG BlockManager: Removing block broadcast_5 
16/02/01 12:14:17 DEBUG MemoryStore: Block broadcast_5 of size 3680 dropped from memory (free 5092238382) 
16/02/01 12:14:17 DEBUG BlockManagerSlaveEndpoint: Done removing broadcast 5, response is 2 
16/02/01 12:14:17 DEBUG BlockManagerSlaveEndpoint: Sent response: 2 to node2._.com:45115 
+1

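Can you turn up the logging level on the 'org.apache.spark.storage' package and share the results? I checked the BlockManager code: a lot can happen in the 'doGetLocal' method, and it has debug-level log entries that help to understand exactly what it is doing. By the way, 'Found block rdd_5_215 locally' means the block was found in the local BlockManager (not in a remote one), but it may still be fetched from memory, from disk, or from external storage. –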

+0

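Thanks @AlexLarikov, I added the DEBUG logs. When I look at the Spark Web UI, it tells me the RDD is 100% cached in memory. Given that, is it still plausible that Spark would retrieve a block from disk? – user2179977

Answers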

0

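The only thing that stands out to me is that you seem to have replication turned on through your storage level, StorageLevel(false, true, false, true, 1).

Since you have 512 partitions across 512 executors, blocks may be getting replicated across executors, which could cause the slowdown at the end. I would try turning replication off and see how that affects performance.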
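Before changing the storage level, it may help to decode what the logged value actually says. The sketch below is plain Python, not Spark code. It assumes the field order that Spark 1.x prints for a StorageLevel, namely useDisk, useMemory, useOffHeap, deserialized, replication; `parse_storage_level` is a hypothetical helper written for this illustration.

```python
from collections import namedtuple

# Assumed field order, matching how Spark 1.x prints a StorageLevel.
StorageLevel = namedtuple(
    "StorageLevel",
    "use_disk use_memory use_off_heap deserialized replication",
)


def parse_storage_level(text):
    """Parse a string such as 'StorageLevel(false, true, false, true, 1)'."""
    inner = text[text.index("(") + 1 : text.rindex(")")]
    parts = [p.strip() for p in inner.split(",")]
    flags = [p == "true" for p in parts[:4]]  # the four boolean fields
    return StorageLevel(*flags, int(parts[4]))  # the last field is an integer


level = parse_storage_level("StorageLevel(false, true, false, true, 1)")
print(level)
print("copies kept of each block:", level.replication)
```

Under that assumption the level reads as memory only, deserialized, with a replication factor of 1. A factor of 1 normally means a single copy of each block, so it is worth confirming that replication is really enabled before attributing the delay to it.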

0

What is the total number of cores you allocated to your Spark application? If you allocated 256 cores and spark.locality.wait is set to 10, then this could happen.
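To test this theory, one option is to rerun the job with a different spark.locality.wait and see whether the 10-second steps move with it. The sketch below only builds the `--conf` arguments for spark-submit. The helper `conf_args`, the value `1s`, and the file name `your_app.jar` are illustrative assumptions, not settings taken from the question.

```python
def conf_args(settings):
    """Turn a dict of Spark properties into spark-submit --conf arguments."""
    args = []
    for key, value in settings.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# "1s" is an arbitrary trial value, chosen only so that a change would be visible.
settings = {"spark.locality.wait": "1s"}

# your_app.jar is a placeholder for the real application.
print(" ".join(["spark-submit"] + conf_args(settings) + ["your_app.jar"]))
```

If the slow tasks then take about 1 or 2 seconds instead of 10 or 20, the locality wait is a likely cause; if nothing changes, the delay probably comes from somewhere else.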

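I don't know your environment, but it looks like you have too many executors. Use only a few executors (how many depends on how powerful your compute nodes are) and give each executor multiple cores. In short, instead of many processes with one thread each, run a few processes with multiple threads each.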
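As a rough illustration of that advice, the sketch below lists ways to split a fixed number of cores into fewer, larger executors. The total of 512 cores is an assumption (the question states 512 executors but not the core count), and `executor_layouts` is a hypothetical helper. The property names spark.executor.instances and spark.executor.cores are standard Spark settings, though which of them apply depends on the cluster manager.

```python
def executor_layouts(total_cores, cores_per_executor_options=(2, 4, 8, 16)):
    """List executor counts and sizes that use exactly total_cores cores."""
    layouts = []
    for cores in cores_per_executor_options:
        if total_cores % cores == 0:  # keep only layouts that divide evenly
            layouts.append({
                "spark.executor.instances": total_cores // cores,
                "spark.executor.cores": cores,
            })
    return layouts


# 512 total cores is assumed here purely for illustration.
for layout in executor_layouts(512):
    print(layout)
```

Which layout works best depends on the memory and core count of each node, so these are candidates to try rather than recommendations.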
