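Apache Spark: hanging on broadcast

I am having a hard time debugging my Spark 1.6.2 application. It runs in client mode. Essentially, it locks up without crashing, and when it locks up the console log looks like this: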

    17/03/31 20:12:02 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh007.prod.phx3.gdg:47579 (size: 26.7 KB, free: 511.1 MB)
    17/03/31 20:12:03 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on p3plcdsh011.prod.phx3.gdg:63228 (size: 5.4 KB, free: 511.1 MB) 
    17/03/31 20:12:03 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on p3plcdsh015.prod.phx3.gdg:9377 (size: 5.4 KB, free: 511.1 MB) 
    17/03/31 20:12:03 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on p3plcdsh015.prod.phx3.gdg:61897 (size: 5.4 KB, free: 511.1 MB) 
    17/03/31 20:12:03 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh002.prod.phx3.gdg:23170 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:03 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on p3plcdsh016.prod.phx3.gdg:16649 (size: 5.4 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh003.prod.phx3.gdg:55147 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on p3plcdsh008.prod.phx3.gdg:7619 (size: 5.4 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh003.prod.phx3.gdg:40830 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh011.prod.phx3.gdg:20056 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh008.prod.phx3.gdg:47385 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh003.prod.phx3.gdg:2063 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh011.prod.phx3.gdg:63228 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:04 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh008.prod.phx3.gdg:64036 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:05 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh016.prod.phx3.gdg:16649 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:05 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh013.prod.phx3.gdg:31979 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:05 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh013.prod.phx3.gdg:18407 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:05 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh004.prod.phx3.gdg:45536 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:05 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh008.prod.phx3.gdg:50826 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh015.prod.phx3.gdg:36247 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh015.prod.phx3.gdg:22848 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh015.prod.phx3.gdg:9377 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh015.prod.phx3.gdg:61897 (size: 26.7 KB, free: 511.1 MB) 
    17/03/31 20:12:07 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on p3plcdsh008.prod.phx3.gdg:7619 (size: 26.7 KB, free: 511.1 MB) 

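In the Spark UI, the hang occurs in a map or filter function.

Has anyone seen this happen before, or does anyone know how to debug it?

It looks like this could be a memory or disk-space problem, but there is no clear indication that it is. I can try bumping up the memory to see whether that helps, but does anyone have debugging tips?

Thanks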

What are you broadcasting? – Vidya

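@Vidya After some debugging, it looks like a fairly large Java object (something backed by an uncompressed 300 MB file)... But it does serialize, otherwise I would see a crash about serialization. Is there a size limit on objects that can be serialized, or can the maximum object size be raised? – adrian

Answer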

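Merely being serializable is not enough. The problem could be any of several things: your serialization mechanism (Java serialization is poor; Kryo is much better), the memory on your machines, whether your code actually uses the broadcast value rather than the original wrapped object, and so on.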
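To illustrate two of those points together, here is a minimal sketch, assuming a local run and a small `Set[String]` as a hypothetical stand-in for the large lookup object. The object name `BroadcastSketch` and the helper `keepKnown` are made up for illustration; the Spark calls (`SparkConf.set`, `sc.broadcast`, `Broadcast.value`) are the standard Spark 1.6 API.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  // Builds a context that uses Kryo for data serialization.
  // Assumption: local[2] master, for illustration only.
  def newContext(): SparkContext = {
    val conf = new SparkConf()
      .setAppName("broadcast-sketch")
      .setMaster("local[2]")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    new SparkContext(conf)
  }

  // Keeps only the words present in the lookup set.
  def keepKnown(sc: SparkContext, words: Seq[String], lookup: Set[String]): Array[String] = {
    val lookupBc = sc.broadcast(lookup)
    sc.parallelize(words)
      // Reference lookupBc.value here, not `lookup`: capturing `lookup`
      // directly would ship it inside every task closure instead of
      // going through the broadcast mechanism.
      .filter(word => lookupBc.value.contains(word))
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val sc = newContext()
    try println(keepKnown(sc, Seq("a", "x", "c"), Set("a", "b", "c")).mkString(","))
    finally sc.stop()
  }
}
```

If the closure referenced `lookup` instead of `lookupBc.value`, the job would still run, but the object would be serialized with each task, which is exactly what broadcasting is meant to avoid.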

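There is also the Spark configuration spark.sql.autoBroadcastJoinThreshold:

Configures the maximum size in bytes for a table that will be broadcast to all worker nodes when performing a join. By setting this value to -1 broadcasting can be disabled. Note that currently statistics are only supported for Hive Metastore tables where the command ANALYZE TABLE <tableName> COMPUTE STATISTICS noscan has been run.

The default is 10 MB (10485760 bytes).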
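For reference, a rough sketch of changing that setting at runtime in Spark 1.6, using `SQLContext.setConf` and `SQLContext.getConf`. The object and method names (`JoinThresholdSketch`, `setThreshold`) are hypothetical. Note that this threshold only governs automatic broadcast joins in Spark SQL and DataFrames; it does not limit what you pass to `sc.broadcast` yourself.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JoinThresholdSketch {
  private val Key = "spark.sql.autoBroadcastJoinThreshold"

  // Sets the threshold in bytes (-1 disables automatic broadcast joins)
  // and returns the value Spark SQL now reports for the key.
  def setThreshold(sqlContext: SQLContext, bytes: Long): String = {
    sqlContext.setConf(Key, bytes.toString)
    sqlContext.getConf(Key)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local[2] master, for illustration only.
    val sc = new SparkContext(
      new SparkConf().setAppName("threshold-sketch").setMaster("local[2]"))
    try {
      val sqlContext = new SQLContext(sc)
      println(setThreshold(sqlContext, -1L))
    } finally {
      sc.stop()
    }
  }
}
```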

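Finally, even if you remove the default limit and have plenty of memory, you still want the broadcast object to be much smaller than your large RDD/DataFrame. You can check sizes with SizeEstimator: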

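    import org.apache.spark.util.SizeEstimator

    // estimate returns the size in bytes as a Long; logInfo expects a String
    logInfo(s"Estimated size: ${SizeEstimator.estimate(rdd)} bytes")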

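If worse comes to worst, I would consider doing lookups against a lightning-fast caching data store during your transformations instead of broadcasting the file.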
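A minimal sketch of that last idea, under stated assumptions: `KeyValueStore` is a hypothetical client interface (a real store would ship its own client library), and `InMemoryStore` is only a stand-in so the example runs without an external service. The function is shaped so it could be passed to `rdd.mapPartitions`, opening one connection per partition rather than one per record.

```scala
// Hypothetical client interface, not part of Spark or any real store.
trait KeyValueStore {
  def get(key: String): Option[String]
  def close(): Unit
}

// In-memory stand-in so the sketch runs without any external service.
final class InMemoryStore(data: Map[String, String]) extends KeyValueStore {
  def get(key: String): Option[String] = data.get(key)
  def close(): Unit = ()
}

object LookupSketch {
  // Looks up every key of one partition using a single store connection.
  def enrichPartition(
      keys: Iterator[String],
      openStore: () => KeyValueStore): Iterator[(String, Option[String])] = {
    val store = openStore()
    // Materialize the results before closing the store, because
    // Iterator.map is lazy and would otherwise run after close().
    val results =
      try keys.map(key => (key, store.get(key))).toList
      finally store.close()
    results.iterator
  }
}
```

With a real client, usage would look roughly like `rdd.mapPartitions(keys => LookupSketch.enrichPartition(keys, () => connect()))`, where `connect` is whatever factory your store's client provides. Materializing each partition's results with `toList` keeps the sketch simple but holds them in memory; a production version might batch lookups instead.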