2016-06-28 58 views

Answer


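I have been looking into this today, and it seems that the "RDD Blocks" figure is actually the sum of RDD blocks and non-RDD blocks. Check out the code at: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/exec/ExecutorsPage.scala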

val rddBlocks = status.numBlocks 

If you follow this link into the Apache Spark repo on GitHub: https://github.com/apache/spark/blob/d5b1d5fc80153571c308130833d0c0774de62c92/core/src/main/scala/org/apache/spark/storage/StorageUtils.scala

you will find the following lines of code:

/**
 * Return the number of blocks stored in this block manager in O(RDDs) time.
 *
 * @note This is much faster than `this.blocks.size`, which is O(blocks) time.
 */
def numBlocks: Int = _nonRddBlocks.size + numRddBlocks

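Non-RDD blocks are the blocks created by broadcast variables, since those are stored as cached blocks in memory. The driver sends tasks to the executors through broadcast variables. These system-created broadcast variables are deleted by the ContextCleaner service, and the corresponding non-RDD blocks are removed along with them. RDD blocks are removed by calling rdd.unpersist().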
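To make the counting and cleanup described above concrete, here is a minimal sketch in plain Scala with no Spark dependency. It is not Spark's actual implementation: the class StorageStatusSketch, its methods, and the use of strings as block IDs are all hypothetical names chosen for illustration. It only assumes, as in the snippet quoted above, that the total is the non-RDD block count plus the RDD block count.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for a per-executor storage status.
// Block IDs are plain strings here; this is an assumption for illustration.
final class StorageStatusSketch {
  // RDD blocks grouped by RDD id, so counting them is O(number of RDDs).
  private val rddBlocks = mutable.Map.empty[Int, mutable.Set[String]]
  // Everything that is not an RDD block, for example broadcast blocks.
  private val nonRddBlocks = mutable.Set.empty[String]

  def addRddBlock(rddId: Int, partition: Int): Unit = {
    rddBlocks.getOrElseUpdate(rddId, mutable.Set.empty[String]) += s"rdd_${rddId}_$partition"
  }

  def addBroadcastBlock(broadcastId: Long): Unit = {
    nonRddBlocks += s"broadcast_$broadcastId"
  }

  // Models the effect of rdd.unpersist(): every block of that RDD is dropped.
  def unpersistRdd(rddId: Int): Unit = {
    rddBlocks.remove(rddId)
  }

  // Models the cleaner removing a broadcast variable's block.
  def cleanBroadcast(broadcastId: Long): Unit = {
    nonRddBlocks.remove(s"broadcast_$broadcastId")
  }

  def numRddBlocks: Int = rddBlocks.valuesIterator.map(_.size).sum

  // Same shape as the quoted line: non-RDD blocks plus RDD blocks.
  def numBlocks: Int = nonRddBlocks.size + numRddBlocks
}
```

In this model, caching two partitions of one RDD and registering one broadcast block gives numRddBlocks of 2 but numBlocks of 3, which is why the displayed figure can stay above zero after every RDD has been unpersisted: it only drops to zero once the broadcast blocks are cleaned as well.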
