Spark的當前驅逐策略是什麼？ FIFO還是LRU？

我是Spark新手。我試圖找出Spark的驅逐策略，有些人說它是LRU，例如，this article和this one。Spark的當前驅逐策略是什麼？ FIFO還是LRU？

然而，當我看着的MemoryStore和BlockManager的源代碼，我找不到LRU的邏輯：

有LinkedHashMap中記錄的所有塊在MemoryStore的

// Note: all changes to memory allocations, notably putting blocks, evicting blocks, and 
// acquiring or releasing unroll memory, must be synchronized on `memoryManager`! 
private val entries = new LinkedHashMap[BlockId, MemoryEntry[_]](32, 0.75f, true)

被訪問塊時，它不會被移動到的LinkedHashMap的頭

def getValues(blockId: BlockId): Option[Iterator[_]] = { 
    val entry = entries.synchronized { entries.get(blockId) } 
    entry match { 
     case null => None 
     case e: SerializedMemoryEntry[_] => 
     throw new IllegalArgumentException("should only call getValues on deserialized blocks") 
     case DeserializedMemoryEntry(values, _, _) => 
     val x = Some(values) 
     x.map(_.iterator) 
    } 
}

，所選塊在LinkedHashMap中的的entrySet，我認爲的順序是先入和後入先出

private[spark] def evictBlocksToFreeSpace(
    blockId: Option[BlockId], 
    space: Long, 
    memoryMode: MemoryMode): Long = { 
    assert(space > 0) 
    memoryManager.synchronized { 
    var freedMemory = 0L 
    val rddToAdd = blockId.flatMap(getRddId) 
    val selectedBlocks = new ArrayBuffer[BlockId] 
    def blockIsEvictable(blockId: BlockId, entry: MemoryEntry[_]): Boolean = { 
     entry.memoryMode == memoryMode && (rddToAdd.isEmpty || rddToAdd != getRddId(blockId)) 
    } 
    // This is synchronized to ensure that the set of entries is not changed 
    // (because of getValue or getBytes) while traversing the iterator, as that 
    // can lead to exceptions. 
    entries.synchronized { 
     val iterator = entries.entrySet().iterator() 
     while (freedMemory < space && iterator.hasNext) { 
     val pair = iterator.next() 
     val blockId = pair.getKey 
     val entry = pair.getValue 
     if (blockIsEvictable(blockId, entry)) { 
      // We don't want to evict blocks which are currently being read, so we need to obtain 
      // an exclusive write lock on blocks which are candidates for eviction. We perform a 
      // non-blocking "tryLock" here in order to ignore blocks which are locked for reading: 
      if (blockInfoManager.lockForWriting(blockId, blocking = false).isDefined) { 
      selectedBlocks += blockId 
      freedMemory += pair.getValue.size 
      } 
     } 
     } 
    } 
    ... 
    if (freedMemory >= space) { 
     logInfo(s"${selectedBlocks.size} blocks selected for dropping " + 
     s"(${Utils.bytesToString(freedMemory)} bytes)") 
     for (blockId <- selectedBlocks) { 
     val entry = entries.synchronized { entries.get(blockId) } 
     // This should never be null as only one task should be dropping 
     // blocks and removing entries. However the check is still here for 
     // future safety. 
     if (entry != null) { 
      dropBlock(blockId, entry) 
     } 
     } 
     ... 
    } 
    } 
}

因此，驅逐Spark的策略是FIFO還是LRU？

來源

2016-11-16 leon

-1

我以前有同樣的問題，但答案相當棘手：從您粘貼在這裏的代碼，沒有明確的「促進」操作。但實際上，「LinkedHashMap」是一種確保LRU順序的特殊數據結構。

來源

2017-02-13 19:49:47 fewfsdiv823rfwef1

你在這行有LinkedHashMap中的構造器： 私人VAL項=新的LinkedHashMap [塊標識，MemoryEntry [_]（32，0.75f，真） 是一個構造在LinkedHashMap的創建訪問順序： LinkedHashMap（int initialCapacity，float loadFactor，boolean accessOrder） 布爾值設置爲true，這表示按照最近訪問次數最少的訪問順序對鍵進行排序。

來源

2017-02-24 10:18:13

這樣回答問題的方式是什麼？ – luk2302

再次閱讀我的答案！我說：「布爾值被設置爲true，這意味着鍵是根據最近訪問過的最近訪問的訪問順序排序的。」所以驅逐策略是LRU。這些塊根據其在條目linkedHashMap中的訪問順序進行排序。所選擇的驅逐塊按照LinkedHashMap的entrySet的順序，這意味着要被驅逐的第一個塊是最近最少使用的塊 –

這個答案是有幫助和正確的，謝謝niko – leon

Spark的當前驅逐策略是什麼？ FIFO還是LRU？

回答

相關問題