3

I want to create an efficient LRU cache implementation. I have found that the most convenient way is to use a LinkedHashMap, but unfortunately it is quite slow if many threads are using the cache. My implementation is here (cf. 'best way to implement an LRU cache'):

import java.util.Iterator; 
import java.util.LinkedHashMap; 
import java.util.Map; 

// Note: ICache is the asker's own cache interface (not shown in the post). 

/** 
* Provides the API for a FixedSizeCache. 
* Its inheritors represent classes 
* with concrete strategies 
* for choosing which elements to delete 
* in case of cache overflow. All inheritors 
* must implement {@link #getSize(K, V)}. 
*/ 
public abstract class FixedSizeCache <K, V> implements ICache <K, V> { 
    /** 
    * Current cache size. 
    */ 
    private int currentSize; 


    /** 
    * Maximum allowable cache size. 
    */ 
    private int maxSize; 


    /** 
    * Number of {@link #get(K)} queries for which appropriate {@code value} was found. 
    */ 
    private int keysFound; 


    /** 
    * Number of {@link #get(K)} queries for which appropriate {@code value} was not found. 
    */ 
    private int keysMissed; 


    /** 
    * Number of {@code key-value} associations that were deleted from the cache 
    * because of cache overflow. 
    */ 
    private int erasedCount; 


    /** 
    * The backing data structure. LinkedHashMap provides a 
    * convenient way to implement both types of cache, 
    * LRU and FIFO: depending on its constructor parameters 
    * it maintains either access order (LRU) or insertion order (FIFO). 
    */ 
    private LinkedHashMap <K, V> entries; 


    /** 
    * If the {@code type} parameter is {@code true} 
    * the LinkedHashMap is access-ordered and the cache is LRU; 
    * otherwise it is insertion-ordered and the cache is FIFO. 
    */ 
    public FixedSizeCache(int maxSize, boolean type) { 

     if (maxSize <= 0) { 
      throw new IllegalArgumentException("int maxSize parameter must be greater than 0"); 
     } 

     this.maxSize = maxSize; 
     this.entries = new LinkedHashMap<K, V> (0, 0.75f, type); 
    } 


    /** 
    * Deletes {@code key-value} associations 
    * until the current cache size {@link #currentSize} becomes 
    * less than or equal to the maximum allowable 
    * cache size {@link #maxSize}. 
    */ 
    private void relaxSize() { 

     while (currentSize > maxSize) { 

      // The strategy for choosing the entry with the lowest precedence 
      // depends on the {@code type} flag that was used to create {@link #entries}. 
      // If it was created with the constructor LinkedHashMap(int size, float loadFactor, boolean type) 
      // and {@code type} was {@code true}, then {@link #entries} is an access-ordered 
      // (LRU) LinkedHashMap and the iterator of its entrySet returns elements in order 
      // from least recently used to most recently used. 
      // Otherwise, if {@code type} was {@code false}, {@link #entries} is an 
      // insertion-ordered (FIFO) LinkedHashMap and the iterator returns its entrySet 
      // elements in FIFO order - from the oldest in the cache to the most recently added. 

      Iterator<Map.Entry<K, V>> iterator = entries.entrySet().iterator(); 

      // Iterator.next() never returns null, so checking the returned entry for null 
      // cannot work; instead, an empty map while currentSize is still positive 
      // means the size bookkeeping is inconsistent. 
      if (!iterator.hasNext()) { 
       throw new IllegalStateException("Implemented method int getSize(K key, V value)" + 
         " returns different results for the same arguments."); 
      } 

      Map.Entry<K, V> entryToDelete = iterator.next(); 

      entries.remove(entryToDelete.getKey()); 
      currentSize -= getAssociationSize(entryToDelete.getKey(), entryToDelete.getValue()); 
      erasedCount++; 
     } 

     if (currentSize < 0) { 
      throw new IllegalStateException(" Implemented method int getSize(K key, V value) " + 
        " returns different results for the same arguments."); 
     } 
    } 


    /** 
    * All inheritors must implement this method, 
    * which evaluates the weight of a key-value association. 
    * The sum of the weights of all key-value associations 
    * in the cache equals {@link #currentSize}. 
    * The developer must ensure that the 
    * implementation satisfies two conditions: 
    * <br>1) the method always returns non-negative integers; 
    * <br>2) for every two pairs {@code key-value} and {@code key_1-value_1}, 
    * if {@code key.equals(key_1)} and {@code value.equals(value_1)} then 
    * {@code getSize(key, value) == getSize(key_1, value_1)}; 
    * <br>Otherwise the cache can work incorrectly. 
    */ 
    protected abstract int getSize(K key, V value); 


    /** 
    * Helps to detect if the implementation of {@link #getSize(K, V)} method 
    * can return negative values. 
    */ 
    private int getAssociationSize(K key, V value) { 

     int entrySize = getSize(key, value); 

     if (entrySize < 0) { 
      throw new IllegalStateException("int getSize(K key, V value) method implementation is invalid. It returned negative value."); 
     } 

     return entrySize; 
    } 


    /** 
    * Returns the {@code value} corresponding to {@code key} or 
    * {@code null} if {@code key} is not present in the cache. 
    * Increases {@link #keysFound} if it finds a corresponding {@code value}, 
    * or increases {@link #keysMissed} otherwise. 
    */ 
    public synchronized final V get(K key) { 

     if (key == null) { 
      throw new NullPointerException("K key is null"); 
     } 

     V value = entries.get(key); 
     if (value != null) { 
      keysFound++; 
      return value; 
     } 

     keysMissed++; 
     return value; 
    } 


    /** 
    * Removes the {@code key-value} association, if any, with the 
    * given {@code key}; returns the {@code value} with which it 
    * was associated, or {@code null}. 
    */ 
    public synchronized final V remove(K key) { 

     if (key == null) { 
      throw new NullPointerException("K key is null"); 
     } 

     V value = entries.remove(key); 

     // if the value was present in the cache then decrease 
     // the current cache size 

     if (value != null) { 
      currentSize -= getAssociationSize(key, value); 
     } 

     return value; 
    } 


    /** 
    * Adds or replaces a {@code key-value} association. 
    * Returns the old {@code value} if the 
    * {@code key} was present; otherwise returns {@code null}. 
    * If, after inserting a {@code key-value} association, 
    * the cache size becomes greater than the 
    * maximum allowable cache size, the {@link #relaxSize()} method is 
    * called to release the needed space. 
    */ 
    public synchronized final V put(K key, V value) { 

     if (key == null || value == null) { 
      throw new NullPointerException("K key is null or V value is null"); 
     } 

     currentSize += getAssociationSize(key, value);  
     value = entries.put(key, value); 

     // if the key was already present then subtract 
     // the size of the old association that was replaced 

     if (value != null) { 
      currentSize -= getAssociationSize(key, value); 
     } 

     // if cache size with new entry is greater 
     // than maximum allowable cache size 
     // then get some free space 

     if (currentSize > maxSize) { 
      relaxSize(); 
     } 

     return value; 
    } 


    /** 
    * Returns current size of cache. 
    */ 
    public synchronized int currentSize() { 
     return currentSize; 
    } 


    /** 
    * Returns maximum allowable cache size. 
    */ 
    public synchronized int maxSize() { 
     return maxSize; 
    } 


    /** 
    * Returns number of {@code key-value} associations that were deleted 
    * because of cache overflow. 
    */ 
    public synchronized int erasedCount() { 
     return erasedCount; 
    } 


    /** 
    * Number of {@link #get(K)} queries for which appropriate {@code value} was found. 
    */ 
    public synchronized int keysFoundCount() { 
     return keysFound; 
    } 


    /** 
    * Number of {@link #get(K)} queries for which appropriate {@code value} was not found. 
    */ 
    public synchronized int keysMissedCount() { 
     return keysMissed; 
    } 


    /** 
    * Removes all {@code key-value} associations 
    * from the cache, and resets {@link #currentSize}, 
    * {@link #keysFound}, {@link #keysMissed} and {@link #erasedCount} to zero. 
    */ 
    public synchronized void clear() { 
     entries.clear(); 
     currentSize = 0; 
     keysMissed = 0; 
     keysFound = 0; 
     erasedCount = 0; 
    } 


    /** 
    * Returns a copy of {@link #entries} 
    * that has the same content. 
    */ 
    public synchronized LinkedHashMap<K, V> getCopy() { 
     return new LinkedHashMap<K, V> (entries); 
    } 
} 
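
For illustration, a minimal concrete inheritor (hypothetical, not part of the original code) could weigh each association by the length of its string value:

// Hypothetical example of an inheritor: an LRU cache whose weight 
// is the length of the cached string. 
public class StringLruCache<K> extends FixedSizeCache<K, String> { 

    public StringLruCache(int maxSize) { 
        super(maxSize, true); // true -> access order, i.e. LRU eviction 
    } 

    @Override 
    protected int getSize(K key, String value) { 
        // Deterministic and non-negative, as the getSize() contract requires. 
        return value.length(); 
    } 
} 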

This implementation is quite slow (because of the synchronization) if several threads try to call, say, the get() method. Is there a better way?

+1

Could you use [`ConcurrentHashMap`](http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html)? – 2013-04-11 14:52:58

+1

If you haven't measured it, you are just guessing. Can you provide any numbers to show it is too slow? How fast do you need it to be? – 2013-04-11 15:04:37

+1

Are you sure the LHM is the problem? The code you have added around it looks much slower. – 2013-04-11 15:09:03

Answers

5

I don't know whether this is helpful, but if you can replace the LinkedHashMap with a ConcurrentHashMap then you will improve your throughput - a ConcurrentHashMap uses sharding to allow multiple simultaneous readers and writers. It is also thread-safe, so you don't need to synchronize your readers and writers.

Beyond that, replace the use of the synchronized keyword with a ReadWriteLock. This allows multiple simultaneous readers.
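
A minimal sketch of that suggestion (the class name is made up here), assuming the wrapped map is not mutated by reads; as the comment below points out, an access-ordered LinkedHashMap does mutate on get(), so it cannot be used this way:

import java.util.HashMap; 
import java.util.Map; 
import java.util.concurrent.locks.ReadWriteLock; 
import java.util.concurrent.locks.ReentrantReadWriteLock; 

// Hypothetical sketch: read-write locking instead of synchronized. 
public class RwLockedCache<K, V> { 
    private final Map<K, V> map = new HashMap<K, V>(); 
    private final ReadWriteLock lock = new ReentrantReadWriteLock(); 

    public V get(K key) { 
        lock.readLock().lock(); // many readers may hold the read lock at once 
        try { 
            return map.get(key); 
        } finally { 
            lock.readLock().unlock(); 
        } 
    } 

    public V put(K key, V value) { 
        lock.writeLock().lock(); // writers are exclusive 
        try { 
            return map.put(key, value); 
        } finally { 
            lock.writeLock().unlock(); 
        } 
    } 
} 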

+0

The standard R/W lock in Java is quite bad, since it writes to its metadata and exhibits sharing, so the performance is not great. And you can't use an LHM with access order together with an R/W lock anyway, because read operations mutate the map. – bestsss 2013-04-11 15:46:32

4

Try not to reimplement this: Guava Caches. It has almost all the features you need: size-based eviction, concurrency, weighing. If it fits your needs, use it. If not, try implementing your own, but always evaluate first (in my opinion). Just a suggestion.
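
For example, a weighted, size-bounded Guava cache might be configured like this (the weight budget, key/value types, and wrapper class name are illustrative only):

import com.google.common.cache.Cache; 
import com.google.common.cache.CacheBuilder; 
import com.google.common.cache.Weigher; 

public class GuavaCacheExample { 
    public static void main(String[] args) { 
        // The weight budget and weigher are illustrative stand-ins for 
        // maxSize and getSize(key, value) from the question's code. 
        Cache<String, byte[]> cache = CacheBuilder.newBuilder() 
                .maximumWeight(10000000) 
                .weigher(new Weigher<String, byte[]>() { 
                    public int weigh(String key, byte[] value) { 
                        return value.length; 
                    } 
                }) 
                .build(); 

        cache.put("key", new byte[1024]); 
        byte[] hit = cache.getIfPresent("key"); // null on a miss 
        System.out.println(hit.length); 
    } 
} 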

+0

LinkedHashMap is built in, but Guava is not. How is using the LHM reimplementing? – 2013-04-11 15:06:34

+0

He isn't using just a bare LHM, but a wrapper around it; if I had to choose, I would use Guava rather than re-implement a cache. – 2013-04-11 15:15:57

+0

I agree, but the code he has added he would have had to add anyway, because Guava doesn't look at the size of the objects either. – 2013-04-11 15:20:11

1

How fast do you need it to be? Running a test like this on my 2.5 GHz Core i5 laptop

import java.util.Collections; 
import java.util.LinkedHashMap; 
import java.util.Map; 

// Benchmark: a synchronized, access-ordered LinkedHashMap with 
// removeEldestEntry() eviction, hammered with gets and puts. 
public class SynchronizedLhmBenchmark { 
    public static void main(String... args) { 
        Map<Object, Object> map = Collections.synchronizedMap(new LinkedHashMap<Object, Object>(16, 0.7f, true) { 
            @Override 
            protected boolean removeEldestEntry(Map.Entry<Object, Object> eldest) { 
                return size() > 1000; 
            } 
        }); 
        Integer[] values = new Integer[10000]; 
        for (int i = 0; i < values.length; i++) 
            values[i] = i; 

        long start = System.nanoTime(); 
        for (int i = 0; i < 1000; i++) { 
            for (int j = 0; j < values.length; j++) { 
                map.get(values[j]); 
                map.get(values[j / 2]); 
                map.get(values[j / 3]); 
                map.get(values[j / 4]); 
                map.put(values[j], values[j]); 
            } 
        } 
        long time = System.nanoTime() - start; 
        long rate = (5 * values.length * 1000) * 1000000000L / time; 
        System.out.printf("Performed get/put operations at a rate of %,d per second%n", rate); 
    } 
} 

prints

Performed get/put operations at a rate of 27,170,035 per second 

That is many millions of operations per second - how many do you need?

+0

You are not testing contention (which is where it hurts badly), and you have the cheapest possible hashCode and, above all, equals. The lock will be biased (or elided entirely in the test). Dare I say it: the benchmark is trivial. An R/W lock won't be any better, and can't be used with access order anyway. – bestsss 2013-04-11 15:44:08

+0

It doesn't test a lot of things, but if all you do all day is access the same map, you should use only one thread; more threads will just be slower. Only the OP has a real use case. My point is that you have to test your own scenario to get an idea of what the performance will be. – 2013-04-11 15:55:38

0

An LRU scheme in its own right involves exclusive modification of a shared structure, so the contention is a given and there is not much you can do about it.

If you do not need strict LRU and can tolerate some inconsistency in the eviction policy, things become considerably brighter. Your entries (value wrappers) need some usage statistics, and you need an expiration policy based on those usage statistics.

Then you can use an LRU-like structure based on a ConcurrentSkipListMap (i.e. you can think of it as a database index); when the cache is about to overflow, use that index and expire elements according to it. You will need double-checking and so on, but it is doable. Updating the index is not free, but it scales decently. Keep in mind that ConcurrentSkipListMap.size() is an expensive O(n) operation, so you should not rely on it either. The implementation is not hard, but not trivial either, and unless you have enough contention (cores), synchronized(LHM) may be the simplest approach.
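
A rough sketch of that idea might look like the following (every name here is hypothetical, the eviction is best-effort, and races make the recency order only approximate):

import java.util.Map; 
import java.util.concurrent.ConcurrentHashMap; 
import java.util.concurrent.ConcurrentSkipListMap; 
import java.util.concurrent.atomic.AtomicInteger; 
import java.util.concurrent.atomic.AtomicLong; 

// Hypothetical sketch: approximate LRU backed by a usage "index". 
public class ApproxLruCache<K, V> { 
    private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<K, V>(); 
    // Usage index ordered oldest-stamp first; think of it as a database index. 
    private final ConcurrentSkipListMap<Long, K> usage = new ConcurrentSkipListMap<Long, K>(); 
    private final ConcurrentHashMap<K, Long> stamps = new ConcurrentHashMap<K, Long>(); 
    private final AtomicLong clock = new AtomicLong(); 
    // Tracked separately because ConcurrentSkipListMap.size() is O(n). 
    private final AtomicInteger count = new AtomicInteger(); 
    private final int maxEntries; 

    public ApproxLruCache(int maxEntries) { 
        this.maxEntries = maxEntries; 
    } 

    public V get(K key) { 
        V value = data.get(key); 
        if (value != null) { 
            touch(key); // not free, but scales: no global lock is taken 
        } 
        return value; 
    } 

    public void put(K key, V value) { 
        if (data.put(key, value) == null && count.incrementAndGet() > maxEntries) { 
            evictOldest(); 
        } 
        touch(key); 
    } 

    private void touch(K key) { 
        long stamp = clock.incrementAndGet(); 
        Long old = stamps.put(key, stamp); 
        if (old != null) { 
            usage.remove(old); // drop the stale index entry 
        } 
        usage.put(stamp, key); 
    } 

    private void evictOldest() { 
        Map.Entry<Long, K> oldest = usage.pollFirstEntry(); 
        if (oldest == null) { 
            return; 
        } 
        K key = oldest.getValue(); 
        // Double-check: evict only if the polled stamp is still current; 
        // otherwise the entry was touched concurrently and survives. 
        if (stamps.remove(key, oldest.getKey()) && data.remove(key) != null) { 
            count.decrementAndGet(); 
        } 
    } 
} 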

1

As mentioned before, the main cause of the problem is updating the shared data structure in the LRU algorithm. To get around this you can use partitioning, or use another eviction algorithm instead of LRU. Modern algorithms perform better than LRU. See my comparison on that topic on the cache2k benchmarks page.
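
To illustrate the partitioning idea (the class below and its parameters are made up for the example, not taken from cache2k): each segment is an independently synchronized LRU map, so threads only contend when they hit the same segment.

import java.util.Collections; 
import java.util.LinkedHashMap; 
import java.util.Map; 

// Hypothetical sketch: striping the cache across independently locked segments. 
public class SegmentedLruCache<K, V> { 
    private final Map<K, V>[] segments; 

    @SuppressWarnings("unchecked") 
    public SegmentedLruCache(int segmentCount, final int maxPerSegment) { 
        segments = new Map[segmentCount]; 
        for (int i = 0; i < segmentCount; i++) { 
            segments[i] = Collections.synchronizedMap( 
                new LinkedHashMap<K, V>(16, 0.75f, true) { // true -> access order (LRU) 
                    @Override 
                    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) { 
                        return size() > maxPerSegment; 
                    } 
                }); 
        } 
    } 

    private Map<K, V> segmentFor(K key) { 
        // Spread keys over segments; the mask keeps the index non-negative. 
        return segments[(key.hashCode() & 0x7fffffff) % segments.length]; 
    } 

    public V get(K key) { 
        return segmentFor(key).get(key); 
    } 

    public V put(K key, V value) { 
        return segmentFor(key).put(key, value); 
    } 
} 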

The cache2k eviction implementation Clock-Pro has complete read concurrency without locking.