
How do you cache a paginated Django queryset, specifically in a ListView?

I noticed a query was taking a very long time to run, so I'm trying to cache it. The queryset is huge (over 100k records), so I'm trying to cache only a paginated sub-section of it. I can't cache the entire view or template because parts of them are user/session specific and need to change constantly.

ListView has a couple of standard methods for retrieving the queryset: get_queryset(), which returns the non-paginated data, and paginate_queryset(), which filters it down to the current page.

I first tried caching the queryset from get_queryset(), but quickly realized that calling cache.set(my_query_key, super(MyView, self).get_queryset()) caused the entire queryset to be serialized.

So I tried overriding paginate_queryset() like:

import time 
from functools import partial 
from django.core.cache import cache 
from django.views.generic import ListView 

class MyView(ListView): 

    ... 

    def paginate_queryset(self, queryset, page_size):
        cache_key = 'myview-queryset-%s-%s' % (self.page, page_size)
        print 'paginate_queryset.cache_key:', cache_key
        t0 = time.time()
        ret = cache.get(cache_key)
        if ret is None:
            print 're-caching'
            ret = super(MyView, self).paginate_queryset(queryset, page_size)
            cache.set(cache_key, ret, 60*60)
        td = time.time() - t0
        print 'paginate_queryset.time.seconds:', td
        (paginator, page, object_list, other_pages) = ret
        print 'total objects:', len(object_list)
        return ret

However, this takes nearly a minute to run even though only 10 objects are being retrieved, and every request prints "re-caching", implying nothing is actually being saved to the cache.

My settings.CACHES looks like:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

service memcached status shows memcached is running, and tail -f /var/log/memcached.log shows absolutely nothing.

What am I doing wrong? What is the correct way to cache a paginated query so that the entire queryset isn't retrieved?

EDIT: I think there may be a bug in memcached or the Python wrapper. Django appears to support two different memcached backends, one using python-memcached and one using pylibmc, and python-memcached seems to silently swallow the error when caching the paginate_queryset() value. When I switched to the pylibmc backend, I now get an explicit error message, "error memcached_set: SERVER ERROR 10", tracing back to line 78 in django/core/cache/backends/memcached.py.
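For reference, switching to the pylibmc backend is just a settings change; a minimal sketch, assuming pylibmc is installed and memcached is still listening on the same address:

CACHES = {
    'default': {
        # Django's pylibmc-based memcached backend instead of the python-memcached one.
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': '127.0.0.1:11211',
    }
}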


Does the low-level cache API work in the django shell? See https://docs.djangoproject.com/en/1.6/topics/cache/#basic-usage – arocks


Yes, it seems to work for simple immutable objects and even basic querysets. It only fails when caching the value returned by paginate_queryset(). – Cerin
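For anyone reproducing that check, a minimal sanity test of the low-level API from python manage.py shell looks roughly like this (the key and value are arbitrary):

from django.core.cache import cache

cache.set('sanity-check', {'answer': 42}, 30)   # store a small picklable value for 30 seconds
print(cache.get('sanity-check'))                # should print {'answer': 42} if memcached is reachable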

Answers


The problem turned out to be a combination of factors. Mainly, the result returned by paginate_queryset() contains a reference to the unbounded queryset, which means it is essentially uncacheable. When I called cache.set(mykey, (paginator, page, object_list, other_pages)), it tried to serialize thousands of records instead of just the page_size records I wanted, causing the cached item to exceed memcached's size limit and fail.
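A quick way to see this, roughly sketched with a hypothetical model named MyModel: pickling anything that holds the paginator drags in paginator.object_list, i.e. the full unsliced queryset, while pickling only the evaluated page slice stays small.

import pickle
from django.core.paginator import Paginator
from myapp.models import MyModel  # hypothetical model with ~100k rows

paginator = Paginator(MyModel.objects.all(), 10)
page = paginator.page(1)

# Pickling the paginator evaluates its object_list -- the whole table (slow; that is the point).
print(len(pickle.dumps(paginator)))
# Pickling only the evaluated page slice stays small (10 objects).
print(len(pickle.dumps(list(page.object_list))))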

The other factor was the terrible default error reporting in memcached/python-memcached, which silently hides all errors and turns cache.set() into a no-op when anything goes wrong, making the problem very time-consuming to track down.

I solved this by essentially rewriting paginate_queryset() to ditch Django's built-in paginator functionality entirely and compute the queryset slice myself:

object_list = queryset[page_size*(page-1):page_size*(page-1)+page_size] 

and then caching that object_list.
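The answer doesn't show the full method, but a rough sketch of that rewrite, assuming the page number arrives as ?page=N in the querystring and using a hypothetical cache key, could look like this (the dummy range() passed to Paginator exists only so count/num_pages come out right):

from django.core.cache import cache
from django.core.paginator import Paginator
from django.views.generic import ListView


class MyView(ListView):
    paginate_by = 10

    def paginate_queryset(self, queryset, page_size):
        # Assumption: the page number comes from the querystring.
        try:
            page = int(self.request.GET.get('page', 1))
        except (TypeError, ValueError):
            page = 1

        cache_key = 'myview-objects-%s-%s' % (page, page_size)  # hypothetical key
        cached = cache.get(cache_key)
        if cached is None:
            # Slice the queryset ourselves so only page_size rows are fetched,
            # force evaluation to a plain list, and cache the total count as well.
            object_list = list(queryset[page_size * (page - 1):page_size * page])
            cached = (object_list, queryset.count())
            cache.set(cache_key, cached, 60 * 60)
        object_list, total = cached

        # Rebuild paginator/page objects around the cached slice so the template
        # context still receives the usual (paginator, page, object_list, is_paginated).
        paginator = Paginator(range(total), page_size)  # dummy sequence, only its length matters
        page_obj = paginator.page(page)
        page_obj.object_list = object_list
        return paginator, page_obj, object_list, page_obj.has_other_pages()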


Alternatively, you can extend Paginator to support caching by providing it with a cache_key.

A blog post about the usage and implementation of such a CachedPaginator can be found here. The source code is posted on djangosnippets.org (here is a web-archive link, since the original no longer works).

However, I will post a slightly modified example of the original version, one that caches not only the objects on every page but also the total count (sometimes even the count can be an expensive operation).

from django.core.cache import cache 
from django.utils.functional import cached_property 
from django.core.paginator import Paginator, Page, PageNotAnInteger 


class CachedPaginator(Paginator):
    """A paginator that caches the results on a page by page basis."""

    def __init__(self, object_list, per_page, orphans=0, allow_empty_first_page=True, cache_key=None, cache_timeout=300):
        super(CachedPaginator, self).__init__(object_list, per_page, orphans, allow_empty_first_page)
        self.cache_key = cache_key
        self.cache_timeout = cache_timeout

    @cached_property
    def count(self):
        """
        The original django.core.paginator.count attribute in Django 1.8
        is not writable and can't be set manually, but we would like
        to override it when loading data from the cache (instead of recalculating it).
        So we make it writable via @cached_property.
        """
        return super(CachedPaginator, self).count

    def set_count(self, count):
        """
        Override the paginator.count value (to prevent recalculation)
        and clear num_pages and page_range, whose values depend on it.
        """
        self.count = count
        # If we have already stored .num_pages or .page_range (which are cached properties),
        # they can lead to wrong page calculations (because they depend on paginator.count),
        # so we clear their values to force recalculation on the next call.
        try:
            del self.num_pages
        except AttributeError:
            pass
        try:
            del self.page_range
        except AttributeError:
            pass

    @cached_property
    def num_pages(self):
        """This is not writable in Django 1.8. We want to make it writable."""
        return super(CachedPaginator, self).num_pages

    @cached_property
    def page_range(self):
        """This is not writable in Django 1.8. We want to make it writable."""
        return super(CachedPaginator, self).page_range

    def page(self, number):
        """
        Returns a Page object for the given 1-based page number.

        This will attempt to pull the results out of the cache first, based on
        the requested page number. If not found in the cache,
        it will pull a fresh list and then cache that result + the total result count.
        """
        if self.cache_key is None:
            return super(CachedPaginator, self).page(number)

        # In order to avoid counting the queryset,
        # we only validate that the provided number is an integer.
        # The rest of the validation will happen when we fetch fresh data,
        # so if the number is invalid, nothing will be cached.
        # number = self.validate_number(number)
        try:
            number = int(number)
        except (TypeError, ValueError):
            raise PageNotAnInteger('That page number is not an integer')

        page_cache_key = "%s:%s:%s" % (self.cache_key, self.per_page, number)
        page_data = cache.get(page_cache_key)

        if page_data is None:
            page = super(CachedPaginator, self).page(number)
            # Cache not only the objects, but the total count too.
            page_data = (page.object_list, self.count)
            cache.set(page_cache_key, page_data, self.cache_timeout)
        else:
            cached_object_list, cached_total_count = page_data
            self.set_count(cached_total_count)
            page = Page(cached_object_list, number, self)

        return page
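To wire this into a ListView, one option is the standard paginator_class/get_paginator() hooks; a hedged usage sketch (the model name, module paths, and cache key below are placeholders):

from django.views.generic import ListView

from myapp.models import Article                # hypothetical model
from myapp.pagination import CachedPaginator    # wherever the class above lives


class ArticleListView(ListView):
    model = Article
    paginate_by = 20
    paginator_class = CachedPaginator

    def get_paginator(self, queryset, per_page, orphans=0, allow_empty_first_page=True, **kwargs):
        # Pass the CachedPaginator-specific arguments through the standard hook.
        return self.paginator_class(
            queryset, per_page,
            orphans=orphans,
            allow_empty_first_page=allow_empty_first_page,
            cache_key='article-list',   # placeholder; vary it per user/filter if the queryset does
            cache_timeout=600,
        )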