2017-05-12 88 views

Django prefetch_related on a large dataset

I'm currently struggling with a problem related to Django's prefetch_related. As an example, assume these models:

from django.db import models 

class Client(models.Model): 
    name = models.CharField(max_length=255) 

class Purchase(models.Model): 
    client = models.ForeignKey('Client', on_delete=models.CASCADE)

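Let's imagine we have only a few clients, say 200, but they buy a lot, so we have several million purchases.

If I want to build a page that shows all clients and the number of purchases for each client, I would write something like this: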

from django.db.models import Prefetch 
from .models import Purchase, Client 

purchases = Purchase.objects.all() 
clients = Client.objects.prefetch_related(Prefetch('purchase_set', queryset=purchases))

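The problem here is that I end up querying the huge purchases table, and that query may take more than a minute or, worse, raise a MemoryError on the server.

So I tried using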

purchases = Purchase.objects.all()[:9] 

to select only one batch from the database, but as one might expect, Django does not like that much and raises this exception:

Traceback (most recent call last):
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 149, in get_response
    response = self.process_exception_by_middleware(e, request)
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 147, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/generic/base.py", line 68, in view
    return self.dispatch(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 67, in _wrapper
    return bound_func(*args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/decorators/cache.py", line 57, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 63, in bound_func
    return func.__get__(self, type(self))(*args2, **kwargs2)
****************** login decorators, views, ...
  File "project/***.py", line ***, in ***
    for client in clients:
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 258, in __iter__
    self._fetch_all()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1076, in _fetch_all
    self._prefetch_related_objects()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects
    prefetch_related_objects(self._result_cache, self._prefetch_related_lookups)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects
    obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level
    prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level)))
  File "project/venv/lib/python3.6/site-packages/django/db/models/fields/related_descriptors.py", line 539, in get_prefetch_queryset
    queryset = queryset.filter(**query)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 790, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 802, in _filter_or_exclude
    "Cannot filter a query once a slice has been taken."
AssertionError: Cannot filter a query once a slice has been taken.
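
The traceback shows get_prefetch_queryset calling queryset.filter(**query) on the queryset handed to Prefetch, which is why an already sliced queryset is rejected. As a rough illustration of the two-query pattern that prefetch_related follows (one query for the parents, one filtered query for the related rows, then a join in Python), here is a plain sqlite3 sketch. It is not Django's internal code; the table names, column names and sample rows are assumptions made up for the example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical stand-in tables for the Client and Purchase models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client(id)
    );
""")
conn.executemany("INSERT INTO client (id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO purchase (client_id) VALUES (?)",
                 [(1,)] * 4 + [(2,)] * 2)

# Query 1: the parent rows.
clients = conn.execute("SELECT id, name FROM client ORDER BY id").fetchall()
client_ids = [cid for cid, _ in clients]

# Query 2: every related row for those parents, selected with an IN filter.
# This filter is added after the caller built the queryset, so a LIMIT
# applied beforehand could not be honoured per client.
placeholders = ", ".join("?" for _ in client_ids)
rows = conn.execute(
    f"SELECT id, client_id FROM purchase WHERE client_id IN ({placeholders})",
    client_ids,
).fetchall()

# The join happens in Python, so all fetched purchase rows sit in memory.
purchases_by_client = defaultdict(list)
for purchase_id, client_id in rows:
    purchases_by_client[client_id].append(purchase_id)

counts = {name: len(purchases_by_client[cid]) for cid, name in clients}
print(len(rows), counts)
```

With millions of purchases, the second query is the expensive one: every purchase row is transferred and held in memory only to be counted.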

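So for now I have no real solution. I'm studying how the __iter__ function at django/db/models/query.py:258 is built, to try to write a function with the same behavior that takes a limited set in the prefetch, so that it can be paginated and then run in a more parallel way.

Is there any "good way" to do this kind of query?

Answers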


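"Let's imagine we have only a few clients, say 200, but they buy a lot, so we have several million purchases.

If I want to build a page that shows all clients and the number of purchases for each client, ..."

I'm going to take your question as asking for exactly this feature. Have you tried: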

from django.db.models import Count 
clients = Client.objects.annotate(num_purchases=Count('purchase')) 
clients[0].num_purchases 

If you want to sort and get the clients with the most purchases, you can also do:

clients = Client.objects.annotate(num_purchases=Count('purchase')).order_by('-num_purchases')[:5] 
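
Both querysets above push the counting into the database, so only one row per client comes back instead of millions of purchase rows. As a sketch of the kind of SQL this corresponds to (a LEFT OUTER JOIN with GROUP BY, plus ORDER BY and LIMIT for the top-N case), here is a plain sqlite3 version. The table names, the helper purchase_counts and the sample data are hypothetical; the exact SQL Django generates can be inspected with print(clients.query).

```python
import sqlite3

# Hypothetical stand-in tables for the Client and Purchase models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client(id)
    );
""")
conn.executemany("INSERT INTO client (id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO purchase (client_id) VALUES (?)",
                 [(1,)] * 4 + [(2,)] * 2)


def purchase_counts(conn, top=None):
    """Return (name, num_purchases) rows, optionally only the top N."""
    # LEFT OUTER JOIN keeps clients without purchases (their count is 0).
    sql = """
        SELECT client.name, COUNT(purchase.id) AS num_purchases
        FROM client
        LEFT OUTER JOIN purchase ON purchase.client_id = client.id
        GROUP BY client.id, client.name
        ORDER BY num_purchases DESC, client.id
    """
    params = ()
    if top is not None:
        sql += " LIMIT ?"
        params = (top,)
    return conn.execute(sql, params).fetchall()


all_counts = purchase_counts(conn)
top_two = purchase_counts(conn, top=2)
print(all_counts)
print(top_two)
```

The database returns three small rows here regardless of how many purchases exist, which is the property that makes the annotate approach scale.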

For more features, see https://docs.djangoproject.com/en/1.11/topics/db/aggregation/


Thank you very much, this is exactly what I was looking for. Sorry, I hadn't read the manual ^^"