Count vs len on a Django QuerySet

In Django, given that I have a QuerySet that I'm going to iterate over and print the results of, what is the best option for counting the objects: len(qs) or qs.count()?

(Also assume that counting the objects in the same iteration is not an option.)

+64
performance python django
Jan 14 '13 at 21:29
3 answers

Although the Django docs recommend using count rather than len:

Note: Don't use len() on QuerySets if all you want to do is determine the number of records in the set. It's much more efficient to handle a count at the database level, using SQL's SELECT COUNT(*), and Django provides a count() method for precisely this reason.

Since you are iterating over this QuerySet anyway, the result will be cached (unless you are using iterator), so it is preferable to use len: this avoids hitting the database again, and also avoids the possibility of retrieving a different number of results (!).
If you are using iterator, then I would suggest keeping a counting variable as you iterate (rather than using count), for the same reasons.
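
A minimal sketch of the counting-variable idea, in plain Python. The generator `stream_rows` is a hypothetical stand-in for `qs.iterator()` (it yields each item once and caches nothing); it is not Django code, and the per-row work is a placeholder.

```python
def stream_rows():
    """Hypothetical stand-in for qs.iterator(): yields rows once, no caching."""
    for name in ("alice", "bob", "carol"):
        yield name


def process_and_count(rows):
    """Handle each row and count the rows in the same pass."""
    count = 0  # stays 0 if the stream is empty
    handled = []
    for count, row in enumerate(rows, start=1):
        handled.append(row.upper())  # placeholder for the real per-row work
    return count, handled


total, handled = process_and_count(stream_rows())
print(total, handled)
```

Because the stream can only be consumed once, the count has to be collected during that single pass; calling len() afterwards is not possible on a generator, and a separate count query could see different data.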

+96
Jan 14 '13 at 21:32

Choosing between len() and count() depends on the situation, and it is worth understanding how each of them works in order to use them correctly.

Let me walk you through a few scenarios:

  1. (most important) When you only want to know the number of elements and do not plan to process them in any way, it is crucial to use count():

    DO: queryset.count() - this will execute a single SELECT COUNT(*) FROM some_table query; all the computation is done on the DBMS side, and Python just needs to retrieve the resulting number at a fixed cost of O(1)

    DON'T: len(queryset) - this will execute a SELECT * FROM some_table query, fetching the whole table in O(N) and requiring an additional O(N) of memory to store it. This is the worst thing you can do.

  2. When you intend to fetch the queryset anyway, it is slightly better to use len(), which will not cause an extra database query the way count() would:

     len(queryset)  # fetching all the data - NO extra cost - data would be fetched anyway in the for loop

     for obj in queryset:  # data is already fetched by len() - using cache
         pass

    Count:

     queryset.count()  # this will perform an extra db query - len() did not

     for obj in queryset:  # fetching data
         pass
  3. The second case reversed (when the queryset has already been fetched):

     for obj in queryset:  # iteration fetches the data
         len(queryset)  # using already cached data - O(1) no extra cost
         queryset.count()  # using cache - O(1) no extra db query

     len(queryset)  # the same O(1)
     queryset.count()  # the same: no query, O(1)
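
To make the first scenario concrete, here is a rough sketch using Python's built-in sqlite3 module rather than Django itself. The table name `some_table` comes from the text above; the columns and the 1000 sample rows are assumptions made for the demo. It only approximates the two kinds of SQL that count() and len() lead to on an unevaluated queryset.

```python
import sqlite3

# In-memory database with an illustrative table (the schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO some_table (name) VALUES (?)",
    [("row %d" % i,) for i in range(1000)],
)

# Roughly what queryset.count() asks for: the database counts,
# and a single number is sent back.
(count_in_db,) = conn.execute("SELECT COUNT(*) FROM some_table").fetchone()

# Roughly what len(queryset) implies on an unevaluated queryset:
# every row is transferred and held in memory before being counted.
all_rows = conn.execute("SELECT * FROM some_table").fetchall()
count_in_python = len(all_rows)

print(count_in_db, count_in_python)
conn.close()
```

Both approaches give the same number, but the second one has materialised all 1000 rows in Python just to count them.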

Everything will be clear as soon as you look "under the hood":

    class QuerySet(object):

        def __init__(self, model=None, query=None, using=None, hints=None):
            # (...)
            self._result_cache = None

        def __len__(self):
            self._fetch_all()
            return len(self._result_cache)

        def _fetch_all(self):
            if self._result_cache is None:
                self._result_cache = list(self.iterator())
            if self._prefetch_related_lookups and not self._prefetch_done:
                self._prefetch_related_objects()

        def count(self):
            if self._result_cache is not None:
                return len(self._result_cache)

            return self.query.get_count(using=self.db)
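
The caching logic above can be tried out without a database using a toy model. `FakeQuerySet` and its `queries` counter are hypothetical names invented for this sketch; the class only mimics the `_result_cache` behaviour shown above and is not Django's real implementation.

```python
class FakeQuerySet:
    """Toy model of the caching logic; not Django's real QuerySet."""

    def __init__(self, rows):
        self._rows = list(rows)    # pretend these rows live in the database
        self._result_cache = None
        self.queries = 0           # hypothetical counter of simulated queries

    def _fetch_all(self):
        if self._result_cache is None:
            self.queries += 1      # simulated SELECT * FROM some_table
            self._result_cache = list(self._rows)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)
        self.queries += 1          # simulated SELECT COUNT(*)
        return len(self._rows)


# len() first, then iterate: the loop reuses the cache.
qs = FakeQuerySet(["a", "b", "c"])
len(qs)
for obj in qs:
    pass
len_then_loop = qs.queries

# count() first, then iterate: the loop still has to fetch the rows.
qs = FakeQuerySet(["a", "b", "c"])
qs.count()
for obj in qs:
    pass
count_then_loop = qs.queries

# Iterate first: len() and count() are both answered from the cache.
qs = FakeQuerySet(["a", "b", "c"])
for obj in qs:
    pass
qs.count()
len(qs)
loop_then_both = qs.queries

print(len_then_loop, count_then_loop, loop_then_both)  # 1 2 1
```

The only ordering that costs a second simulated query is count() before the loop, which matches the second scenario in the list.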

Good links in Django docs:

+25
Oct 07 '17 at 11:02

I think using len(qs) makes more sense here, since you need to iterate over the results. qs.count() is the better option if all you want to do is print the count and not iterate over the results.

len(qs) will hit the database with select * from table, whereas qs.count() will hit the database with select count(*) from table.

Also, qs.count() returns an integer, and you cannot iterate over it.

+23
Jan 14 '13 at 21:46


