Django: queryset.count() is much slower on chained filters than on single filters, regardless of the size of the returned result. Is there a solution?

EDIT: The best solution, thanks to Hakan:

queriedForms.filter(pk__in=list(formtype.form_set.all().filter(formrecordattributevalue__record_value__contains=constraint['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK).values_list('pk', flat=True))).count()

I tried more of his suggestions, but I can't avoid the INNER JOIN. This seems to be a stable solution that gives me a small but predictable speed increase across the board. See his answer for more details!


I have been struggling with a problem that I have not seen addressed anywhere online.

It happens when chaining two filters in Django, for example:

masterQuery = bigmodel.relatedmodel_set.all()
masterQuery = masterQuery.filter(name__contains="test")
masterQuery.count() 
#returns 100,000 results in < 1 second
#test filter--all 100,000+ names have "test x" where x is 0-9 
storedCount = masterQuery.filter(name__contains="9").count()
#returns ~50,000 results but takes 5-6 seconds

Trying it slightly differently:

masterQuery = masterQuery.filter(name__contains="9")
masterQuery.count()
#also returns ~50,000 results in 5-6 seconds

Combining the two querysets with & seems to improve performance a bit, like so:

masterQuery = bigmodel.relatedmodel_set.all()
masterQuery = masterQuery.filter(name__contains="test") 
(masterQuery & masterQuery.filter(name__contains="9")).count()

It seems that count() takes significantly longer as soon as the queryset has more than one filter.


EDIT2: The same pattern, this time filtering across a reverse foreign key:

mainQuery = masterQuery = bigmodel.relatedmodel_set.all()
mainQuery = mainQuery.filter(reverseforeignkeytestmodel__record_value__contains="test", reverseforeignkeytestmodel__record_attribute_type__pk=1)
#Where "record_attribute_type" is another foreign key being used as a filter
mainQuery.count() #produces 100,000 results in < 1sec
mainQuery.filter(reverseforeignkeytestmodel__record_value__contains="9", reverseforeignkeytestmodel__record_attribute_type__pk=5).count()
#produces ~50,000 results in 5-6 secs
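
One likely contributor, going by Django's documented handling of multi-valued relations: each separate filter() call that spans a reverse foreign key gets its own JOIN, so the chained version joins the related table twice. Below is a rough sketch of the two SQL shapes using the standard sqlite3 module. The table names, column names, and data are made up for illustration, and SQLite stands in for MySQL, so it shows the query shape only, not the timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE form (id INTEGER PRIMARY KEY);
    CREATE TABLE value (
        id INTEGER PRIMARY KEY,
        form_id INTEGER REFERENCES form(id),
        record_value TEXT,
        attribute_type_id INTEGER
    );
""")

# Hypothetical data: every form has a type-1 value containing "test";
# even-numbered forms also have a type-5 value containing "9".
for form_id in range(1, 11):
    conn.execute("INSERT INTO form (id) VALUES (?)", (form_id,))
    conn.execute(
        "INSERT INTO value (form_id, record_value, attribute_type_id) VALUES (?, ?, 1)",
        (form_id, f"test {form_id}"),
    )
    if form_id % 2 == 0:
        conn.execute(
            "INSERT INTO value (form_id, record_value, attribute_type_id) VALUES (?, ?, 5)",
            (form_id, "contains 9"),
        )

# Shape of a single filter() across the reverse relation: one JOIN.
single_join_count = conn.execute("""
    SELECT COUNT(*) FROM form
    INNER JOIN value v1 ON v1.form_id = form.id
    WHERE v1.record_value LIKE '%test%' AND v1.attribute_type_id = 1
""").fetchone()[0]

# Shape of two chained filter() calls: the related table is joined twice.
double_join_count = conn.execute("""
    SELECT COUNT(*) FROM form
    INNER JOIN value v1 ON v1.form_id = form.id
    INNER JOIN value v2 ON v2.form_id = form.id
    WHERE v1.record_value LIKE '%test%' AND v1.attribute_type_id = 1
      AND v2.record_value LIKE '%9%' AND v2.attribute_type_id = 5
""").fetchone()[0]

print(single_join_count, double_join_count)
```

Printing str(queryset.query) on the real querysets should show whether Django generates the same double JOIN in this case.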


EDIT4: @Hakan

mainQuery = bigmodel.relatedmodel_set.all()
#Setup the first filter as normal
mainQuery = mainQuery.filter(reverseforeignkeytestmodel__record_value__contains="test", reverseforeignkeytestmodel__record_attribute_type__pk=1)

#Grab a values list for the second chained filter instead of chaining it    
values = bigmodel.relatedmodel_set.all().filter(reverseforeignkeytestmodel__record_value__contains="test", reverseforeignkeytestmodel__record_attribute_type__pk=8).values_list('pk', flat=True)
#filter the first query based on the values_list rather than a second filter
mainQuery = mainQuery.filter(pk__in=values)
mainQuery.count()
#Still takes about the same amount of time on average after enough test runs. It seems slightly faster than average, similar to the (querysetA & querysetB) merge solution I tried.


EDIT5: @Hakan

mainQuery.filter(pk__in=list(formtype.form_set.all().filter(formrecordattributevalue__record_value__contains=constraint['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK).values_list('pk', flat=True))).count()




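From the QuerySet API reference:

Be cautious about using nested queries and understand your database server's performance characteristics (if in doubt, benchmark!). Some database backends, most notably MySQL, don't optimize nested queries very well. It is more efficient, in those situations, to extract a list of values and then pass that into the second query. That is, execute two queries instead of one: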

values = Blog.objects.filter(
    name__contains='Cheddar').values_list('pk', flat=True) 
entries = Entry.objects.filter(blog__in=list(values)) 

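Note the list() call around the Blog QuerySet to force execution of the first query. Without it, a nested query would be executed, because QuerySets are lazy.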
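
The difference between one nested query and two separate queries can also be seen outside Django. Here is a minimal sketch with the standard sqlite3 module; the blog and entry tables and their rows are invented for illustration, and SQLite stands in for MySQL, so it shows only the shape of the statements and that both give the same count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER REFERENCES blog(id));
    INSERT INTO blog (id, name) VALUES
        (1, 'Cheddar Talk'), (2, 'Beatles Blog'), (3, 'More Cheddar');
    INSERT INTO entry (id, blog_id) VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# One statement with a nested query: roughly what passing a lazy,
# unevaluated QuerySet to __in turns into.
nested_count = conn.execute(
    "SELECT COUNT(*) FROM entry WHERE blog_id IN "
    "(SELECT id FROM blog WHERE name LIKE '%Cheddar%')"
).fetchone()[0]

# Two statements: fetch the ids first (the list() step), then pass them
# to the second query as a literal IN list.
blog_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM blog WHERE name LIKE '%Cheddar%'")
]
placeholders = ", ".join("?" for _ in blog_ids)
two_step_count = conn.execute(
    f"SELECT COUNT(*) FROM entry WHERE blog_id IN ({placeholders})", blog_ids
).fetchone()[0]

print(nested_count, two_step_count)
```

The two-step form trades one extra round trip, plus an IN list that grows with the number of ids, for a simpler second query.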

Applied to your example, that would look something like this:

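masterQuery = bigmodel.relatedmodel_set.all()
# list() forces the first query to run now, so the second query gets a plain list of ids
pks = list(masterQuery.filter(name__contains="test").values_list('pk', flat=True))
# second query: filter on the fetched primary keys and count in the database
count = masterQuery.filter(pk__in=pks, name__contains="9").count()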

Alternatively, if MySQL is the slow part, you could apply the second filter in Python:

# flat=True yields plain strings; without it each row is a 1-tuple
names = masterQuery.filter(name__contains='test').values_list('name', flat=True)
count = sum('9' in n for n in names)
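
A note on counting in Python: values_list('name') without flat=True returns 1-tuples, and the in operator on a tuple tests element equality rather than substring containment, so flat=True is needed for the count to be right. A plain-Python sketch with made-up rows shows the difference:

```python
# Hypothetical rows, shaped like values_list('name') output without flat=True.
rows_as_tuples = [("test 1",), ("test 9",), ("test 19",)]
# The same rows, shaped like values_list('name', flat=True) output.
rows_flat = [name for (name,) in rows_as_tuples]

# Tuple membership: is the string "9" equal to an element of the tuple?
tuple_count = sum("9" in row for row in rows_as_tuples)
# Substring test: does the name contain the character "9"?
flat_count = sum("9" in name for name in rows_flat)

print(tuple_count, flat_count)
```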

EDIT: Judging by your updated example, the SQL JOIN is probably the slow part. Querying the related model directly avoids it:

# query only RelatedModel, avoid JOIN
related_pks = RelatedModel.objects.filter(
     record_value__contains=constraint['TVAL'],
     record_attribute_type=rtypePK,
).values_list('pk', flat=True)

# list(queryset) will do a database query, resulting in a list of integers.
pks_list = list(related_pks)

# use that result to filter your main model. 
count = MainModel.objects.filter(
     formrecordattributevalue__in=pks_list
).count()

This assumes a relation from MainModel to RelatedModel.


Source: https://habr.com/ru/post/1680979/

