I have two related models with 10 million rows each, and I want to run an efficient paginated query that fetches a page of 50,000 rows from one of them and accesses the related data on the other:
class RnaPrecomputed(models.Model):
    id = models.CharField(max_length=22, primary_key=True)
    rna = models.ForeignKey('Rna', db_column='upi', to_field='upi', related_name='precomputed')
    description = models.CharField(max_length=250)

class Rna(models.Model):
    id = models.IntegerField(db_column='id')
    upi = models.CharField(max_length=13, db_index=True, primary_key=True)
    timestamp = models.DateField()
    userstamp = models.CharField(max_length=30)
As you can see, RnaPrecomputed is connected to Rna with a foreign key. Now I want to fetch a specific page of 50,000 RnaPrecomputed objects together with their related Rna objects. I expect to run into the N + 1 queries problem if I do this without calling select_related(). Here are the timings:
First, as a baseline, I do not touch the related model at all:
rna_paginator = paginator.Paginator(RnaPrecomputed.objects.all(), 50000)
message = ""
for object in rna_paginator.page(400).object_list:
    message = message + str(object.id)
This takes:
real 0m12.614s
user 0m1.073s
sys 0m0.188s
Now I will try to access data on the associated model:
rna_paginator = paginator.Paginator(RnaPrecomputed.objects.all(), 50000)
message = ""
for object in rna_paginator.page(400).object_list:
    message = message + str(object.rna.upi)
This takes:
real 2m27.655s
user 1m20.194s
sys 0m4.315s
That is a lot slower, so I probably do have an N + 1 queries problem.
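To make clear what I mean by N + 1 queries, here is a minimal sketch that uses sqlite3 from the standard library instead of Django. The tiny tables, the `run` and `count_queries` helpers, and the function names are all made up for illustration. It only shows the difference in query count between lazy foreign key access and a single JOIN; it says nothing about the timings above.

```python
import sqlite3

# Hypothetical miniature versions of the two tables (100 rows each).
# Table names, columns, and data are assumptions, not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rna (upi TEXT PRIMARY KEY, userstamp TEXT);
    CREATE TABLE rna_precomputed (
        id TEXT PRIMARY KEY,
        upi TEXT REFERENCES rna(upi),
        description TEXT
    );
""")
upis = ["URS%010d" % i for i in range(100)]
conn.executemany("INSERT INTO rna VALUES (?, 'demo')", [(u,) for u in upis])
conn.executemany(
    "INSERT INTO rna_precomputed VALUES (?, ?, 'demo')",
    [(u + "_1", u) for u in upis],
)
conn.commit()

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def page_n_plus_1(limit, offset):
    """One query for the page, then one more query per row (lazy FK access)."""
    page = run(
        "SELECT id, upi FROM rna_precomputed ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    )
    return [run("SELECT upi FROM rna WHERE upi = ?", (upi,))[0][0] for _id, upi in page]


def page_joined(limit, offset):
    """A single JOIN query, roughly the shape select_related() asks for."""
    page = run(
        "SELECT p.id, r.upi FROM rna_precomputed p "
        "JOIN rna r ON r.upi = p.upi ORDER BY p.id LIMIT ? OFFSET ?",
        (limit, offset),
    )
    return [upi for _id, upi in page]


def count_queries(fn, *args):
    """Run fn and report how many queries it issued."""
    global query_count
    query_count = 0
    result = fn(*args)
    return result, query_count


lazy, lazy_queries = count_queries(page_n_plus_1, 20, 40)
joined, joined_queries = count_queries(page_joined, 20, 40)
print(lazy_queries, joined_queries)  # prints: 21 1
```

Both functions return the same 20 values, but the lazy version issues 21 queries (1 for the page plus 1 per row) where the JOIN version issues 1.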
Now, with select_related(), the same loop:
rna_paginator = paginator.Paginator(RnaPrecomputed.objects.all().select_related('rna'), 50000)
message = ""
for object in rna_paginator.page(400).object_list:
    message = message + str(object.rna.upi)
This takes:
real 7m9.720s
user 0m1.948s
sys 0m0.337s
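So select_related() made things about 3 times slower, not faster. How can I avoid the N + 1 queries when I load a page of RnaPrecomputed objects through the Django ORM and access the related Rna?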
Why is select_related() slower here?