Update multiple objects in Django at once?

I am using Django 1.9. I have a Django table that represents the value of a particular measure, by organisation and by month, with raw values and percentiles:

    class MeasureValue(models.Model):
        org = models.ForeignKey(Org, null=True, blank=True)
        month = models.DateField()
        calc_value = models.FloatField(null=True, blank=True)
        percentile = models.FloatField(null=True, blank=True)

There are typically about 10,000 rows per month. My question is whether I can speed up the process of setting values on the models.

Currently, I calculate percentiles by retrieving all the measure values for a month with a Django filter query, converting them to a pandas DataFrame, and then using scipy's rankdata to set ranks and percentiles. I do this because pandas and rankdata are efficient, able to ignore null values, and able to handle repeated values in the way that I want, so I'm happy with this method:

    records = MeasureValue.objects.filter(month=month).values()
    df = pd.DataFrame.from_records(records)
    # use calc_value to set percentile on each row, using scipy rankdata

However, I then need to retrieve each percentile value from the dataframe and set it back on the model instances. Right now I do this by iterating over the dataframe's rows and updating each instance:

    for i, row in df.iterrows():
        mv = MeasureValue.objects.get(org=row.org, month=month)
        if (row.percentile is None) or np.isnan(row.percentile):
            row.percentile = None
        mv.percentile = row.percentile
        mv.save()

Not surprisingly, this is rather slow. Is there an efficient Django way to speed it up, by making a single database write rather than tens of thousands? I have checked the documentation, but can't see one.

2 answers

Atomic transactions can reduce the time spent in a loop:

    from django.db import transaction

    with transaction.atomic():
        for i, row in df.iterrows():
            mv = MeasureValue.objects.get(org=row.org, month=month)
            if (row.percentile is None) or np.isnan(row.percentile):
                # if it's already None, why set it to None?
                row.percentile = None
            mv.percentile = row.percentile
            mv.save()

Django's default behavior is to run in autocommit mode. Each query is immediately committed to the database, unless a transaction is active.

By using with transaction.atomic(), all the writes are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed statements, so the time per statement is greatly reduced.
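To illustrate the mechanism outside Django, here is a minimal sketch using the standard library's sqlite3 module instead of the ORM. The table and the function names are hypothetical, invented for this example. The first function commits after every UPDATE, like autocommit mode; the second wraps the same loop in one explicit transaction and rolls back on error, which is roughly what transaction.atomic() arranges for you.

```python
import sqlite3


def make_db(n_rows):
    """Create an in-memory table with n_rows rows (hypothetical schema)."""
    # isolation_level=None puts the sqlite3 module in autocommit mode,
    # so every statement is committed on its own unless we issue BEGIN.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE measurevalue (org_id INTEGER PRIMARY KEY, percentile REAL)"
    )
    conn.executemany(
        "INSERT INTO measurevalue (org_id, percentile) VALUES (?, NULL)",
        ((i,) for i in range(n_rows)),
    )
    return conn


def update_autocommit(conn, percentiles):
    """One implicit commit per UPDATE (what the loop in the question does)."""
    for org_id, value in percentiles.items():
        conn.execute(
            "UPDATE measurevalue SET percentile = ? WHERE org_id = ?",
            (value, org_id),
        )


def update_in_one_transaction(conn, percentiles):
    """Same UPDATEs, but a single commit at the end; rollback on any error."""
    conn.execute("BEGIN")
    try:
        for org_id, value in percentiles.items():
            conn.execute(
                "UPDATE measurevalue SET percentile = ? WHERE org_id = ?",
                (value, org_id),
            )
    except Exception:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")
```

Note that the number of UPDATE statements is unchanged; only the number of commits drops to one. The saving matters most on a disk-backed database, where each commit is expensive.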


Starting with Django 2.2, you can use the bulk_update() method to efficiently update the given fields on the provided model instances, generally with one query:

    >>> objs = [
    ...     Entry.objects.create(headline='Entry 1'),
    ...     Entry.objects.create(headline='Entry 2'),
    ... ]
    >>> objs[0].headline = 'This is entry 1'
    >>> objs[1].headline = 'This is entry 2'
    >>> Entry.objects.bulk_update(objs, ['headline'])
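Applied to the question's case, this might look like the sketch below. The helper names (clean_percentile, update_percentiles) and the percentile_by_org mapping are hypothetical; the only Django calls used are filter() and bulk_update(objs, fields, batch_size=...), and the manager is passed in as an argument so the functions themselves need no Django import. It assumes at most one row per organisation per month and a Django version of 2.2 or later.

```python
import math


def clean_percentile(value):
    """Return None for missing values (None or NaN) so they are stored as NULL."""
    if value is None or math.isnan(value):
        return None
    return value


def update_percentiles(manager, month, percentile_by_org, batch_size=1000):
    """Set percentile on every row of `month` that has an entry in the mapping.

    manager           -- a model manager such as MeasureValue.objects
    percentile_by_org -- dict mapping org_id to the computed percentile
    Returns the number of instances handed to bulk_update().
    """
    # One SELECT for the whole month instead of one get() per row.
    instances = list(manager.filter(month=month))

    to_update = []
    for mv in instances:
        # Rows without an entry in the mapping are left untouched.
        if mv.org_id in percentile_by_org:
            mv.percentile = clean_percentile(percentile_by_org[mv.org_id])
            to_update.append(mv)

    # With batch_size set, Django issues one UPDATE per batch rather than
    # one per instance.
    manager.bulk_update(to_update, ["percentile"], batch_size=batch_size)
    return len(to_update)
```

A call could then be update_percentiles(MeasureValue.objects, month, dict(zip(df["org_id"], df["percentile"]))), assuming the dataframe carries the foreign key in an org_id column, which is how values() names it.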

Source: https://habr.com/ru/post/1247541/

