I am using Django 1.9. I have a Django model that represents the value of a specific measure, by month, with the original value and a percentile:
    class MeasureValue(models.Model):
        org = models.ForeignKey(Org, null=True, blank=True)
        month = models.DateField()
        calc_value = models.FloatField(null=True, blank=True)
        percentile = models.FloatField(null=True, blank=True)
There are typically around 10,000 records a month. My question is whether I can speed up the process of setting the values on the models.
I currently calculate the percentiles by getting all the measure values for a month with a Django filter query, converting them to a pandas dataframe, and then using scipy's rankdata to set ranks and percentiles. I do this because pandas and rankdata are efficient, able to ignore null values, and able to handle duplicate values in the way I want, so I'm happy with this method:
    import pandas as pd

    records = MeasureValue.objects.filter(month=month).values()
    df = pd.DataFrame.from_records(records)
    # use calc_value to set percentile on each row, using scipy rankdata
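For reference, the ranking step is roughly along these lines (a simplified sketch: the `add_percentiles` helper name, the 0-100 scaling, and the choice of 'average' tie-handling are just illustrative):

    from scipy.stats import rankdata

    def add_percentiles(df):
        # Illustrative helper: rank only the non-null calc_values, so rows
        # with a null calc_value end up with a null (NaN) percentile.
        mask = df.calc_value.notnull()
        if mask.any():
            values = df.loc[mask, 'calc_value']
            # 'average' tie-handling gives duplicate values the same mean rank
            ranks = rankdata(values, method='average')
            df.loc[mask, 'percentile'] = ranks / float(len(values)) * 100
        return df

    df = add_percentiles(df)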
However, I then need to get each percentile value out of the dataframe and set it back on the corresponding model instance. At the moment I do this by iterating over the dataframe rows and updating each instance in turn:
    import numpy as np

    for i, row in df.iterrows():
        mv = MeasureValue.objects.get(org=row.org, month=month)
        if (row.percentile is None) or np.isnan(row.percentile):
            row.percentile = None
        mv.percentile = row.percentile
        mv.save()
Unsurprisingly, this is rather slow. Is there an efficient Django way to speed it up, by making a single database write rather than tens of thousands? I have checked the documentation, but can't see one.