Wednesday, January 2, 2013

Re: Bulk delete - performance / object collection

Hi George,

This is an area I've personally spent quite a lot of time on; see:

Using the ORM for bulk operations isn't always the right thing to do, and it depends entirely on what you consider "bulk" and what you consider acceptable performance.

You might want to look at the source code for this to see how they handle bulk operations (they implemented the same bulk_update approach mentioned in the above threads).

Although bypassing the ORM might feel wrong at first, sometimes it is completely acceptable - you just need to make sure you don't abuse/misuse it unnecessarily.
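
As a rough illustration of what bypassing the ORM can look like, here is a minimal sketch of a chunked raw-SQL delete. It is an assumption-laden example rather than code from the threads above: it uses Python's built-in sqlite3 module as a stand-in so that it runs on its own, and the table name `log_entry`, the column `created`, and the helper `delete_in_batches` are all hypothetical. In a Django project you would take the cursor from `django.db.connection.cursor()` instead, and Django's raw SQL uses `%s` placeholders rather than `?`. The `rowid` subquery is SQLite-specific; other databases need a different way to limit a DELETE.

```python
import sqlite3


def delete_in_batches(conn, table, where_sql, params=(), batch_size=10000):
    """Delete rows matching where_sql in chunks; return the total removed.

    `table` and `where_sql` are interpolated into the statement, so they
    must come from trusted code, never from user input. Values go in
    `params` and are bound by the driver.
    """
    # SQLite-specific: pick a limited set of rowids, then delete only those.
    sql = (
        "DELETE FROM {t} WHERE rowid IN "
        "(SELECT rowid FROM {t} WHERE {w} LIMIT ?)"
    ).format(t=table, w=where_sql)

    total = 0
    while True:
        cur = conn.execute(sql, tuple(params) + (batch_size,))
        conn.commit()  # commit per chunk so no single transaction grows huge
        if cur.rowcount == 0:
            break  # nothing left that matches
        total += cur.rowcount
    return total


# Demonstration against an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_entry (id INTEGER PRIMARY KEY, created INTEGER)")
conn.executemany(
    "INSERT INTO log_entry (created) VALUES (?)",
    [(i,) for i in range(25000)],
)
conn.commit()

deleted = delete_in_batches(conn, "log_entry", "created < ?", (20000,), batch_size=7000)
remaining = conn.execute("SELECT COUNT(*) FROM log_entry").fetchone()[0]
print(deleted, remaining)
```

No model instances are built here, so nothing is cascaded to related tables and no delete signals fire. That is fine for a table with no child rows, but it is exactly the kind of thing to double-check before using it elsewhere.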

Cal

On Wed, Jan 2, 2013 at 11:29 AM, George Lund <glund@mintel.com> wrote:
I'm trying to bulk-delete several million rows from my database.

The docs for Django 1.3 say "this will, whenever possible, be executed purely in SQL". A pure-SQL delete is what I want in this case, so that's fine.

However, the code is never getting as far as running any SQL.

Interrupting the script shows that the delete method on the QuerySet is trying to use a "Collector" to construct model instances for each row I'm trying to delete. This is going to take too long (and may in fact consume all the memory available) -- I don't think it's practical to wait in this case. (I've tried waiting over half an hour!)

(I'm looking at django.db.models.query.QuerySet.delete and django.db.models.deletion.Collector.collect / Collector.add.)

What's the point in doing the delete "purely in SQL" if all of the objects are getting constructed anyway? Why do they need to be "collected" before the SQL DELETE is run? The model instance in this case has no child rows, to which the delete might need to be cascaded.

Meanwhile I can construct the SQL by hand easily enough, but I feel this isn't doing things the right way.

thanks for any help

George

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/django-users/-/W4LqKzcnlaYJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to django-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.

