Tuesday, July 25, 2017

Re: ModelForm validation of foreign keys - extra database queries and performance bottleneck


I've already tried the select_related in my queryset. No change at all.  

>>Also keep in mind that Django (or any other framework) has no idea whether or not fields have "changed" in a form submission without pulling the original set of values to compare against, so expect the object to be pulled at least once on every request. 

The first query already retrieves the primary key of the original object, or the complete object if select_related is added. Using select_related should keep things to a single query, and then we would be good: it is the number of database queries executed that drastically reduces performance.


I believe I'll need a custom version of the ModelChoiceField class.
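
To make the idea concrete, here is a rough, framework-free sketch of the short-circuit I have in mind. It is not Django's API: CountingTable and ShortCircuitChoiceField are hypothetical names invented for this illustration, and the table is just a dict that counts lookups the way a query log would. The assumption is that the first query (with select_related) has already handed us the original key and related object.

```python
class CountingTable:
    """Hypothetical stand-in for a related table; counts lookups like a query log."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.queries = 0

    def get(self, pk):
        # Every call here models one "select ... where pk = ..." round trip.
        self.queries += 1
        try:
            return self._rows[pk]
        except KeyError:
            raise LookupError(f"no row with pk={pk!r}")


class ShortCircuitChoiceField:
    """Hypothetical field that skips the lookup when the submitted key is unchanged."""

    def __init__(self, table, initial_pk=None, initial_obj=None):
        self.table = table
        self.initial_pk = initial_pk    # key already known from the first query
        self.initial_obj = initial_obj  # related object already loaded, if any

    def to_python(self, value):
        if value in (None, ""):
            return None
        # Form values arrive as strings, so compare string representations.
        unchanged = (
            self.initial_obj is not None and str(value) == str(self.initial_pk)
        )
        if unchanged:
            # Reuse what the first query returned; no extra lookup.
            return self.initial_obj
        return self.table.get(value)


table = CountingTable({"1": "Customer A", "2": "Customer B"})
field = ShortCircuitChoiceField(table, initial_pk="1", initial_obj="Customer A")

unchanged = field.to_python("1")
queries_after_unchanged = table.queries

changed = field.to_python("2")
queries_after_changed = table.queries
```

The trade-off is that an unchanged value is no longer re-checked against the database, so a related row deleted between the first query and the save would not be caught at this step.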


On Tuesday, July 25, 2017 at 10:48:41 UTC+2, James Schneider wrote:


On Jul 24, 2017 4:09 AM, "johan de taeye" <johan.d...@gmail.com> wrote:

I have a model that has a foreign key relation to a number of other objects.
When saving an instance of this model from the admin (or a ModelForm), I see plenty of extra and redundant database calls. 
For a single record it wouldn't make much of a difference, but when using the same ModelForm for a batch upload, these calls become the bottleneck in the process.

Has anyone bumped into the same performance bottleneck?
Has anyone developed some solution for this?

By logging all database queries and some digging in the code, here's my analysis of what is happening:
  1. Open the admin editing screen for a single record.  
    I leave all fields at their original value, except for one field (not one of the foreign key fields).
  2. When saving the record, the first query reads the existing record:
          select field1, field2, field3, .... from mytable;
  3. During the form/model validation, I get an extra database query for each of the foreign key fields.
    It is generated from the to_python method of django.forms.models.ModelChoiceField:
          select field_a, field_b, field_c, field, ... from related_table where pk = 'id_from_first_query';
  4. During the form/model validation, I get another database query for each of the foreign key fields.
    It verifies that the value actually exists in the database:
         select (1) from related_table where pk = 'value from form';     
The queries in steps 3 and 4 are redundant if the field hasn't changed. The first query gives enough data to allow us to verify that the new form value and the foreign key field on the existing instance are equal. I am using Django 1.11.

The same queries, except the one in step 2, are executed when I create a new record. The queries in step 4 are redundant then, since step 3 has just retrieved the values from the database.

Looking forward to any insights and hints...

You should look at modifying the queryset that your view uses to pull the main object so that it includes select_related() calls. I don't know if you're using function-based views or class-based views, so I can't comment further on implementation.


The extra calls are likely occurring when the related fields are being accessed during validation, etc. 

Also keep in mind that Django (or any other framework) has no idea whether or not fields have "changed" in a form submission without pulling the original set of values to compare against, so expect the object to be pulled at least once on every request. 

-James 

