Saturday, November 2, 2013

Re: Complex query reduction

On Fri, Nov 1, 2013, Javier Guerra Giraldez <javier@guerrag.com> wrote:

>have you tried eliminating the second IN relationship? something like
>
>entities = entity.get_descendants()
>
>items = BibliographicRecord.objects.filter
>(authored__researcher__person__member_of__entity__in=entities).distinct()

Indeed I have, but in that form it takes around 1770ms, compared to around 1540ms in the original form. What I actually do is:

# breaking apart the queries allows the use of values_lists
entities = self.entity.get_descendants(
include_self=True
).values_list('id', flat=True)

# and the set() here is about 230ms faster than putting a distinct() on
# the first query
researchers = set(Researcher.objects.filter(
person__entities__in=entities
).values_list('person', flat=True))

self.items = BibliographicRecord.objects.listable_objects().filter(
authored__researcher__in=researchers,
).distinct()

I think that's partly because this way the SELECT doesn't have to grab all the fields of publications_bibliographicrecord.

But, the real killer is the combination of ordering (in the queryset or on the model, it doesn't matter) with the distinct() - as soon as one is removed from the equation, the execution time drops to around 250ms.

That's for 55000 BibliographicRecords created by that last operation (before distinct() is applied; distinct() reduces them to 28000).

That seems excessive to me.

BibliographicRecord has a custom primary key, and its id fields look like "d9ce7e2f-663e-4fc6-8448-b214c6915aed:web-of-science". Could that be implicated in performance?

Daniele

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscribe@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/20131102145042.868061817%40smtp.modern-world.net.
For more options, visit https://groups.google.com/groups/opt_out.

No comments:

Post a Comment