Thursday, February 26, 2015

Re: Can the new `Prefetch` solve my problem?

Heh, I just realized that aRkadeFR had replied with a similar idea to use a property. At least I know I'm not too far off on my thinking. :-D

-James

On Thu, Feb 26, 2015 at 11:02 AM, James Schneider <jrschneider83@gmail.com> wrote:
Whoops, accidentally sent that last one too early, here's the continuation:

However, that probably doesn't buy you much since you are still doing an extra query for every Desk you pull from your original query.

Funny enough, I was googling around for an answer here, and stumbled across this:


which I think is what you were referring to initially in your OP. I wasn't even aware of its existence. Prefetch() is a helper class for prefetch_related(). Taking a quick glance through the source code, I would imagine that it probably won't help you much, since the functionality of that class only controls the action of prefetch_related().


Zooming out a bit, the crux of your problem is this: An attribute you wish to populate is not an FK or M2M field, it is an entirely separate Queryset with some moderately complex filters. The built-in ORM functionality for pre-loading via prefetch/select_related() is expecting a FK or M2M relationship and can't use another queryset AFAIK. The high-level functionality of prefetch_related() is probably close to what you want, which is to run a single second query to collect all of the favorite_or_nearby_chairs for all of the Desks in your original query, and then glue everything together behind the scenes in Python to make the desk_obj.favorite_or_nearby_chairs available seamlessly.
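To make the "second query, then glue in Python" idea concrete, here is a minimal sketch of just the gluing step. It is not Django's actual implementation: the dicts stand in for model instances, and chair_rows stands in for whatever a single second query would return. The names attach_chairs and chair_rows are made up for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for the results of two queries (plain dicts, not Django models).
desks = [{"id": 1}, {"id": 2}, {"id": 3}]
# One second query is assumed to return (desk_id, chair_id) pairs that already passed the filter.
chair_rows = [(1, 10), (1, 11), (2, 12)]


def attach_chairs(desks, chair_rows, attr="favorite_or_nearby_chairs"):
    """Group the chair rows by desk id and attach each group to its desk."""
    by_desk = defaultdict(list)
    for desk_id, chair_id in chair_rows:
        by_desk[desk_id].append(chair_id)
    for desk in desks:
        # Desks without any matching chair get an empty list.
        desk[attr] = by_desk.get(desk["id"], [])
    return desks


attach_chairs(desks, chair_rows)
```

However many desks there are, the grouping needs only one pass over the chair rows and one pass over the desks, so the total stays at two queries.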

I would then wonder if there is another way to organize this data to make it easier to work with? How about adding a 'favorite_chairs' field to the Desk model that has an M2M to Chair? I would also update your 'nearby_desks' model field to use 'nearby_chairs' as the related_name.

Then you could do something like the following:

desks = Desk.objects.filter(<filter here>).prefetch_related('nearby_chairs', 'favorite_chairs')

Then, you can modify your model with a property that will return the concatenation of nearby_chairs and favorite_chairs:

class Desk(models.Model):
    @property
    def favorite_or_nearby_chairs(self):
        # list() is needed because two querysets can't be joined with +
        return list(self.nearby_chairs.all()) + list(self.favorite_chairs.all())

I don't believe this will spawn another query, since prefetch_related() will have already run and have the results cached. (select_related() only follows FK and one-to-one relationships, so it can't be used for M2M fields.) You may also want to consider moving 'nearby_desks' out of Chair and renaming it to 'nearby_chairs' in Desk, and using a related_name of 'nearby_desks' instead. Either way, keep the plain .all() calls in the property, since those read from the prefetch cache; adding a .filter() there would spawn a new query. Obviously you'll need to create other processes that will populate desk.favorite_chairs, which may or may not be feasible.

TL;DR: I don't believe you can pre-fetch anything because of the extra SQL logic needed to calculate the favorite_or_nearby_chairs attribute. It might be possible via raw SQL though. Reformatting your data models may lead to an easier time since you can then take advantage of some of the optimizations Django offers.
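For what the raw SQL route might look like, here is a rough sketch using the standard-library sqlite3 module instead of Django, so that it runs on its own. The table and column names (desk, chair, chair_nearby_desks, active) are assumptions made for the example: they are not the names Django would generate for your models, and "active" merely stands in for some_other_lookup.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Hypothetical schema: chair.desk_id plays the FK, chair_nearby_desks the M2M table.
conn.executescript("""
    CREATE TABLE desk (id INTEGER PRIMARY KEY);
    CREATE TABLE chair (id INTEGER PRIMARY KEY, desk_id INTEGER, active INTEGER);
    CREATE TABLE chair_nearby_desks (chair_id INTEGER, desk_id INTEGER);
    INSERT INTO desk (id) VALUES (1), (2);
    INSERT INTO chair (id, desk_id, active) VALUES (10, 1, 1), (11, 1, 0), (12, 2, 1);
    INSERT INTO chair_nearby_desks (chair_id, desk_id) VALUES (12, 1), (10, 1);
""")


def chairs_by_desk(conn, desk_ids):
    """Fetch the chairs attached to or near each desk in one query, grouped by desk id."""
    marks = ",".join("?" * len(desk_ids))
    sql = f"""
        SELECT c.desk_id, c.id
        FROM chair c
        WHERE c.active = 1 AND c.desk_id IN ({marks})
        UNION
        SELECT n.desk_id, c.id
        FROM chair_nearby_desks n
        JOIN chair c ON c.id = n.chair_id
        WHERE c.active = 1 AND n.desk_id IN ({marks})
        ORDER BY 1, 2
    """
    grouped = defaultdict(list)
    # The id list is bound twice, once for each half of the UNION.
    for desk_id, chair_id in conn.execute(sql, list(desk_ids) * 2):
        grouped[desk_id].append(chair_id)
    return dict(grouped)


result = chairs_by_desk(conn, [1, 2])
```

UNION (rather than UNION ALL) drops the duplicate row for a chair that is both attached to a desk and listed as nearby it, which mirrors the OR in the original filter.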

I'm slightly out in right field on this one, so YMMV, but taking a hard look at the current model design would be where I would start to try and eliminate the need for that custom queryset.

Again, the django-debug-toolbar is your friend in these cases, but obviously a high number of even relatively fast queries can have a detrimental effect on your load times. Also ensure that the fields you are using to filter are indexed, if appropriate/available.

-James




On Thu, Feb 26, 2015 at 10:17 AM, James Schneider <jrschneider83@gmail.com> wrote:
Yep, looks like I misunderstood.

So, you want to have something like this pseudo code:

desks = Desk.objects.filter(<some filter>)
for desk in desks:
    print(desk.favorite_or_nearby_chairs)

And have favorite_or_nearby_chairs be pre-populated with your Chair queryset mentioned earlier?

Due to the nature of the custom Chair queryset, I doubt you can do any sort of pre-fetching that would reduce the number of queries. You can probably simulate the effect of prefetch_related() though by overriding the __init__() method of your Desk model, and having the Desk model populate favorite_or_nearby_chairs whenever a Desk object is created. 

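class Desk(models.Model):
    def __init__(self, *args, **kwargs):
        # Django passes field values to __init__, so accept and forward them
        super(Desk, self).__init__(*args, **kwargs)
        # using list() to force the queryset to be evaluated
        self.favorite_or_nearby_chairs = list(<custom Chair queryset>)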


However, that probably doesn't buy you much since you are still doing an extra query for every Desk you pull from your original query.
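To put a number on the "extra query for every Desk" cost, here is a small self-contained sketch that counts queries for both access patterns. It uses sqlite3 in place of the ORM, and the CountingDB helper and the two-table schema are invented purely for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE desk (id INTEGER PRIMARY KEY);
    CREATE TABLE chair (id INTEGER PRIMARY KEY, desk_id INTEGER);
    INSERT INTO desk (id) VALUES (1), (2), (3);
    INSERT INTO chair (id, desk_id) VALUES (10, 1), (11, 1), (12, 2);
""")


class CountingDB:
    """Hypothetical helper that counts how many queries are sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


# Pattern 1: one query for the desks, then one more query per desk.
per_desk = CountingDB(conn)
chairs_a = {}
for (desk_id,) in per_desk.query("SELECT id FROM desk ORDER BY id"):
    rows = per_desk.query(
        "SELECT id FROM chair WHERE desk_id = ? ORDER BY id", (desk_id,)
    )
    chairs_a[desk_id] = [chair_id for (chair_id,) in rows]

# Pattern 2: one query for the desks, one for all chairs, grouped in Python.
batched = CountingDB(conn)
chairs_b = {desk_id: [] for (desk_id,) in batched.query("SELECT id FROM desk ORDER BY id")}
for desk_id, chair_id in batched.query("SELECT desk_id, id FROM chair ORDER BY id"):
    chairs_b[desk_id].append(chair_id)
```

Both patterns build the same mapping, but the first issues one query per desk on top of the initial one, while the second always issues two.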

Funny enough, I was googling around for an answer here, and stumbled across this:




On Thu, Feb 26, 2015 at 4:19 AM, aRkadeFR <contact@arkade.info> wrote:
got it, so you want to prefetch but not all chairs.

I will def follow this thread to see the possibilities of Prefetch :)


On 02/26/2015 12:52 PM, Ram Rachum wrote:
There may be a large number of chairs, and I don't want all the chairs prefetched. I want to have the database filter them according to the queryset I specified in a single call; I don't want to filter them in Python or make a new call to filter them.

Thanks,
Ram.

On Thu, Feb 26, 2015 at 1:48 PM, aRkadeFR <contact@arkade.info> wrote:
I may not have completely understood your problem, but
why not prefetch all the chairs, and then have the (new)
attribute favorite_or_nearby_chairs load only the favorite
or nearby ones?

like:
@property
def favorite_or_nearby_chairs(self):
    ans = []
    # 'chair' is the related_name set on Chair.desk in the original models
    for chair in self.chair.all():
        # filter...
        ans += ...
    return ans

It will only hit the DB once thanks to the first join of desk
<-> chair.


On 02/26/2015 11:28 AM, cool-RR wrote:
James, you misunderstood me.

There isn't supposed to be a `favorite_or_nearby_chairs` attribute. That's the new attribute I want the prefetching to add to the `Desk` queryset that I need. Also, I don't understand why you'd tell me to add a `.select_related('nearby_desks')` to my query. Are you talking about the query that starts with `Chair.objects`? I'm not looking to get a `Chair` queryset. I'm looking to get a `Desk` queryset, which has a prefetched attribute `favorite_or_nearby_chairs` which contains the `Chair` queryset I wrote down.


Thanks,
Ram.

On Thursday, February 26, 2015 at 6:02:15 AM UTC+2, James Schneider wrote:

Well, the Desk model you provided is blank, but I'll believe you that there's a favorite_or_nearby_chairs attribute. ;-)

Should be relatively simple. Just add a .select_related('nearby_desks') to your existing query and that should pull in the associated Desk object in a single query. You can also substitute in prefetch_related(), although you'll still have two queries at that point.

If you are trying to profile your site, I would recommend the Django-debug-toolbar. That should tell you whether or not that query set is the culprit.

-James

On Feb 25, 2015 1:28 PM, "Ram Rachum" <r...@rachum.com> wrote:
Hi James,

I've read the docs but I still couldn't figure it out. My queryset works great in production, I'm trying to optimize it because our pageloads are too slow. I know how to use querysets in Django pretty well, I just don't know how to use `Prefetch`. 

Can you give me the solution for the simplified example I gave? This might help me figure out what I'm not understanding. One thing that might be unclear with the example I gave, is that I meant I want to get a queryset for `Desk` where every desk has an attribute named `favorite_or_nearby_chairs` which contains the queryset of chairs that I described, prefetched.


Thanks,
Ram.

On Wed, Feb 25, 2015 at 11:18 PM, James Schneider <jrschn...@gmail.com> wrote:
I assume that you are talking about the select_related() and prefetch_related() queryset methods?

https://docs.djangoproject.com/en/1.7/ref/models/querysets/#select-related

Both of those sections have excellent examples, and detail what the differences are (primarily joins vs. separate queries, respectively).

For better help, you'll need to go into more detail about the queries you are trying to make, what you've tried (with code examples if possible), and the results/errors you are seeing.

In general, I would try to get an initial queryset working and gathering the correct results first before looking at optimizations such as select_related(). Any sort of pre-fetching will only confuse the situation if the base queryset is incorrect.

-James

On Wed, Feb 25, 2015 at 12:05 PM, cool-RR <ram.r...@gmail.com> wrote:
Hi guys,

I'm trying to solve a problem using the new `Prefetch` but I can't figure out how to use it. 

I have these models:

    class Desk(django.db.models.Model):
        pass
    
    class Chair(django.db.models.Model):
        desk = django.db.models.ForeignKey('Desk', related_name='chair',)
        nearby_desks = django.db.models.ManyToManyField(
            'Desk',
            blank=True,
        )

I want to get a queryset for `Desk`, but it should also include a prefetched attribute `favorite_or_nearby_chairs`, whose value should be equal to: 

    Chair.objects.filter(
        (django.db.models.Q(nearby_desks=desk) | django.db.models.Q(desk=desk)),
        some_other_lookup=whatever,
    )

Is this possible with `Prefetch`? I couldn't figure out how to use the arguments.


Thanks,
Ram.
--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/46d9fdb7-c008-4496-acda-ac7cb30b4a89%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
