On Friday, March 2, 2012 at 11:22:52 AM UTC-8, akaariai wrote:
On Friday, March 2, 2012 7:24:00 PM UTC+2, Tom Evans wrote:Hi allI have a particular query that requires me to use a RawQuerySet - I
need to left join the table to itself in order to select the highest
'priority' row from the table for each distinct value of a foreign key
on the model, and I need to join to the related table in order to
filter the results.Having generated the queryset, I then want to make a dictionary of {
foreign_key_object : model_object }. So I have this working, but I
cannot use select_related() with a RawQuerySet, and so this runs N+1
queries, where N is the number of distinct foreign keys.Here is some code, which might explain it better:
connection.queries=[]
base_products_qs = Product.objects.raw(
"""
SELECT idp_product.*, idp_productclass.*
FROM idp_product
JOIN idp_productclass
ON idp_product.product_class_id = idp_productclass.id
LEFT JOIN idp_product p2
ON idp_product.product_class_id = p2.product_class_id
AND p2.class_priority < idp_product.class_priority
WHERE p2.id IS NULL and idp_productclass.product_type != 4
""")
base_products = dict([ (p.product_class, p) for p in base_products_qs ])
len(connection.queries) # 7 queries (6 product classes)Is there any simple way around this? I can reduce it to two queries
already, but it seems wrong to select out the info I want, throw it
away, and then fetch it again.Some suggestions:
- If the 2-query version is fast enough, use it. Premature optimization and all that... :)
- You could of course manually iterate through the SQL and instantiate the models by hand.
- Beware of the left join, it seems if you have a lot of rows with different class_priorities for each product_class_id you might be in trouble. The filtering on IS NULL happens after doing the join, so before the filter you could have a lot of intermediate rows. It might be there is no problem depending of the amount of rows and SQL vendor. Hard to know without testing. There are a couple of other ways to do the query, either by subquery doing a select min(class_priority), product_class_id or if you are using PostgreSQL, distinct on would do the job for you, too.
All that being said, I would really like the ability to have select_related available for RawQuerySet. It might be judged to be a feature not needed commonly enough, so it is possible it would not get in even with a good quality patch. However, if it had a nice API, and the implementation reused the code of QuerySet select_related, I think it could have a chance to get in.
- Anssi
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/0c29de0a-7bbb-41f6-b6cd-f06dae3545c1%40googlegroups.com.
No comments:
Post a Comment