Tuesday, June 21, 2011

Re: QuerySet .annotate().extra() issues

BIG FAT DISCLAIMER
I'm not sure this has anything to do with your issue, but it's easy to miss when reading the docs.


https://docs.djangoproject.com/en/dev/topics/db/aggregation/#interaction-with-default-ordering-or-order-by

If your model has an 'ordering' attribute defined in Meta, make sure you add a "blank" .order_by()  to your query.


Sincerely,
André Terra


On Tue, Jun 21, 2011 at 10:58 AM, Michał Sawicz <michal@sawicz.net> wrote:
Hi all,

I'm not sure it's actually supposed to work, but looks like a common
enough approach that it should.

Say you have an aggregate query:
> Model.objects.values('field').annotate(Sum('count'))
that results in an aggregate query that follows:
> SELECT "models_model"."field", SUM("models_model"."count") AS
> "count__sum" FROM "models_model" GROUP BY "models_model"."field"

And that's all fine, but if you want to get additional fields:
> Model.objects.values('field').annotate(Sum('count')).values('field',
> 'field2', 'count_sum')
the additional fields get added to the GROUP BY clause:
> SELECT "models_model"."field", "models_model"."field2",
> SUM("models_model"."count") AS "count__sum" FROM "models_model" GROUP
> BY "models_model"."field", "models_model"."field",
> "models_model"."field2"

That doesn't, in itself, pose any problems. "field" is added twice, but
that's probably optimized down within SQL itself. I'm only afraid that
grouping by "field" and "field2" is slower than grouping by "field" is.
Maybe my approach here is wrong? What would you do to group by "field",
annotate with SUM("count") and retrieve "field2", too?

Now, if you want to leverage some more advanced queries:
> Model.objects.values('field').annotate(Sum('count')).values('field',
> 'field2', 'count_sum').extra({'value': 'SUM("models_model"."field3" *
> "models_model"."count")'})
that breaks with:
> DatabaseError: aggregate functions are not allowed in the GROUP BY
> clause
because the resulting query looks like this:
> SELECT (SUM("models_model"."field3" * "models_model"."count")) AS
> "value", "models_model"."field", "models_model"."field2",
> SUM("models_model"."count") AS "count__sum" FROM "models_model" GROUP
> BY "models_model"."field", "models_model"."field",
> "models_model"."field2", (SUM("models_model"."field3" *
> "models_model"."count"))
As you can see, the field definitions from extra() gets added to the
GROUP BY clause, and fails.
If I drop the last added GROUP BY element (or even all of them, bar
"field"), the query is working fine, and the results are as expected.

Is this an eccentric enough usecase that I should forget about that? Or
is my approach simply wrong?

Best regards,
--
Michał (Saviq) Sawicz <michal@sawicz.net>

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to django-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.

No comments:

Post a Comment