-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iF4EAREIAAYFAlLz6akACgkQtuvagsE+DE4AbQEA352XvZrUnr/YJn74nbw+l8zN
OGNZQDcdeEnx6aa899QA/RYqEw987zRzNKmQhcRWb+ZhbqZL+1x0RuSlQYO8SQBp
=OmwE
-----END PGP SIGNATURE-----
On Wed, 5 Feb 2014 10:11:29 -0800 (PST) ST <sej854@googlemail.com>
wrote:
> Hi,
>
> I'm trying to optimize the run-time of getting total credit and debt
> values out of our database. Ideally I'd like to formulate it as a
> Django query. This is the raw SQL query I have, which produces the
> right answer and is very fast (milliseconds):
>
> SELECT sum(tg.total) FROM
> (
> SELECT sum(t.amount) AS total, t.member_id AS member_id
> FROM club_transaction AS t
> WHERE t.member_id IS NOT NULL
> GROUP BY t.member_id
> ) AS tg
> WHERE tg.total < 0
>
> (plus a second query for > 0)
>
> My Django implementation was:
>
> m =
> Member.objects.annotate(balance=Sum('transaction__amount')) m_debt =
> m.filter(balance__lt=0).aggregate(total=Sum('balance')) m_credit =
> m.filter(balance__gt=0).aggregate(total=Sum('balance'))
>
> which looks a lot nicer, is easier to understand and maintain.
>
> However, it results in the following SQL query (slightly redacted):
>
> SELECT SUM(balance) FROM
> (
> SELECT "club_member"."id" AS "id", {all the other fields},
> SUM("club_transaction"."amount") AS "balance"
> FROM "club_member"
> LEFT OUTER JOIN "auth_user" ON ("club_member"."user_id" =
> "auth_user"."id")
> LEFT OUTER JOIN "club_transaction" ON ("club_member"."id" =
> "club_transaction"."member_id")
> GROUP BY "club_member"."id", {all the other fields},
> "auth_user"."last_name", "auth_user"."first_name"
> HAVING SUM("club_transaction"."amount") < 0
> ORDER BY "auth_user"."last_name" ASC, "auth_user"."first_name" ASC
> ) subquery
>
> (again, plus another one for > 0)
> which is very slow (almost 1.5 seconds).
>
> How can I construct a Django query which doesn't request (and group
> by) all the unnecessary other fields ?
> I already tried playing around with only() and values() but never got
> it to work.
I did something similar a few years back. Its next to impossible with
django-orm, just do it in raw sql.
The django-orm can't really do any advanced group_by clauses. And you
can't give them with extra(). Well, you can give them but they won't be
used;-)
Doing two (or three) orm-queries and then joining the data in python
will actually be slower then doing it all in hand-crafted sql.
So just do the sql by hand. And then optimize by having three columns,
one with the SUM(amount) if amount>0, one with the SUM(amount) if
amount<0 and one with the SUM(amount). Total credits, total depts and
balance all in one query (if possible)...
- Arnold
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment