Thursday, August 28, 2014

Re: Django ORM generate inefficient SQL

I have created a ticket - https://code.djangoproject.com/ticket/23383
Should I try to proceed my investigations and create a patch?

пятница, 29 августа 2014 г., 11:50:01 UTC+6 пользователь Simon Charette написал:
I think you should create a single issue pointing to the fact that only selected columns (either through values() or only()) should be grouped by, regardless of the backend.

I suggest you include your initial post body in the report.

Thanks!

Le vendredi 29 août 2014 01:03:03 UTC-4, Alex Lebedev a écrit :
Yes, I guess, in this case it should be able to group only by `auth_group`.`id` for each database backend.
Should I create ticket(s) about the issue on MySQL and/or the issue of columns in "group by" statement, or would you like to do it yourself?

четверг, 28 августа 2014 г., 20:36:19 UTC+6 пользователь Simon Charette написал:
The issue on MySQL should be tracked in another ticket I guess -- you're right that it should be able to GROUP only by `auth_group`.`id`.

Le jeudi 28 août 2014 07:28:18 UTC-4, Alex Lebedev a écrit :
Thanks for the answer! Yes, but this problem occurs regardless of database backend (I tested it on PostgreSQL and MySQL)

среда, 27 августа 2014 г., 12:37:00 UTC+6 пользователь Simon Charette написал:
This is already tracked in #19259.

Le jeudi 21 août 2014 06:49:41 UTC-4, Alex Lebedev a écrit :
Hi, guys!

I have encountered a problem. When I use the following code in django shell:

    from django.contrib.auth.models import Group
    from django.db.models import Count
    print Group.objects.annotate(cnt=Count('user')).values('id', 'cnt').query.sql_with_params()
 
Django ORM generate the following SQL query:

    'SELECT `auth_group`.`id`, COUNT(`auth_user_groups`.`user_id`) AS `cnt` FROM `auth_group` LEFT OUTER JOIN `auth_user_groups` ON ( `auth_group`.`id` = `auth_user_groups`.`group_id` ) GROUP BY `auth_group`.`id`, `auth_group`.`name` ORDER BY NULL'

"auth_group.name" occurs in "group by" statement. But this column isn't represented in "select" statement. Such query is inefficient (expecially for large tables with many columns and rows).

Debuging of Django SQLCompiler ( https://github.com/django/django/blob/stable/1.6.x/django/db/models/sql/compiler.py#L568 ) gives me the following information:
- postgresql:
    "self.query.select" == "self.query.group_by" == "[(u'auth_group', u'id'), (u'auth_group', 'name')]"
    "self.connection.features.allows_group_by_pk" is False
    "
len(self.query.get_meta().concrete_fields) == len(self.query.select)" is False
   
'auth_group'.'name' appears in result because of "cols = self.query.group_by + having_group_by + select_cols"
- mysql:
    "self.query.select" == "self.query.group_by" == "[(u'auth_group', u'id'), (u'auth_group', 'name')]"
    "self.connection.features.allows_group_by_pk" is True
    "
len(self.query.get_meta().concrete_fields) == len(self.query.select)" is False
   
'auth_group'.'name' appears in result because of "cols = self.query.group_by + having_group_by + select_cols"

In the same time, the following code (without .values()):

    from django.contrib.auth.models import Group
    from django.db.models import Count
    print Group.objects.annotate(cnt=Count('user')).query.sql_with_params()

gives the right SQL query for mysql (because "len(self.query.get_meta().concrete_fields) == len(self.query.select)" is True):

'SELECT `auth_group`.`id`, `auth_group`.`name`, COUNT(`auth_user_groups`.`user_id`) AS `cnt` FROM `auth_group` LEFT OUTER JOIN `auth_user_groups` ON ( `auth_group`.`id` = `auth_user_groups`.`group_id` ) GROUP BY `auth_group`.`id` ORDER BY NULL'

Is it a bug? Should I create a bug report or something?

Thanks in advance!

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscribe@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/0ffe1264-5170-47b3-b22f-36a27f1ca025%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

No comments:

Post a Comment