Saturday, February 13, 2016

Re: De Morgan's Law and QuerySet semantics

Why shouldn't test_obj be in that query result? test_obj has a m2m with int_value 10 and a m2m (although another one) with char_value bar

The assertions are just saying what the current behavior of the ORM is. Under the hood, it's because Q(m2ms__int_value=10) & Q(m2ms__char_value='bar') is equivalent to Q(m2ms__int_value=10, m2ms__char_value='bar') and a single join to the m2ms table (not including the through table) is created:
SELECT "testapp_testobj"."id"
FROM "testapp_testobj"
INNER JOIN "testapp_testobj_m2ms"
      ON ("testapp_testobj"."id" = "testapp_testobj_m2ms"."testobj_id")
INNER JOIN "testapp_m2mmodel"
      ON ("testapp_testobj_m2ms"."m2mmodel_id" = "testapp_m2mmodel"."id")
WHERE ("testapp_m2mmodel"."int_value" = 10
       AND "testapp_m2mmodel"."char_value" = bar)

The conditions in the WHERE clause must apply to the same row in the join, so it only matches when the same row in the m2m table matches both conditions simultaneously.

I think that's a reasonable way to make the ORM behave, espeically since it's been pretty long standing behavior. You certainly could give Q(foo=1, bar=2) a different meaning than Q(foo=1) & Q(bar=2), though at this point that would be a big break with pre-existing behavior. Given that behavior, it seems counterintuitive to not obey logical invariants like De Morgan's law.

i'll mention that Q object behavior is one of the trickiest bits of the ORM. Some of the relevant reading might be:

I've definitely seen that in trying to build a library to mimmick its behavior! Thanks for the links. I'll read through those pull requests and file a trac ticket if they don't clarify my question. If the determination from Anssi is that altering the current behavior is desirable/feasibl, I'll start working on a patch. Thanks very much, Tim!

- Lucas

On Saturday, February 13, 2016 at 6:53:11 PM UTC-8, Vijay Khemlani wrote:
assert test_obj not in TestObj.objects.filter(Q(m2ms__int_value=10) & Q(m2ms__char_value='bar'))

Why shouldn't test_obj be in that query result? test_obj has a m2m with int_value 10 and a m2m (although another one) with char_value bar

On Sat, Feb 13, 2016 at 7:45 PM, Lucas Wiman <lucas...@gmail.com> wrote:
I'm working on the django-predicate library, which defines in-memory evaluation semantics for Q objects. The eventual goal is to precisely match the behavior of the ORM on the subset of supported lookup types. Yesterday, I noticed a bug in that library, where there was a mismatch with the behavior of the ORM.

De Morgan's law is a sort of distributive property for Boolean algebras, stating that (A ∧ B) ⇔ ¬(¬A ∨ ¬B). It has an equivalent statement for sets and logical predicates. However, it seems Django's Q object does not obey De Morgan's law, which was somewhat surprising.

I wanted to verify that this intended behavior. If so, does anyone know where this is documented or Trac tickets that discuss how & and ~ interact when turned into SQL queries? 

If it is not intended behavior, I'll file a Trac issue about it and we can continue discussion there.

Setup of example models.py

class Base(models.Model):
    class Meta:
        abstract = True
    char_value = models.CharField(max_length=100, default='')
    int_value = models.IntegerField(default=0)
    date_value = models.DateField(default=datetime.date.today)
    datetime_value = models.DateTimeField(default=datetime.datetime.now)


class TestObj(Base):
    m2ms = models.ManyToManyField(
        'testapp.M2MModel', related_name='test_objs')

class M2MModel(Base):
    pass

Example
test_obj = TestObj.objects.create()
test_obj.m2ms.create(int_value=10, char_value='foo')
test_obj.m2ms.create(int_value=20, char_value='bar')


assert test_obj not in TestObj.objects.filter(Q(m2ms__int_value=10) & Q(m2ms__char_value='bar'))
assert test_obj in TestObj.objects.filter(~(~Q(m2ms__int_value=10) | ~Q(m2ms__char_value='bar')))

If De Morgan's law were obeyed, the two queries would evaluate to the same result.  The generated SQL is recorded in my pull request adding a test cases reproducing the failure in django-predicate: https://github.com/ptone/django-predicate/blob/30d23330e71ecc6a8f01743a43a56b406fe764ee/tests/testapp/tests.py#L215-L242

The key issue here seems to be crossing an m2m relation, which causes the negated disjuncts to get evaluated as subqueries.

Relevant Trac Issues

https://code.djangoproject.com/ticket/21956 Same issue, but when the fields occur on the same table that's being queried. This issue was apparently fixed in Django 1.5.
https://code.djangoproject.com/ticket/13099

Thanks,
Lucas

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/d2f97578-020d-4b7c-9d03-8e3079a18f2f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscribe@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/0054d834-cc55-4e01-80a6-622bcfc1e23b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

No comments:

Post a Comment