Saturday, May 30, 2020

Good usage of native JSON functions and operators in ORM queries

Hi,

I have a model MyModel which has a JSONField() called 'snapshot'. In Python terms, each snapshot looks like this:

======================
snapshot = {
    'pay_definition' : {
        '1234': {..., 'name': 'foo', ...},
        '99': {..., 'name': 'bar', ...},
}
======================

I'd like to find all unique values of 'name' in all instances of MyModel. I have this working using native JSON functions like this:

   class PayDefs(Func):  
        function='to_jsonb'  
        template="%(function)s(row_to_json(jsonb_each((%(expressions)s->'pay_definition')))->'value'->'name')"  
   
   MyModel.objects.annotate(xxx=PayDefs(F('snapshot'))).order_by().distinct('xxx')
.values_list('xxx', flat=True)

My question is if this the best way to solve this problem? The way my current logic works, reading from insider out is, I think:
  1. Pass in the 'snapshot'.
  2. Since 'snapshot' is a JSON field, "->'pay_definition'" traverses this key.
  3. To skip the unknown numeric keys, "jsonb_each()" turns each key, value pair into an inner row like ['1234', {...}].
  4. To get to the value column of the inner row "row_to_json()->'value'".
  5. To get the name field's value "->'name'".
  6. A final call to "to_jsonb" in the PayDefs class.
For example, since all I care about is the string value of 'name', is there a way to get rid of the PayDefs class, and its invocation of to_jsonb? Likewise, is there a better way to do the inner parts? To provide context on what "better" might be:
  • Snapshot JSONs might easily be 20MB in size.
  • Each 'pay_definition' is probablyonly about 1kB in size, and there might be 50 of them in a snapshot.
  • There might be 1000 MyModel instances in a given query.
  • I'm using PostgreSQL 12
so my concern is not have the database server or Django perform extraneous work converting between strings and JSON for example.

Thanks, Shaheed


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/CAHAc2jc24H-CKWH-3x2cEkRKMGnAz8s_HZt8%3Dd5LUrqKb%3D0r4A%40mail.gmail.com.

No comments:

Post a Comment