Django talk: Re: Debugging DJango app on production for High CPU Usage

Wednesday, March 2, 2016

Re: Debugging DJango app on production for High CPU Usage

Hi James,

Thanks for the detailed explanation. It certainly helps, and I will add logging to debug the CPU usage.

Please find my comments inline:

On Monday, February 29, 2016 at 2:45:41 PM UTC+5:30, James Schneider wrote:

On Tue, Feb 23, 2016 at 8:59 PM, Web Architect <pina...@gmail.com> wrote:
Hi,

We have an ecommerce platform based on Django. We are using uwsgi to run the app. The issue is that CPU usage is hitting the roof (sometimes going beyond 100%) in some scenarios. I would like to debug the platform in production to see where the CPU consumption is happening. We have used caching all over the place (including templates) as well, so the DB queries should be quite limited.

Have you validated that your cache is actually being used, and not just populated? I've seen that before.

The cache is being used, and that has been validated. But one thing I have observed: while I was storing ORM objects (DB results) in the cache to avoid DB queries, it proved to be expensive because of the object-related operations (manipulating and copying the objects in the cache). We are using Redis, and Redis only handles strings, I think. Hence, I reverted to DB queries.

I am personally in favour of async frameworks like Tornado; in fact, I have used it for a high-capacity, Pinterest-like platform where the performance has been excellent. But Tornado is quite lightweight and a lot of services would need to be built by us, hence we chose Django for ecommerce. This is our first experience with a complex service (ecommerce) on a platform like Django. Since Django is synchronous, I was wondering if the threads are getting stuck waiting for DB responses.

I would refrain from using Django Debug Toolbar, as it slows down the platform further, increases the CPU usage, and also requires turning DEBUG on. Is there any other tool or way to debug the platform? I would appreciate any recommendations/suggestions.

Have you looked into profiling the code or adding logging statements throughout your code to determine when/where particular segments are being run? I would definitely start with logging. I'm assuming you have suspicions on where your pain points might be:

https://docs.djangoproject.com/en/1.9/topics/logging/

I would put them in places that may be part of large loops (in terms of number of objects queried or depth of relationships traversed), or sprinkled within complex views. You have to start narrowing down which page/pages are causing your angst.
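One way to place such logging statements is a small timing helper wrapped around the suspect loops or view sections. The following is only a sketch using the standard library; the logger name `perf` and the helper `log_duration` are made up for illustration and are not part of Django. It logs wall-clock time next to CPU time, so a block that mostly waits (for example on the DB) can be told apart from one that actually burns CPU.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; route it wherever your LOGGING config sends it.
logger = logging.getLogger("perf")


@contextmanager
def log_duration(label, threshold_ms=0.0):
    """Log wall-clock and CPU time spent inside the `with` block.

    A large wall time with a small CPU time suggests waiting (DB, cache,
    network); similar values suggest the block itself is CPU-bound.
    Note: process_time() is process-wide, so in a multi-threaded worker
    it also includes CPU used by other threads.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        wall_ms = (time.perf_counter() - wall_start) * 1000.0
        cpu_ms = (time.process_time() - cpu_start) * 1000.0
        if wall_ms >= threshold_ms:
            logger.info("%s wall=%.1fms cpu=%.1fms", label, wall_ms, cpu_ms)
```

Inside a view this could be used as `with log_duration("product_list.render"):` around the template rendering or the loop under suspicion. Setting `threshold_ms` keeps the log quiet for fast requests.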

Also, does the Django ORM increase the CPU usage? Does it block the CPU? I would appreciate it if anyone could shed some light on this.

I'm not sure about blocking, but if deployed correctly, the ORM should have a negligible (and acceptable) hit to the CPU in most cases, if you notice one at all. I've seen spikes from bad M2M relationships where prefetch_related() was needed (>200 queries down to 3 with prefetch_related, and ~1-2s total response down to <80ms if I recall correctly). The most common case I run into is as part of nested {% for %} loops within a template that dig down through relationships.
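The query pattern behind those numbers can be reproduced without Django. The sketch below uses an in-memory SQLite database with a made-up `product` / `product_tag` schema (not the poster's schema) and counts SELECT statements: one query per product inside a loop versus a single batched lookup, which is roughly what `prefetch_related()` arranges for a queryset.

```python
import sqlite3


def build_demo_db(n_products=50):
    """Create an in-memory DB with a hypothetical product/tag schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE product_tag (product_id INTEGER, tag TEXT)")
    for pid in range(1, n_products + 1):
        conn.execute("INSERT INTO product VALUES (?, ?)", (pid, f"product-{pid}"))
        conn.execute("INSERT INTO product_tag VALUES (?, ?)", (pid, "sale"))
        conn.execute("INSERT INTO product_tag VALUES (?, ?)", (pid, "new"))
    conn.commit()
    return conn


def count_selects(conn, work):
    """Run work(conn) and return (result, number of SELECTs executed)."""
    seen = []

    def trace(statement):
        if statement.lstrip().upper().startswith("SELECT"):
            seen.append(statement)

    conn.set_trace_callback(trace)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    return result, len(seen)


def tags_one_query_per_product(conn):
    """The nested-loop pattern: 1 query for products, then 1 per product."""
    products = conn.execute("SELECT id, name FROM product").fetchall()
    result = {}
    for pid, name in products:
        rows = conn.execute(
            "SELECT tag FROM product_tag WHERE product_id = ? ORDER BY tag",
            (pid,),
        )
        result[name] = [row[0] for row in rows]
    return result


def tags_batched(conn):
    """The batched pattern: 1 query for products, 1 for all their tags."""
    products = conn.execute("SELECT id, name FROM product").fetchall()
    ids = [pid for pid, _ in products]
    placeholders = ",".join("?" * len(ids))
    tags_by_product = {pid: [] for pid in ids}
    rows = conn.execute(
        "SELECT product_id, tag FROM product_tag "
        f"WHERE product_id IN ({placeholders}) ORDER BY tag",
        ids,
    )
    for pid, tag in rows:
        tags_by_product[pid].append(tag)
    return {name: tags_by_product[pid] for pid, name in products}
```

With 50 products, the first function issues 51 SELECTs and the second issues 2, while both return the same data. In Django the corresponding change would be adding `.prefetch_related(...)` with the relevant relation name to the queryset that feeds the template loop.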

We have 'for' loops in templates where DB queries are being made. I will look into those.

I would also consider increasing the logging levels of your cache and DB to see if you are getting repetitive queries. The ORM does cause those from time to time, since it has non-intuitive behavior in some edge cases. You can try that during low-activity periods to keep the extra logging from overwhelming the system. Sometimes you can still catch an issue like repetitive/multiple queries with a single end user, and such issues are actually much easier to diagnose on a low-usage server.
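For the DB side, a settings fragment along these lines raises the level of Django's `django.db.backends` logger. It is a sketch, and the handler and formatter names are arbitrary. One caveat from the Django logging documentation: SQL statements are only sent to this logger when `settings.DEBUG` is True, so on production this is more useful against a staging copy or during a short, controlled window.

```python
import logging.config

# Candidate LOGGING setting for settings.py (handler/formatter names are arbitrary).
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "timed": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "timed"},
    },
    "loggers": {
        # Django logs each executed SQL statement here at DEBUG level,
        # but only while settings.DEBUG is True.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
            "propagate": False,
        },
    },
}

if __name__ == "__main__":
    # Django applies LOGGING itself; this only checks the dict is well-formed.
    logging.config.dictConfig(LOGGING)
```

Repeated identical statements in the resulting output for a single request are the usual sign of the repetitive-query problem described above.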

Do you have any other jobs that run against the system (session cleanup, expired inventory removal, mass mailing, etc.)? Would it be possible for those to be the culprit?

We do not have any other bulk tasks right now. If possible, we try to do those separately with crons.

Have you figured out any reproducible trigger?

I have done some load testing with locust.io and know there are a few views which are the culprits, specifically the ones where we show a bunch of products. But I wanted to make sure that Django is not a bottleneck.
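Once the suspect views are known, profiling a single call can show where the CPU time goes inside it. The following is a sketch with the standard-library `cProfile` and `pstats` modules; `profile_call` and `render_product_list` are hypothetical names, the latter standing in for a real view or helper.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the 15 entries with the highest cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(15)
    return result, buffer.getvalue()


def render_product_list(n):
    """Hypothetical stand-in for a view that renders many products."""
    return ", ".join(f"product-{i}" for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(render_product_list, 10000)
    print(report)
```

Profiling adds overhead of its own, so it is better run against one request on a staging box, or from a Django shell with a test request, than left enabled under production load.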

-James

