Saturday, February 6, 2016

Re: Scaling Django

Hi Sergiy, are you referring to my post or to the OP?

On Sunday, February 7, 2016 at 6:03:11 AM UTC+8, Sergiy Khohlov wrote:
Print database structure.
Check possibility of DB normalization.

You might have meant "denormalization" here (?), especially when operating at such scale. We do use denormalization for some of our larger tables.
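To make the denormalization idea concrete, here is a minimal sketch using Python's stdlib sqlite3 as a stand-in for MySQL. The table and column names (post, comment, comment_count) are invented for illustration, not taken from our actual schema: the write path maintains a precomputed counter so the read path never needs an aggregate over the child table.

```python
import sqlite3

# Denormalization sketch: keep a comment_count column on post, updated on
# every write, so reads are a single-row lookup instead of a COUNT/JOIN.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT,
                       comment_count INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
""")
conn.execute("INSERT INTO post (id, title) VALUES (1, 'Scaling Django')")

def add_comment(post_id, body):
    # Write path does two statements; the extra write buys cheap reads.
    conn.execute("INSERT INTO comment (post_id, body) VALUES (?, ?)",
                 (post_id, body))
    conn.execute("UPDATE post SET comment_count = comment_count + 1 "
                 "WHERE id = ?", (post_id,))

add_comment(1, "first")
add_comment(1, "second")

# Read path: one indexed single-row lookup, no aggregate over comment.
count = conn.execute("SELECT comment_count FROM post WHERE id = 1").fetchone()[0]
print(count)  # 2
```

The trade-off is the usual one: writes get slightly more expensive and the counter can drift if some code path forgets to update it, so in Django you'd typically centralize this in a model method or signal.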
 
100 GB (my "record" is 452 GB) is not so high, but this size requires some attention. (It looks like your MySQL uses only one db file: try setting one file per table. Check index sizes, and verify that indexes are working correctly.)

We are using innodb_file_per_table. But note that I mentioned all this 100GB of data fits on a lowly 8GB RAM VM, 50% of which was allocated to InnoDB buffers. With such limited resources, but at the same time intimately knowing your database workload, it is still possible to handle such a db size. And yes, our indexes are used well, as most queries were EXPLAINed and optimized accordingly.
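For anyone following along, the "verify indexes are used" step looks roughly like this. The sketch below uses stdlib sqlite3's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN (the table and index names are made up); with MySQL you'd run `EXPLAIN SELECT ...` through your client or cursor the same way and check the `key` column instead.

```python
import sqlite3

# Index-verification sketch: create a table with an index, then ask the
# planner how it intends to execute a filtered query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE INDEX idx_orders_customer ON orders (customer_id);
""")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (42,)
).fetchall()

# The last column of each plan row is a human-readable detail string; for an
# indexed equality lookup it should mention the index, e.g.
# "SEARCH orders USING INDEX idx_orders_customer (customer_id=?)".
detail = plan[0][-1]
print(detail)
```

If the plan says it is scanning the table instead of searching an index, that query is a candidate for a new index or a rewrite.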

What hardware are you running your 452GB db on?

Review your project:
try to avoid ManyToMany fields.
Is it possible to switch from hardcoded SQL to stored functions and procedures?

See my post above about denormalization. And arguably, stored procedures are even harder to manage, both code-wise and deployment-wise.
 
Looks like this issue is not connected to Django only.

Again, if you are referring to my post, I am not the OP. Not that our system is perfect, but we're not the ones with scaling problems.
I was in fact sharing the scaling practices that worked for us. See the OP's post on the problems they're facing (organizational / political / methodological).

Cheers!
 


Many thanks,

Serge


+380 636150445
skype: skhohlov

On Sat, Feb 6, 2016 at 7:09 PM, Dexter T. <dext...@gmail.com> wrote:
Lots of great replies already.
I also want to add a few (random) things ...

- have you clearly defined and isolated what issue(s) you are facing?
- you mentioned using DRF in a service, with a large JSON response taking seconds to finish; how did you troubleshoot/profile this? Seconds to process server-side? Seconds to download client-side? Where specifically? If you don't know, then find out!
- your system will have many moving parts; have you made an effort to instrument, measure, and isolate which parts are slow and why?
- you mentioned using the debug toolbar; have you proven that your database schema is optimal? Any queries in your slow query log? Are indexes used, and are they optimal? For your workload, can read caching help? Can db replicas be of help?
- how are your server resources utilized? Are you sure you are not bottlenecked by disk I/O thrashing? Overcommitted CPU? Low memory/swapping? File descriptor limits?
- have you checked whether clients are the bottleneck? An AJAX call downloading a complex nested JSON object is costly to serialize, both CPU-wise and bandwidth-wise. Gzip can help here, if applicable.
- for more context, can you share some numbers, like HTTP- and DB-level req/sec, and row counts for the most heavily used tables? How about server infrastructure specs?
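On the gzip point above, the win is easy to measure before touching any middleware. A rough sketch with the stdlib (the payload shape is invented, but repetitive nested JSON like a typical DRF list response compresses very well); in Django the equivalent switch is enabling GZipMiddleware or compressing at the reverse proxy.

```python
import gzip
import json

# Simulate a nested list-style JSON API response and measure how much
# gzip shrinks it on the wire.
payload = json.dumps({
    "results": [
        {"id": i, "status": "active", "tags": ["django", "scaling"],
         "profile": {"name": f"user{i}", "country": "UA"}}
        for i in range(500)
    ]
}).encode("utf-8")

compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)
print(len(payload), len(compressed), round(ratio, 3))
```

Numbers like these tell you quickly whether bandwidth is worth attacking, or whether the seconds are being spent server-side in serialization instead.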

Note that these are basic questions and basic problem-solving steps; I'm assuming your teams are already aware of them and taking steps like these.

In one project of mine, we're running a 100GB MySQL db, with some tables above 100 million records and growing rapidly. Properly indexed and optimized, it works OK on a lowly single VPS instance with 8GB RAM. The workload is clearly OLTP: we're throwing more sustained writes (hundreds/sec) at it than reads. All queries were scrutinized; almost all use the ORM, some are handwritten SQL, and other complex queries were rewritten to be done at the application level. Joins are harder at this scale and are therefore preferably avoided (a major architectural decision we anticipated). But we can still easily throw hardware at it if needed.
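"Rewritten to be done at the application level" means roughly this: fetch the two result sets separately and stitch them together with a dict index instead of a server-side JOIN. The row shapes below are invented for illustration; in Django the same pattern is essentially what prefetch_related() does under the hood.

```python
# Two result sets, as if fetched by two independent (cheap, indexed) queries.
orders = [
    {"id": 1, "customer_id": 10, "total": 99.0},
    {"id": 2, "customer_id": 20, "total": 15.5},
    {"id": 3, "customer_id": 10, "total": 42.0},
]
customers = [
    {"id": 10, "name": "Alice"},
    {"id": 20, "name": "Bob"},
]

# Index the smaller side once, then join in O(n + m) in application code.
by_id = {c["id"]: c for c in customers}
joined = [{**o, "customer_name": by_id[o["customer_id"]]["name"]}
          for o in orders]

print([row["customer_name"] for row in joined])  # ['Alice', 'Bob', 'Alice']
```

This trades one heavier query for two light ones plus a little memory, which tends to win once tables get large enough that the DB-side join stops fitting comfortably in buffers.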

For us, scaling is a continuous commitment to measure and refactor.

And one very important lesson for me from my years of writing software: rewriting is very, very costly.

These new engineers and other colleagues coming in: are they familiar with the domain problem, the existing codebase, and the scale at which you operate now and expect to in the future? Are they experienced in doing similar scaling before? And even if you think you can throw away your old work, now that you all think you know better, be very careful of The-Second-System-Effect.

I hope you succeed.


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/ddc6db79-af4c-4e78-a16f-84f2dc8b69ae%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

