Monday, October 29, 2012

Re: Scaling django (nginx + apache + mod_wsgi + postgresql)

On Mon, Oct 29, 2012 at 2:42 PM, Isaac XXX <vyrphan@gmail.com> wrote:
> Hi folks,
>
> I'm developing a new application that should get high traffic. Right now,
> I have other projects with the following architecture:
>
> Nginx on front: serving static content and redirecting to apache for dynamic
> data
> Apache+mod_wsgi: serving dynamic pages
> PostgreSQL: backend for data storage (RDBMS)
> Memcache: for caching purposes :)
>
> All my deployments use a single server, with a single frontend/backend (1
> nginx, 1 apache, 1 postgresql). The requirements for this new project are
> really large, and I think I will need to scale the whole system. Can anyone
> suggest an all-in-one tutorial discussing the main points of scaling a
> system?
>
> I know there are different alternatives for the DB (master-slave,
> clustering...), nginx can serve as a reverse proxy or not... and I need to
> merge all this information into a single scalable system, but I can't find a
> unified source of information.
>
> Can anyone help me on it?
>

There is unlikely to be one authoritative source that explains
precisely how to scale an app. Part of the problem is that what
"scale" and "app" mean is domain specific!

So first off, "scale". You can scale up, or out. Scaling up means
running everything on faster hardware. Due to how IT progresses,
roughly every 18 months you can replace your server with something
about twice as fast. Scaling out means running everything over more
boxes. Scaling up is trivial, just spend more money; scaling out can
be harder.

Most parts of the stack are easy to scale out, because HTTP itself is
stateless. E.g., if you have an nginx frontend, serving static files
and proxying to backend servers for dynamic content, and the nginx
server is overloaded, it is easy to add a balancer to route HTTP
requests to multiple nginx servers.

The same is true for dynamic content - need more workers, add more
machines, and tell nginx to talk to more machines.
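
To make "talk to more machines" concrete, here is a rough Python sketch
of the round-robin idea a balancer applies when it spreads requests over
several workers. It is only an illustration, not how nginx implements
it; the class name and the worker addresses are made up for the example.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend addresses in rotation, one per incoming request."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        # Because requests are stateless, any worker can serve any request.
        return next(self._cycle)


# Hypothetical worker addresses, purely for illustration.
balancer = RoundRobinBalancer(
    ["10.0.0.11:8000", "10.0.0.12:8000", "10.0.0.13:8000"]
)
picks = [balancer.next_backend() for _ in range(4)]
```

Adding capacity is then just a matter of making the list of backends
longer.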

The only tricky aspect of scaling out is the database. Most databases
do not have a simple 'scale out' option. With postgres, you can set up
master-slave trees, but this only expands your read capacity. All
writes still have to go through one server (and then on to all the
slaves, making writes more expensive the more slaves you add).
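
On the Django side, a master-slave setup is usually wired in through a
database router listed in the DATABASE_ROUTERS setting. A minimal
sketch, assuming aliases "default", "replica1" and "replica2" exist in
settings.DATABASES (those names and the class name are my assumptions):

```python
import random


class PrimaryReplicaRouter:
    """Send writes to the primary and spread reads across the replicas."""

    # Hypothetical aliases; they must match keys in settings.DATABASES.
    primary = "default"
    replicas = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Reads can go to any replica, which is where the extra capacity comes from.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        # Every write still goes through the single primary.
        return self.primary

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations between objects are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Change the schema on the primary only; replication carries it over.
        return db == self.primary
```

The method signatures follow current Django; older releases used
allow_syncdb instead of allow_migrate. Note that a read sent to a
replica straight after a write may not see that write yet, because
replication lags a little.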

The only true way of scaling out with database servers is to shard
your data, splitting it up by some arbitrary algorithm (usually on
user). Sharding isn't easy, though: you will have to design your
database and app around it. There are some very good videos, docs and
talks from the likes of Facebook. Sharding is not a panacea and
requires a lot of work.
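
As an illustration of "splitting on user", here is a small Python
sketch that maps a user id to one of a fixed set of shards with a hash.
The shard names and the function name are hypothetical, and real
schemes are usually more involved than this.

```python
import hashlib

# Hypothetical shard names, e.g. database aliases, one per server.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for_user(user_id, shards=SHARDS):
    """Pick the shard that holds all data for the given user id."""
    # Use a stable hash so every process maps a user to the same shard.
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# The same user always lands on the same shard.
home = shard_for_user(42)
```

With plain modulo hashing like this, changing the number of shards
moves most users to a different shard, which is part of why the design
has to be thought through up front.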

Cheers

Tom

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to django-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
