
We have three servers running Ubuntu Server 10.04, with load balancing between them through DNS. We use Django, nginx to serve content, and PostgreSQL as the database.

For PostgreSQL there are some mirroring solutions, but what is the best way to mirror our static files using a "three masters" scheme?

I guess just rsyncing them wouldn't be scalable or easy to maintain.

3 Answers

3

As long as the files don't change often and don't have to be in sync at every moment, why not rsync? Just make sure you have one master server where you edit the files; that makes syncing easier.
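A minimal sketch of that single-master push, written as a small Python wrapper around the rsync command line. The host names and the `/srv/static/` path are placeholders, not taken from the question, and the wrapper assumes key-based SSH access from the master to each replica.

```python
import subprocess

# Hypothetical values: replace with your own replica hosts and static path.
REPLICAS = ["web2.example.com", "web3.example.com"]
STATIC_DIR = "/srv/static/"  # trailing slash: copy the directory's contents


def rsync_command(source_dir, host, dest_dir):
    """Build an rsync command that mirrors source_dir to host:dest_dir."""
    return [
        "rsync",
        "-az",       # archive mode (permissions, times, symlinks) plus compression
        "--delete",  # remove files on the replica that were deleted on the master
        source_dir,
        f"{host}:{dest_dir}",
    ]


def push_to_replicas(source_dir, hosts, dest_dir, run=subprocess.run):
    """Run rsync once per replica and return the hosts whose sync failed."""
    failed = []
    for host in hosts:
        result = run(rsync_command(source_dir, host, dest_dir))
        if result.returncode != 0:
            failed.append(host)
    return failed


if __name__ == "__main__":
    failed_hosts = push_to_replicas(STATIC_DIR, REPLICAS, STATIC_DIR)
    if failed_hosts:
        print("rsync failed for:", ", ".join(failed_hosts))
```

Because `--delete` removes anything on a replica that is missing on the master, this only makes sense if files are edited on the master alone.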

Other than that, a networked file system like NFS might work, or you could implement something like DRBD to keep the files synced at all times.

0
2

There are lots of other solutions (AFS, unionfs...), but rsync works surprisingly well for one-way replication and is self-healing. It is also scalable as long as you have defined paths for replication: a single master is fine for up to around 5 slaves, but beyond that there's probably good reason to go to multiple-tier replication.
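To illustrate what defined replication paths with more than one tier could look like, here is a rough planning sketch. The function name `plan_tiers` and the host names are made up for illustration; the default fan-out of 5 simply mirrors the rule of thumb above.

```python
def plan_tiers(master, replicas, fan_out=5):
    """Return {source: [targets]} so that no host pushes to more than fan_out others."""
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    plan = {}
    sources = [master]        # hosts that already hold the files
    pending = list(replicas)  # hosts that still need a source assigned
    while pending:
        next_sources = []
        for source in sources:
            targets, pending = pending[:fan_out], pending[fan_out:]
            if targets:
                plan[source] = targets
                next_sources.extend(targets)
        sources = next_sources  # the hosts just filled become the next tier's sources
    return plan


if __name__ == "__main__":
    hosts = [f"web{i}" for i in range(2, 9)]  # hypothetical replicas web2..web8
    for source, targets in plan_tiers("web1", hosts).items():
        print(source, "->", ", ".join(targets))
```

With seven replicas and a fan-out of 5, the master feeds five hosts directly and the first of those feeds the remaining two.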

The only issue is with the timing of replication. Since you are using round-robin DNS, you already have server affinity, so you're not going to have the problem where a user updates server A and then can't see the updates because he's looking at server B. But delays in propagation of code can cause some pain in deployments (particularly if you've got a code dependence on DDL changes to a common database).

If you must have bi-directional replication (try to avoid if at all possible) then yes, a realtime replication system would be more appropriate.

If you are currently running rsync manually / via cron, you might consider using inotify to run the rsync on files as they change such that the delay becomes very short.
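One way to sketch the inotify idea, assuming the `inotifywait` tool from the inotify-tools package is installed: block until something changes, wait briefly for the burst of writes to settle, then run rsync. The path, host name, and the 2-second pause are illustrative assumptions.

```python
import subprocess
import time

WATCH_DIR = "/srv/static/"    # hypothetical path
REPLICA = "web2.example.com"  # hypothetical host


def wait_command(path):
    """Build an inotifywait command that exits after the first change under path."""
    return [
        "inotifywait",
        "-r",  # watch subdirectories recursively
        "-q",  # suppress the startup messages
        "-e", "close_write,create,delete,move",
        path,
    ]


def sync_command(path, host):
    """Build the rsync command that mirrors path to the same path on host."""
    return ["rsync", "-az", "--delete", path, f"{host}:{path}"]


def watch_and_sync(path, host, quiet=2.0, run=subprocess.run, sleep=time.sleep):
    """Sync after each detected change; return the sync count once inotifywait fails."""
    syncs = 0
    while True:
        waited = run(wait_command(path))
        if waited.returncode != 0:  # inotifywait could not watch, or was interrupted
            return syncs
        sleep(quiet)                # let a burst of writes settle first
        run(sync_command(path, host))
        syncs += 1


if __name__ == "__main__":
    watch_and_sync(WATCH_DIR, REPLICA)
```

A change that lands while rsync is running, before the watch is set up again, is not noticed right away; it is picked up by the next triggered run, which is where rsync being self-healing helps. Re-establishing a recursive watch on every pass can also be slow on very large trees.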

C.

2
  • We are still thinking about the overall design. Why not use bi-directional replication? The problem is, there will be some user-generated static content involved in our project. Commented Aug 12, 2010 at 14:21
  • Problem with bi-directional? It's not just 2-way, it's 2^N-way (where N is the number of nodes). Add to that the fact that, due to inevitable delays, you will get conflicting updates, whereas with master/slave (uni-directional) replication there is a definitive version of all data. Commented Aug 12, 2010 at 16:53
0

When code is deployed to production, it should be deployed to all the servers at once. If this action is properly controlled, it should be mirrored as part of your controls and a technology solution will be unnecessary. Not all administrative solutions are based in technology.

OpenEFS is a tool that was designed to enable change control as well as deployments, which you might find helpful. I implemented a lot of what they do on my own but for someone who has no foundation, it would be a good start.

For static servers that are not in scope for change control, I have found rsync to be an appropriate solution in the past. Typically, for servers that fall under this category, scaling is unlikely to be an issue, but if it is, that's where NFS or AFS might come into play.

2
  • We do have a version control system, but as I said to symcbean above, our project will have user-generated content. So besides the usual code deployment, we will need to sync changes from one random server across the other two. Commented Aug 12, 2010 at 14:25
  • User-generated content typically belongs in a database. Storing it on the filesystem is almost always ill-advised with a load-balanced architecture. Commented Aug 12, 2010 at 14:42
