My problem concerns a master-master file synchronization setup with three master nodes, each in a different data center. I have three application servers where users can create, modify, and delete files, and I need to keep them in sync, ideally with low sync latency (2 minutes is acceptable, real time is ideal). We have 376,136 files totaling 100 GB, with at most 1,000 files changed (created, deleted, or modified) per day. It's a fair assumption that a file won't be modified on two different servers at the same time.
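
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes "100GB" means decimal gigabytes and that changed files are about average in size; neither is stated above, so treat the results as order-of-magnitude estimates only.

```python
# Figures taken from the question; the decimal-GB reading is an assumption.
TOTAL_FILES = 376_136
TOTAL_BYTES = 100 * 10**9          # "100GB", assumed to be 10^9-byte gigabytes
CHANGES_PER_DAY = 1_000            # upper bound on creates + deletes + modifications
SYNC_WINDOW_SECONDS = 2 * 60       # the acceptable sync latency

avg_file_bytes = TOTAL_BYTES / TOTAL_FILES
daily_churn_fraction = CHANGES_PER_DAY / TOTAL_FILES
windows_per_day = 24 * 60 * 60 / SYNC_WINDOW_SECONDS
changes_per_window = CHANGES_PER_DAY / windows_per_day

# Assumption: every change rewrites one average-sized file.
daily_bytes_estimate = CHANGES_PER_DAY * avg_file_bytes

print(f"average file size:       {avg_file_bytes / 1000:.0f} kB")
print(f"files changed per day:   {daily_churn_fraction:.2%} of the tree")
print(f"changes per 2-min window: {changes_per_window:.2f} on average")
print(f"data to ship per day:    {daily_bytes_estimate / 10**6:.0f} MB (estimate)")
```

Under these assumptions the change rate is small compared with the size of the tree, so the cost of repeatedly scanning roughly 376,000 files is likely to matter more than the amount of data actually transferred.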

I have googled this issue a lot over the last week and have yet to find a "THIS IS IT!" solution.

The options I have seen are:

  • Unison: Abandonware (my sysadmin claims it isn't reliable)
  • Rsync: Doesn't handle deletions well in a two-way setup, and it's not meant to be bidirectional
  • Osync: It could work, but it seems it may be hindered by a large file tree
  • lsyncd: Judging from its GitHub page, it seems to be the best option so far
  • Minio (used as S3-style file storage): It's not designed for a master-master setup, but as a distributed storage solution
  • Cloud Storage: It would be ideal, but there isn't a good cloud provider in our country, and international internet speeds are poor here, so off-country storage doesn't work for us
  • GlusterFS / Ceph / DRBD: Black magic that is hard to configure, maintain, control, and debug, and not really suited for sync between data centers (in my experience; additional insights would be welcome)
  • Mirror: It seems like a nice option, but it appears to be designed for intranets and small files
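
To illustrate why deletions are the hard part of two-way sync, here is a minimal sketch of state-based reconciliation between two replicas. It is not how any of the tools above is implemented; the `reconcile` function, the `Snapshot` type, and the fingerprint strings are hypothetical names chosen for this example. The idea is that a file missing on one side can only be told apart from a file newly created on the other side by comparing both replicas against the last synced state.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: relative path -> content fingerprint (e.g. a hash).
Snapshot = Dict[str, str]


def reconcile(base: Snapshot, a: Snapshot, b: Snapshot) -> List[Tuple[str, str]]:
    """List the actions needed to bring replicas A and B back in sync.

    `base` is the state both replicas had after the previous sync.
    """
    actions: List[Tuple[str, str]] = []
    for path in sorted(set(base) | set(a) | set(b)):
        old, in_a, in_b = base.get(path), a.get(path), b.get(path)
        if in_a == in_b:
            continue  # same content on both sides, or absent on both
        a_changed = in_a != old
        b_changed = in_b != old
        if a_changed and b_changed:
            # Both sides diverged from the base: needs a policy or a human.
            actions.append(("conflict", path))
        elif a_changed:
            actions.append(("delete on B" if in_a is None else "copy A -> B", path))
        else:
            actions.append(("delete on A" if in_b is None else "copy B -> A", path))
    return actions


base = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
replica_a = {"a.txt": "h1", "c.txt": "h3-new", "d.txt": "h4"}        # b deleted, c modified, d created
replica_b = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3", "e.txt": "h5"}  # e created

for action, path in reconcile(base, replica_a, replica_b):
    print(f"{action}: {path}")
```

Without `base`, `b.txt` (deleted on A) and `e.txt` (created on B) would look identical: each exists on B only. With three nodes the same comparison would have to be run pairwise or through a hub, which this sketch does not cover.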

We work with Docker, but I haven't found a Docker volume plugin that would solve this either.

Has anyone faced or solved this issue? Which of these tools is the best fit? Is there any other tool better suited to this problem?

  • The first result on Google for "docker distributed file system" led to github.com/moosefs/moosefs. I have no experience with it, let me know if it's any good :) Commented Jul 31, 2018 at 17:33
  • Looks like we're researching almost exactly the same thing and went down all the same ratholes. While I'd like a master-master solution, my use case allows me to go with lsyncd as well, though I'm not quite sure how split-brain recovery works there. All the distributed filesystems look incredibly complex to set up, and I haven't seen a true master-master scenario there either. Commented Aug 12, 2018 at 15:34

1 Answer

I'd go with GlusterFS (which is not so difficult to set up), but you can also try CSYNC2:

https://github.com/LINBIT/csync2

I've used it to replicate a set of files across a cluster with nice results.

  • I will give it a try and see how well it handles the multi-datacenter setup, thanks! Commented Aug 1, 2018 at 20:20