A fault-tolerant way to replicate an entire CouchDB cluster
    $ npm install -g replicate-couchdb-cluster

Usage: replicate-couchdb-cluster -s source -t target options

    -s source           The URL for the CouchDB cluster from which we will be
                        replicating
    -t target           The URL for the CouchDB cluster to which we will be
                        replicating

Options:

    -c max-concurrency  The maximum number of concurrent replications. Defaults
                        to 20 if omitted.
    -i dbs-to-skip      A comma-separated list of DBs to skip
    -a                  Use the target's _replicate API when replicating. This
                        is particularly useful when you are trying to replicate
                        from a remote source to localhost. By default, the
                        source's _replicate API is used.
    -v                  Verbose
    -d                  Debug info such as details of the requests and
                        responses. Useful for determining why long replications
                        are failing.

Examples:

Replicate all DBs on example1.com to example2.com:

    $ replicate-couchdb-cluster -s http://example1.com:5984 -t http://example2.com:5984

Replicate all DBs, except the _users and _replicator DBs:

    $ replicate-couchdb-cluster -s http://example1.com:5984 -t http://example2.com:5984 \
        -i _users,_replicator

Replicate all DBs using SSL and authentication:

    $ replicate-couchdb-cluster -s https://admin1:secrect1@example1.com:6984 \
        -t https://admin2:secrect2@example2.com:6984

Replicate all DBs from a remote source to a local target:

    $ replicate-couchdb-cluster -s https://admin1:secrect1@example1.com \
        -t http://localhost:5984 -a

Running the tool in a Docker container can be useful for scheduling a recurring backup.
Run the replication in the foreground:
    $ docker run -it \
        -e SOURCE="https://admin1:secrect1@example1.com:6984" \
        -e TARGET="https://admin2:secrect2@example2.com:6984" \
        redgeoff/replicate-couchdb-cluster

Replicate every hour in the background. This will persist through server reboots:
    $ docker run -d --name replicate-couchdb-cluster \
        --restart always \
        -e SOURCE="https://admin1:secrect1@example1.com:6984" \
        -e TARGET="https://admin2:secrect2@example2.com:6984" \
        -e RUN_EVERY_SECS=3600 \
        -e VERBOSE=true \
        redgeoff/replicate-couchdb-cluster

Notes:
- If a replication takes longer than RUN_EVERY_SECS, the replications run back to back. Use RUN_EVERY_SECS=0 if you want the replication to run continuously.
- You can view the output at /var/lib/docker/containers/<container id>/<container id>-json.log
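The back-to-back behavior in the notes above can be illustrated with a small sketch. It is not the image's actual scheduling code. It assumes the interval is measured from the start of each run, and `delayBeforeNextRun`, `runEvery`, `runOnce`, and `maxRuns` are hypothetical names introduced only for this illustration.

```javascript
// Hypothetical helper: how long to wait before starting the next run.
// intervalSecs mirrors RUN_EVERY_SECS; elapsedMs is how long the last run took.
function delayBeforeNextRun(intervalSecs, elapsedMs) {
  const intervalMs = intervalSecs * 1000;
  // A run that overran the interval (or an interval of 0) leaves nothing to
  // wait for, so the next run starts immediately.
  return Math.max(0, intervalMs - elapsedMs);
}

// Hypothetical loop: runOnce is any function that returns a promise, and
// maxRuns bounds the loop so the sketch terminates.
async function runEvery(intervalSecs, runOnce, maxRuns) {
  for (let i = 0; i < maxRuns; i++) {
    const started = Date.now();
    await runOnce();
    const delay = delayBeforeNextRun(intervalSecs, Date.now() - started);
    if (delay > 0 && i < maxRuns - 1) {
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

With an interval of 3600 seconds, a run that took 10 minutes would be followed by a 50 minute wait, while a run that took more than an hour would be followed by the next run straight away.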
Replicate every day at 23:00 UTC (11 PM). This will persist through server reboots:
    $ docker run -d --name replicate-couchdb-cluster \
        --restart always \
        -e SOURCE="https://admin1:secrect1@example1.com:6984" \
        -e TARGET="https://admin2:secrect2@example2.com:6984" \
        -e RUN_AT="23:00" \
        -e VERBOSE=true \
        redgeoff/replicate-couchdb-cluster

All options:
    $ docker run -d --name replicate-couchdb-cluster \
        --restart always \
        -e SOURCE="https://admin1:secrect1@example1.com:6984" \
        -e TARGET="https://admin2:secrect2@example2.com:6984" \
        -e RUN_AT="HH:MM" \
        -e RUN_EVERY_SECS=3600 \
        -e CONCURRENCY=10 \
        -e SKIP="_users,_replicator" \
        -e USE_TARGET_API=1 \
        -e VERBOSE=true \
        -e DEBUG=true \
        redgeoff/replicate-couchdb-cluster

Note: The RUN_AT and RUN_EVERY_SECS options cannot be combined. If both are set, RUN_AT takes precedence over RUN_EVERY_SECS.
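To make the RUN_AT schedule and the precedence rule above concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the container's real logic: `msUntilNextRunAt` and `chooseSchedule` are hypothetical helpers, the sketch assumes a time that matches the current instant is scheduled for the following day, and it assumes a single run when neither option is set.

```javascript
// Hypothetical helper: milliseconds from `now` until the next HH:MM in UTC.
function msUntilNextRunAt(runAt, now) {
  const match = /^(\d{1,2}):(\d{2})$/.exec(runAt);
  if (!match) {
    throw new Error('RUN_AT must look like HH:MM, got: ' + runAt);
  }
  const hours = Number(match[1]);
  const minutes = Number(match[2]);
  if (hours > 23 || minutes > 59) {
    throw new Error('RUN_AT is out of range: ' + runAt);
  }
  // Build today's occurrence of HH:MM in UTC.
  const next = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(),
    hours, minutes, 0, 0
  ));
  // Assumption: if that time is now or already past, wait until tomorrow.
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Hypothetical helper: pick a schedule from env-style options.
// RUN_AT wins over RUN_EVERY_SECS, matching the note above.
function chooseSchedule(env) {
  if (env.RUN_AT) {
    return { type: 'at', runAt: env.RUN_AT };
  }
  if (env.RUN_EVERY_SECS !== undefined) {
    return { type: 'every', secs: Number(env.RUN_EVERY_SECS) };
  }
  // Assumption: with neither option set, the replication runs once.
  return { type: 'once' };
}
```

For example, `msUntilNextRunAt('23:00', new Date('2020-01-01T22:00:00Z'))` yields one hour in milliseconds, and passing both `RUN_AT` and `RUN_EVERY_SECS` to `chooseSchedule` selects the `at` schedule.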
You can also use the API.
Example:
    var replicate = require('replicate-couchdb-cluster');

    replicate({
      // The URLs must be quoted strings
      source: 'https://admin1:secrect1@example1.com:6984',
      target: 'https://admin2:secrect2@example2.com:6984',
      concurrency: 10,
      skip: ['_users', '_replicator'],
      verbose: true,
      useTargetAPI: true,
      debug: true
    }).then(function () {
      // Replication done
    });

You may be wondering why you would need such a tool when CouchDB 2 automatically replicates data between nodes:
- We feel it is much safer to keep a separate backup in case something happens to our data, e.g. we accidentally delete data or there is a hacking attempt.
- Sometimes you want to replicate a cluster to a different region of the world. (The built-in clustering in CouchDB 2 isn't designed to be used across different regions of the world.)
- In rare cases, we have found that CouchDB sometimes corrupts database files and in these rare cases, we've had to restore data from our backups.