New in version 1.9.
mongosync includes an embedded verifier that performs a series of checks on the destination cluster to verify the sync of supported collections. mongosync enables the verifier by default on replica set clusters.
Starting in version 1.10, mongosync also enables the verifier by default on sharded clusters.
Note
mongosync reads using primary read preference, so it preserves document field order from the source cluster's primary node. The embedded verifier also checks documents based on the source cluster's primary node, but at a different time from when mongosync reads them. Because of this, in rare cases, discrepancies in document field order between the source cluster's nodes can cause the embedded verifier to fail the migration, even if mongosync copied the documents correctly.
About this Task
Compatibility
The embedded verifier is not available in mongosync 1.8 and earlier.
For alternative verification methods, see Verify Data Transfer.
Limitations
The embedded verifier has the following limitations:
mongosync stores the verifier state in memory, which can result in significant memory overhead. To run the verifier, mongosync requires approximately 10 GB of memory, plus an additional 500 MB for every 1 million documents.
The verifier cannot be resumed. If a user stops or pauses sync and then starts mongosync again for any reason, the verification process restarts from the beginning. This can cause verification to fall substantially behind the migration.
When migrating from a replica set to a sharded cluster, you cannot rename source collections that you specify in the sharding options. If you rename a collection included in the sharding options during the change event application (CEA) phase, the verifier reports a sharding mismatch.
If you start sync with verification enabled and buildIndexes set to never, the migration fails if mongosync finds a TTL collection on the source cluster. This can happen right after you call the /start endpoint or much later, for example when a user creates a TTL index on the source cluster while a migration is in progress. To sync TTL collections without building indexes on the destination cluster, you must start sync with the verifier disabled.
Unsupported Verification Checks
The verifier doesn't check the following namespaces:
Capped collections
Collections with TTL indexes, including TTL indexes that are added or dropped during migration
Collections that don't use the default collation
To verify unsupported collections, write additional scripts that examine those collections. For more information, see Verify Data Transfer.
Note
Starting in version 1.10, the verifier checks for data inconsistencies caused by DDL events that occur on a pre-6.0 source cluster during migration. The check exists because pre-6.0 migrations do not support DDL events.
To learn more, see Pre-6.0 Migration Limitations.
Steps
Initialize mongosync
Initialize the mongosync process:
./bin/mongosync \
    --logPath /var/log/mongosync \
    --cluster0 "mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020" \
    --cluster1 "mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020"
Start the Sync
To start syncing data from the source cluster to the destination, use the /start endpoint.
curl localhost:27182/api/v1/start -XPOST \
    --data '
      {
         "source": "cluster0",
         "destination": "cluster1"
      } '
Example output:
{"success":true}
Examine Progress
To examine the status of the sync, use the /progress endpoint:
curl localhost:27182/api/v1/progress -XGET
Example output:
{
   "progress": {
      "state": "RUNNING",
      "canCommit": true,
      "canWrite": false,
      "info": "change event application",
      "lagTimeSeconds": 0,
      "collectionCopy": {
         "estimatedTotalBytes": 694,
         "estimatedCopiedBytes": 694
      },
      "directionMapping": {
         "Source": "cluster0: localhost:27017",
         "Destination": "cluster1: localhost:27018"
      },
      "source": {
         "pingLatencyMs": 250
      },
      "destination": {
         "pingLatencyMs": -1
      },
      "verification": {
         "source": {
            "estimatedDocumentCount": 42,
            "hashedDocumentCount": 42,
            "lagTimeSeconds": 2,
            "totalCollectionCount": 42,
            "scannedCollectionCount": 10,
            "phase": "stream hashing"
         },
         "destination": {
            "estimatedDocumentCount": 42,
            "hashedDocumentCount": 42,
            "lagTimeSeconds": 2,
            "totalCollectionCount": 42,
            "scannedCollectionCount": 10,
            "phase": "stream hashing"
         }
      }
   },
   "success": true
}
Examine the verification response field for information on the status of the embedded verifier.
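To watch the verifier from a script, one option is to pull only the phase values out of the /progress response. The following is a minimal sketch, not an official tool: the helper name verifier_phases is hypothetical, and it matches text with grep and sed instead of parsing JSON, so a JSON-aware tool would be more robust if one is available.

```shell
# Print every verifier "phase" value found in the /progress JSON read from stdin.
# Hypothetical helper: plain text matching, not a real JSON parser.
verifier_phases() {
  grep -o '"phase"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*:[[:space:]]*"\(.*\)"$/\1/'
}

# Usage (assumes mongosync is listening on localhost:27182, as in the steps above):
#   curl -s localhost:27182/api/v1/progress -XGET | verifier_phases
```

With the example output above, the helper would print one line for the source and one for the destination.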
Behavior
Verification Checks
The embedded verifier performs a series of checks on the destination cluster. It checks all supported collections to confirm that mongosync was successful in transferring documents from the source cluster to the destination.
If the verifier encounters errors, it fails the migration with an error. If the verifier finds no errors, the /progress endpoint returns canWrite: true. To learn more about the canWrite field, see canWrite and COMMITTED.
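A script that gates later steps on this field could check it as sketched below. The helper name can_write is hypothetical, and the check is a plain text match on the response, which assumes the field appears exactly as in the example output.

```shell
# Exit with status 0 when the /progress JSON on stdin reports canWrite: true,
# and with a non-zero status otherwise.
# Hypothetical helper: text matching only, not a real JSON parser.
can_write() {
  grep -Eq '"canWrite"[[:space:]]*:[[:space:]]*true'
}

# Usage (assumes mongosync is listening on localhost:27182):
#   if curl -s localhost:27182/api/v1/progress -XGET | can_write; then
#     echo "canWrite is true"
#   fi
```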
Starting in version 1.15, the embedded verifier examines collection metadata, indexes, and views. If the verifier finds a mismatch during metadata verification, it returns an error that contains the mismatch types and a count of their occurrences.
Please contact support to investigate verification issues.
Memory Requirements
Verification requires 10 GB of memory plus an additional 500 MB for every 1 million documents in the migration.
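As a rough planning aid, the rule above can be turned into a quick calculation. This is only a sketch of the stated approximation: the function name is hypothetical, 1 GB is treated as 1000 MB, and partial millions of documents are rounded up, none of which the rule itself specifies.

```shell
# Estimate verifier memory in MB: 10 GB base plus 500 MB per 1 million documents.
# Assumptions (not from the documentation): 1 GB = 1000 MB, partial millions round up.
estimate_verifier_memory_mb() {
  docs=$1
  millions=$(( (docs + 999999) / 1000000 ))
  echo $(( 10000 + 500 * millions ))
}

estimate_verifier_memory_mb 25000000   # 25 million documents; prints 22500
```

Under these assumptions, a migration of 25 million documents needs roughly 22.5 GB of memory for verification.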
If the available memory is insufficient, the /start endpoint returns an error. If this occurs, to use mongosync with the verifier, you must first increase the memory of the server and then resume the migration.
If increasing server memory isn't an option, restart mongosync with the verifier disabled.
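This page does not show the request for disabling the verifier. The sketch below assumes that the /start request body accepts a verification document with an enabled field; treat that option name as an assumption and confirm it against the /start endpoint reference before relying on it.

```shell
# Request body for /start with the embedded verifier turned off.
# Assumption: "verification.enabled" is the option name; it is not stated on this page.
start_payload='{
  "source": "cluster0",
  "destination": "cluster1",
  "verification": { "enabled": false }
}'

# Send the request (assumes mongosync is listening on localhost:27182):
#   curl localhost:27182/api/v1/start -XPOST --data "$start_payload"
```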