Overview
New in version 2.3:
By default, mongosqld samples each collection on the connected MongoDB instance and generates a relational representation of the schema, which it then caches in memory.
Note
If you have authentication enabled, ensure that your MongoDB user has the correct permissions. See User Permissions for Cached Sampling below.
By default, mongosqld does not automatically resample data after generating the schema. Specify the --schemaRefreshIntervalSecs option to direct mongosqld to automatically resample the data and regenerate the schema on a fixed schedule.
If the schema that mongosqld creates doesn't meet your BI workload needs, you can manually generate a schema file and edit it as necessary.
To learn more about sampling modes, see the Sampling Mode Reference Chart.
User Permissions for Cached Sampling
If your MongoDB instance uses authentication and you wish to use cached sampling, your BI Connector instance must also use authentication. The admin user that connects to MongoDB via the mongosqld program must have permission to read from all the namespaces from which you want to sample data.
Sample All Namespaces
If you wish to sample all namespaces, the admin user requires the following privileges:
listDatabases on the cluster
listCollections on each database
find on each database
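The privileges listed above could be bundled into a custom role. The following is a minimal sketch rather than an official recipe: the role name sampleAllNamespaces is hypothetical, and creating the role in the admin database is an assumption. The guard at the end means the role is only created when the code runs inside the mongo shell, where db is defined.

```javascript
// Hypothetical role name (assumption); pick any name that suits your deployment.
const sampleAllRole = {
  role: "sampleAllNamespaces",
  privileges: [
    // listDatabases is a cluster-level action.
    { resource: { cluster: true }, actions: ["listDatabases"] },
    // Empty db and collection strings match all non-system collections
    // in every database.
    { resource: { db: "", collection: "" }, actions: ["listCollections", "find"] }
  ],
  roles: []
};

// Inside the mongo shell, `db` is defined and the role is created in the
// admin database (assumption). Elsewhere, this only builds the document.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createRole(sampleAllRole);
}
```

A user granted this role would be created with db.createUser in the same database, listing the role name in its roles array.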
Alternatively, create a user with the built-in readAnyDatabase role:
use admin
db.createUser(
  {
    user: "<username>",
    pwd: "<password>",
    roles: [ { "role": "readAnyDatabase", "db": "admin" } ]
  }
)
Note
Be aware of all privileges included with the readAnyDatabase role before granting it to a user.
To sample all namespaces, start mongosqld without the --sampleNamespaces option.
mongosqld --auth --mongo-username <username> --mongo-password <password>
Sample Specific Namespaces
If you wish to sample specific namespaces, the admin user requires the following privileges:
listCollections for each database where all collections are sampled
find on each collection or each database where all collections are sampled
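When only individual collections are sampled, the find privilege can be scoped to those collections. The following is a minimal sketch under stated assumptions: the role name ordersSamplingReader and the collection test.orders are hypothetical, and the role is assumed to live in the admin database.

```javascript
// Hypothetical role name and collection (assumptions for illustration only).
const ordersSamplingRole = {
  role: "ordersSamplingReader",
  privileges: [
    // find is granted on a single collection, so listCollections is not
    // included: no database is sampled in full.
    { resource: { db: "test", collection: "orders" }, actions: ["find"] }
  ],
  roles: []
};

// Only create the role when running inside the mongo shell, where `db` exists.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createRole(ordersSamplingRole);
}
```

With a role like this, mongosqld would be started with --sampleNamespaces naming the same collection, for example 'test.orders'.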
Alternatively, create a user with the built-in readAnyDatabase role. For an example of creating a user with this role, see the Sample All Namespaces section.
Note
Be aware of all privileges included with the readAnyDatabase role before granting it to a user.
The following example creates a custom role in the mongo shell with the minimum required privileges to sample every collection in the test database:
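A minimal sketch of such a role is shown here. It assumes the role is named samplingReader, the name that the createUser step assigns, and that it is created in the admin database. The privilege set follows the list in this section: listCollections and find on the test database.

```javascript
// Role name taken from the createUser step; the admin database is an assumption.
const samplingReaderRole = {
  role: "samplingReader",
  privileges: [
    // An empty collection string matches every non-system collection in test.
    { resource: { db: "test", collection: "" }, actions: ["listCollections", "find"] }
  ],
  roles: []
};

// Only create the role when running inside the mongo shell, where `db` exists.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createRole(samplingReaderRole);
}
```

Because the user in the next step refers to the role by name only, that user would need to be created in the same database as the role.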
Create a new user and assign the newly created role to them
db.createUser(
  {
    user: "<username>",
    pwd: "<password>",
    roles: [ "samplingReader" ]
  }
)
Note
The user in the example above does not have the listDatabases privilege, so you must specify a database to sample data from with the --sampleNamespaces option when running mongosqld.
Start mongosqld with authentication enabled
Run mongosqld with authentication enabled and use the --sampleNamespaces option to sample data from all collections in the test database:
mongosqld --auth --mongo-username <username> --mongo-password <password> \
  --sampleNamespaces 'test.*'