The MongoDB Connector for Spark provides integration between MongoDB and Apache Spark.
Note
Version 10.x of the MongoDB Spark Connector is an all-new connector based on the latest Spark API. Install and migrate to version 10.x to take advantage of new capabilities, such as tighter integration with Spark Structured Streaming.
Version 10.x uses the new namespace com.mongodb.spark.sql.connector.MongoTableProvider. This allows you to use old versions of the connector (versions 3.x and earlier) in parallel with version 10.x.
To learn more about the new connector and its advantages, see the MongoDB announcement blog post.
With the connector, you have access to all Spark libraries for use with MongoDB datasets: Dataset for analysis with SQL (benefiting from automatic schema inference), streaming, machine learning, and graph APIs. You can also use the connector with the Spark Shell.
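As a minimal sketch of SQL analysis over a MongoDB collection, the PySpark snippet below reads a collection through the 10.x connector's `mongodb` data source and queries it as a temporary view. The connection URI, the `test` database, the `people` collection, and the `name` and `age` fields are hypothetical placeholders. The Maven coordinate assumes a Scala 2.12 build of Spark. The helper names `read_options`, `load_collection`, and `run_example` are illustrative and not part of the connector.

```python
from typing import Dict

# Short data source name registered by version 10.x of the connector.
MONGODB_FORMAT = "mongodb"


def read_options(uri: str, database: str, collection: str) -> Dict[str, str]:
    """Build the per-read options passed to the connector."""
    return {
        "connection.uri": uri,
        "database": database,
        "collection": collection,
    }


def load_collection(spark, options: Dict[str, str]):
    """Load a collection as a DataFrame; the connector infers the schema."""
    reader = spark.read.format(MONGODB_FORMAT)
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


def run_example() -> None:
    """Needs pyspark installed and a reachable MongoDB deployment."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("mongodb-sql-example")
        # Assumed coordinate: adjust the Scala suffix to match your Spark build.
        .config(
            "spark.jars.packages",
            "org.mongodb.spark:mongo-spark-connector_2.12:10.5.0",
        )
        .getOrCreate()
    )
    # Hypothetical deployment, database, and collection.
    options = read_options("mongodb://localhost:27017", "test", "people")
    df = load_collection(spark, options)
    df.printSchema()  # shows the inferred schema
    df.createOrReplaceTempView("people")
    spark.sql("SELECT name, age FROM people WHERE age > 30").show()
    spark.stop()
```

Calling `run_example()` starts a local Spark session and runs the query. The option-building step is kept separate from the session so the same options can be reused across several reads.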
The MongoDB Spark Connector is compatible with the following versions of Apache Spark and MongoDB:
MongoDB Connector for Spark | Spark Version | MongoDB Version |
---|---|---|
10.5.0 | 3.1 through 3.5 | 6.0 or later |