
Configure Auto-Scaling

Atlas uses auto-scaling for cluster compute to help you optimize resource utilization and cost by adjusting your cluster tier. Atlas uses storage auto-scaling to help you automatically increase your cluster storage capacity.

In this section, you can learn about:

  • Cluster Tier Auto-Scaling: Predictive and Reactive

  • Reactive Auto-Scaling for Cluster Tier

  • Predictive Auto-Scaling for Cluster Tier

  • Auto-Scaling for Cluster Storage

  • Configure Auto-Scaling Options

  • Alerts on Auto-Scaling Events

  • View Auto-Scaling Events in the Activity Feed

  • MongoDB Support for Atlas Auto-Scaling

Atlas uses reactive and predictive auto-scaling for cluster tiers. Atlas chooses its auto-scaling mechanism based on your cluster's type, tier, and workload pattern.

  • Reactive auto-scaling. Atlas uses thresholds, and not prediction, to trigger scaling events based on current resource usage. Reactive auto-scaling occurs after sustained high or low resource usage. To learn more, see Reactive Auto-Scaling for Cluster Tier.

  • Predictive auto-scaling. Atlas uses machine learning to anticipate future scaling needs based on historical usage patterns and attempts to trigger scaling events before the forecasted workload spike arrives.

    Predictive auto-scaling is an extension of cluster tier auto-scaling and falls back to reactive auto-scaling. Atlas continues to rely on reactive auto-scaling to manage unexpected spikes in workload that aren't cyclical or predictable. Atlas uses predictive auto-scaling for eligible clusters. To learn more, see Predictive Auto-Scaling for Cluster Tier.

Important

If you create a cluster in the Atlas UI and it's eligible for both reactive and predictive auto-scaling, Atlas enables both mechanisms by default for the new cluster. Atlas then applies each auto-scaling mechanism based on your cluster's type, tier, and workload. If you use the Atlas Administration API, you must explicitly enable auto-scaling.

Note

Auto-Scaling Term Usage

Throughout the Atlas documentation, whenever the term auto-scaling appears without the word "predictive", it refers to the reactive auto-scaling mechanism. See also predictive auto-scaling.

You can configure the cluster tier ranges that Atlas uses to automatically scale your cluster tier, storage capacity, or both in response to cluster usage.

To optimize resource utilization and improve cost profile, Atlas reactive auto-scaling detects sustained higher demand and short-term peak traffic and adjusts cluster tier based on real-time resource usage.

To help control costs, you can specify a range of maximum and minimum cluster sizes that your cluster can automatically scale to.

Reactive auto-scaling works on a rolling basis, and the process doesn't incur any downtime. Atlas maintains a primary node during this process, but the nodes are upgraded one-by-one and are unavailable while being upgraded.

To learn about recommendations for scalability, including avoiding resource drift when using infrastructure as code tools with reactive auto-scaling, see Recommendations for Atlas Scalability in the Atlas Architecture Center.

Atlas cluster tier reactive auto-scaling is available for all dedicated cluster tiers under the General and the Low-CPU cluster classes.

Note

Tier Availability

Reactive auto-scaling works on cluster tiers in General and Low-CPU classes, but not on clusters in the Local NVMe SSD class.

Atlas analyzes the following cluster metrics to determine when to reactively scale a cluster, and whether to scale the cluster tier up or down:

  • Normalized System CPU Utilization

  • System Memory Utilization

Atlas calculates System Memory Utilization based on available node memory and total memory as follows:

(memoryTotal - (memoryFree + memoryBuffers + memoryCached)) / (memoryTotal) * 100

In the previous calculation, memoryFree, memoryBuffers, and memoryCached are amounts of available memory that Atlas can reclaim for other purposes. To learn more, see System Memory in Review Available Metrics.
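
As a worked illustration of this formula, the following Python sketch computes the percentage from sample metric values. The variable names mirror the metric names above; the values themselves are hypothetical.

    # Hypothetical values in GiB; only the formula itself comes from the documentation.
    memory_total = 16.0
    memory_free = 1.0
    memory_buffers = 0.5
    memory_cached = 6.0

    # Subtract the reclaimable memory, then express the remainder as a percentage of total memory.
    reclaimable = memory_free + memory_buffers + memory_cached
    system_memory_utilization = (memory_total - reclaimable) / memory_total * 100
    print(f"System Memory Utilization: {system_memory_utilization:.1f}%")   # 53.1%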

Atlas won't scale your cluster tier if the new cluster tier would fall outside of your specified Minimum and Maximum Cluster Size range.

Atlas scales your cluster to another tier in the same class. For example, Atlas scales General clusters to other tiers in the General class, but doesn't scale General clusters to the Low-CPU class.

The exact reactive auto-scaling criteria are subject to change in order to ensure appropriate cluster resource utilization.

Important

During a migration, if you restore a snapshot with a larger size than the storage capacity of the destination cluster, the cluster does not automatically scale.

If you deploy read-only nodes and want your cluster to scale faster, consider adjusting your Replica Set Scaling Mode.

To manage dynamic workloads for your applications, Atlas reactively scales up nodes in your cluster under the conditions described in this section.

If the next cluster tier is within your Maximum Cluster Size range, Atlas scales operational nodes in your cluster up to the next tier if at least one of the following criteria is true for any cluster node of this type.

Note

The following list groups together CPU-related criteria, followed by the memory-related criteria. Within each group, criteria appear in the order from most restrictive to least restrictive, and criteria for specific cloud providers appear first, if they exist.

  • M10 and M20 clusters:

    • AWS. The average normalized Relative System CPU Utilization has exceeded 90% for the past 20 minutes and the average non-normalized Absolute System CPU Utilization for CPU steal has exceeded 30% for the past 3 minutes.

    • Azure. The average normalized Relative System CPU Utilization has exceeded 90% for the past 20 minutes and the average non-normalized Absolute System CPU Utilization for softIRQ has exceeded 10% for the past 3 minutes.

    • The average normalized Absolute System CPU Utilization has exceeded 90% of resources available to the cluster for the past 20 minutes.

    • The average normalized Relative System CPU Utilization has exceeded 75% of resources available to the cluster for the past one hour.

    • The average System Memory Utilization has exceeded 90% of resources available to the cluster for the past 10 minutes. To learn how Atlas calculates the amount of system memory utilization, see How Atlas Scales Cluster Tier.

    • The average System Memory Utilization has exceeded 75% of resources available to the cluster for the past one hour.

  • M30+ clusters:

    • The average System CPU Utilization has exceeded 90% of resources available to the cluster for the past 10 minutes.

    • The average System CPU Utilization has exceeded 75% of resources available to the cluster for the past one hour.

    • The average System Memory Utilization has exceeded 90% of resources available to the cluster for the past 10 minutes.

    • The average System Memory Utilization has exceeded 75% of resources available to the cluster for the past one hour.

These thresholds ensure that your cluster scales up quickly in response to high loads, and your application can handle spikes in traffic or usage, maintaining its performance and reliability.
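
As a rough restatement of the M30+ criteria above (not Atlas's internal implementation), the following hypothetical Python check treats each input as the averaged percentage over the stated window:

    def m30_scale_up_criteria_met(avg_cpu_10m, avg_cpu_1h, avg_mem_10m, avg_mem_1h):
        """Hypothetical restatement of the M30+ scale-up thresholds (percentages)."""
        # Atlas additionally requires the next tier to fall within the configured
        # Maximum Cluster Size and applies the cool-down periods described below.
        return (
            avg_cpu_10m > 90 or avg_cpu_1h > 75
            or avg_mem_10m > 90 or avg_mem_1h > 75
        )

    print(m30_scale_up_criteria_met(92, 70, 60, 55))   # True: the 10-minute CPU threshold applies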

Note

The conditions in this section describe operational nodes. For analytics nodes on any cloud provider, Atlas scales them up to the next tier if the average normalized System CPU Utilization or the System Memory Utilization has exceeded 75% of resources available to any cluster node for the past one hour.

To achieve optimal resource utilization and cost profile, Atlas avoids scaling up the cluster to the next tier if:

  • The M10 or M20 cluster has been scaled up in the past 20 minutes or one hour, depending on thresholds.

  • The M30+ cluster has been scaled up in the past 10 minutes or one hour, depending on thresholds.

For example, if the cluster tier has not changed since 12:00, Atlas will scale an M30+ cluster at 12:10 if the cluster's current normalized System CPU Utilization is greater than 90%.

Important

Sudden Workload Spikes

Scaling up to a greater cluster tier requires enough time to prepare backing resources. Automatic scaling may not occur when a cluster receives a burst of activity, such as a bulk insert. To reduce the risk of running out of resources, plan to scale up clusters before bulk inserts and other workload spikes.

To optimize costs, Atlas reactively scales down nodes in your cluster under the conditions described in this section.

If the next lowest cluster tier is within your Minimum Cluster Size range, Atlas scales the nodes in your cluster down to the next lowest tier if all of the following criteria are true for all nodes of the specified cluster type:

  • All nodes:

    • Atlas hasn't scaled the cluster down (manually or automatically) in the past 24 hours.

    • Atlas hasn't provisioned or unpaused the cluster in the past 24 hours.

    • Atlas hasn't stopped and restarted any cluster nodes in the past 12 hours.

  • Operational nodes:

    • The average normalized System CPU Utilization is below 45% of resources available to the cluster over at least the last 10 minutes AND the last 4 hours. Atlas uses the "4 hours average" checkpoint as an indication that the CPU load has settled down on the observed level. Atlas uses the "10 minutes average" checkpoint as an indication that no recent CPU spikes have occurred that Atlas didn't capture with the "4 hour average" checkpoint.

    • The average WiredTiger cache usage is below 90% of the maximum WiredTiger cache size for at least the last 10 minutes AND the last 4 hours at the current cluster tier size. This indicates to Atlas that the current cluster isn't overloaded.

    • The projected total System Memory Utilization at the new lower cluster tier is below 60% for at least the last 10 minutes AND the last 4 hours. Atlas calculates the projected total memory usage mentioned in the preceding statement as follows.

      Atlas measures the current memory usage and replaces the current WiredTiger cache usage size with 80% of the WiredTiger cache size on the new lower tier cluster.

      Next, Atlas checks whether the projected total memory usage would be below 60% for at least the last 4 hours and at least the last 10 minutes on the new tier size. A worked sketch of this projection appears after this list.

      Note

      Atlas includes the WiredTiger cache in its memory calculation to make it more likely that clusters with a full cache, but otherwise low traffic, will scale down. In other words, Atlas examines the size of the WiredTiger cache to determine that it can safely scale down an otherwise idle cluster with low Normalized System CPU Utilization in cases where the cluster's WiredTiger cache might reach 90% of the cluster's maximum WiredTiger cache size.

    These conditions ensure that Atlas scales down operational nodes in your cluster to prevent high utilization states.

  • Analytics nodes:

    • The average Normalized System CPU Utilization and System Memory Utilization over the past 24 hours is below 50% of resources available to the cluster.

    Note

    M10 and M20 clusters use lower thresholds to account for caps on CPU usage set by cloud providers after burst periods. These thresholds vary depending on your cloud provider and cluster tier.
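
To make the scale-down memory projection described above concrete, here is a small Python sketch. The 80% cache replacement and 60% target come from the criteria above; the memory and cache sizes, and the assumption that the projection is expressed against the lower tier's total memory, are hypothetical.

    # Hypothetical values in GB.
    current_memory_used = 12.0         # total memory currently in use
    current_wt_cache_used = 6.0        # current WiredTiger cache usage
    lower_tier_memory = 16.0           # RAM available on the next lower tier
    lower_tier_wt_cache_max = 8.0      # max WiredTiger cache on the lower tier

    # Replace the current cache usage with 80% of the lower tier's cache size.
    projected_used = current_memory_used - current_wt_cache_used + 0.8 * lower_tier_wt_cache_max
    projected_utilization = projected_used / lower_tier_memory * 100
    print(f"Projected utilization on the lower tier: {projected_utilization:.0f}%")   # 78%: above 60%, so no scale-down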

Atlas auto-scales the cluster tier for sharded clusters using the same criteria as replica sets. Atlas applies the following rules:

  • For clusters with independent shard scaling enabled, auto-scaling in Atlas evaluates and scales each shard independently. Independent shard scaling requires that the smallest shard size remains no smaller than two cluster tiers below the largest shard, to maintain availability and performance. If Atlas triggers auto-scaling for such a cluster, and scales up the largest shard, it also scales up the smaller shards if necessary to ensure consistent availability and performance.

  • If the operational or analytics nodes within a shard meet the criteria to auto-scale, only the operational or analytics nodes on that particular shard change tier.

  • The Config server replica set doesn't auto-scale.

As of API resource version 2024-08-05 of the Versioned Atlas Administration API, you can scale the cluster tier of each shard independently. This API version is a significant change to the underlying scaling model of Atlas clusters.

Warning

The 2024-08-05 API version is a significant breaking change. If you send a request with the new API to describe the shards within the cluster asymmetrically, the previous symmetrical-only API will no longer be available for that cluster. To return to a previous API version, first reconfigure the cluster to have all shards operate on the same tier.

The new API is capable of describing asymmetric clusters. The replicationSpec.numShards field is not present in the new API schema. Instead, each shard is specified by a separate replicationSpec, even for symmetric clusters in which all shards are configured the same.
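
For orientation only, the following Python sketch shows roughly how an asymmetric cluster might be described with the 2024-08-05 resource version of the Update One Cluster endpoint: one replicationSpecs entry per shard, and no numShards field. The field names, regions, tiers, and credentials here are illustrative assumptions; verify the exact schema in the Atlas Administration API reference before use.

    import requests
    from requests.auth import HTTPDigestAuth

    BASE = "https://cloud.mongodb.com/api/atlas/v2"
    VERSION = "application/vnd.atlas.2024-08-05+json"   # versioned media type

    def shard_spec(instance_size):
        """Build one replicationSpecs entry (one shard) at the given tier."""
        return {
            "regionConfigs": [{
                "providerName": "AWS",
                "regionName": "US_EAST_1",
                "priority": 7,
                "electableSpecs": {"instanceSize": instance_size, "nodeCount": 3},
                "autoScaling": {
                    "compute": {"enabled": True, "scaleDownEnabled": True,
                                "minInstanceSize": "M40", "maxInstanceSize": "M60"},
                    "diskGB": {"enabled": True},
                },
            }]
        }

    # Two shards on different tiers; each shard gets its own replicationSpecs entry.
    payload = {"replicationSpecs": [shard_spec("M50"), shard_spec("M40")]}

    resp = requests.patch(
        f"{BASE}/groups/<projectId>/clusters/<clusterName>",
        json=payload,
        headers={"Accept": VERSION, "Content-Type": VERSION},
        auth=HTTPDigestAuth("<publicKey>", "<privateKey>"),
    )
    resp.raise_for_status()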

Predictive auto-scaling is an extension of auto-scaling.

Atlas uses demand-forecasting for host resource utilization and performs preemptive scaling up of your cluster compute to ensure optimal resource utilization. With predictive auto-scaling, Atlas attempts to scale up your cluster proactively, ahead of cyclical workload spikes.

Predictive auto-scaling is powered by a machine learning model based on historical patterns. Atlas scales the cluster up if the model predicts high resource utilization. MongoDB updates the model and its criteria continuously to optimize Atlas performance.

Predictive auto-scaling has the following benefits for clusters with predictive, cyclical workloads:

  • Automatically scale up your cluster for cyclical workload patterns, for example, daily or weekly spikes.

  • Maintain consistent performance and availability during predictable high-demand periods.

  • Reduce manual scaling tasks or scheduled scripts by letting Atlas manage capacity increases.

  • Seamlessly fall back to reactive auto-scaling when changes to the cluster workload fall outside of predictable patterns and are non-cyclical or unpredictable.

The following statements describe how predictive auto-scaling works:

  • Atlas attempts to scale up your cluster instance size before the forecasted load arrives.

  • When Atlas scales your cluster predictively based on forecasted metrics, it can scale up by at most two tiers at a time.

  • Predictive auto-scaling applies only to compute, not storage.

  • Predictive auto-scaling respects existing auto-scaling minimum and maximum instance sizes.

  • In cases when Atlas can't use predictive auto-scaling to scale up the cluster, it falls back to using reactive auto-scaling.

  • Predictive auto-scaling only supports upscaling. There is no predictive down-scaling. Atlas uses reactive auto-scaling to automatically scale down the cluster when the workload decreases.

  • If predictive up-scaling is scheduled to happen within the next 1 hour, Atlas skips reactive down-scaling.

  • If you use independent shard scaling and add one or more shards after predictive auto-scaling is already enabled and active on the cluster, the new shards don't auto-scale predictively until their workload patterns have been established, about two weeks later. In the meantime, these shards use reactive auto-scaling.

Atlas uses predictive auto-scaling for eligible clusters. Eligible clusters for predictive auto-scaling must meet all of the following criteria:

  • Belong to General and Low-CPU cluster classes.

  • Have a tier that is greater than M30.

  • Have auto-scaling enabled. If you enable down-scaling, auto-scaling minimum instance size should be equal to or greater than M30.

  • Have been active for at least two weeks.

  • Not use NVMe storage or belong to the Local NVMe SSD cluster class.

In addition, the following criteria affect whether Atlas uses predictive auto-scaling for an eligible cluster:

  • Predictive auto-scaling applies only to electable and read-only nodes. Atlas doesn't use predictive auto-scaling for search or analytics nodes.

  • Predictive auto-scaling might not be able to predict non-cyclical and highly dynamic workload spikes in any eligible cluster. In these cases, Atlas relies on reactive auto-scaling.
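
Restated as a hypothetical boolean check (the dictionary keys are illustrative, not an Atlas schema), the eligibility criteria above amount to the following:

    def eligible_for_predictive_scaling(c):
        """Hypothetical restatement of the eligibility criteria above."""
        return (
            c["cluster_class"] in ("General", "Low-CPU")
            and c["tier_number"] > 30                                   # tier greater than M30
            and c["auto_scaling_enabled"]
            and (not c["scale_down_enabled"] or c["min_tier_number"] >= 30)
            and c["weeks_active"] >= 2
            and not c["uses_nvme"]                                      # no NVMe storage or Local NVMe SSD class
        )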

Atlas enables cluster storage auto-scaling by default. Atlas automatically increases cluster storage when disk space used reaches 90% for any node in the cluster.

To opt out of cluster storage scaling, un-check the Storage Scaling checkbox in the Auto-scale section.

The following considerations apply:

  • Atlas auto-scales cluster storage up only. You can manually reduce your cluster storage from the Edit Cluster page.

  • On AWS, Azure, and GCP clusters, Atlas increases cluster storage capacity to achieve 70% disk space used (see the sketch after this list). To learn more, see Change Storage Capacity or IOPS on AWS, Change Storage Capacity and IOPS on Azure, and Change Storage Capacity on Google Cloud.

  • Avoid high-speed write activity if you plan to scale up clusters. Scaling up a cluster to greater storage capacity requires sufficient time to prepare and copy data to new disks. If a cluster receives a burst of high-speed write activity, such as a bulk insert, automatic scaling might not occur due to a temporary spike in disk storage capacity. To reduce the risk of running out of disk storage, plan to scale up clusters in advance of bulk inserts and other instances of high-speed write activity.

  • Atlas disables disk auto-scaling if you specify one cluster tier class for the base nodes and another, different cluster tier class for the analytics nodes. For example, if you specify a General cluster class for operational nodes in the Base Tier, and a Low-CPU cluster class for analytics nodes in the Analytics Tier, Atlas disables disk auto-scaling with the following error message: Disk auto-scaling is not yet available for clusters with mixed instance classes.

  • Atlas disables disk reactive auto-scaling if you deploy the Base Tier and Analytics Tier nodes in different cloud provider regions.
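
As a minimal sketch of the 90% trigger and the 70% target described above (the exact new capacity that Atlas provisions is determined by Atlas; the sizing arithmetic below is an approximation):

    # Hypothetical values in GB.
    disk_capacity = 200.0
    disk_used = 181.0

    if disk_used / disk_capacity >= 0.90:       # trigger: 90% disk space used
        target_capacity = disk_used / 0.70      # target: roughly 70% disk space used afterward
        print(f"Scale storage to about {target_capacity:.0f} GB (Atlas chooses a supported size)")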

When Atlas attempts to automatically scale your cluster storage capacity as part of auto-scaling, it might need to scale your storage outside of the bounds that your current cluster tier supports. To help ensure that your cluster doesn't experience any downtime, Atlas scales your cluster tier (in addition to cluster storage) to accommodate the new storage capacity.

On Azure, if you enable auto-scaling on a cluster deployed in one of the regions that support extended storage, and the current IOPS is lower than the default IOPS for the auto-scaled disk size, Atlas increases the allotted number of IOPS in the IOPS slider and notifies you in the UI. To learn more, see Change Storage Capacity and IOPS on Azure.

Example

The maximum storage capacity for an M30 cluster is 480 GB. If you have an M30 cluster with the maximum storage allocated and your disk space used reaches 90%, a storage auto-scaling event requires raising your storage capacity to 600 GB. In this case, Atlas scales your cluster tier up to M40 because this is the lowest cluster tier that can support the new required storage capacity. On Azure, if you deployed the cluster in one of the regions that support extended storage, Atlas also automatically increases IOPS to match the IOPS level for that cluster tier.

In the event that your specified maximum cluster tier can't support the new storage capacity, Atlas:

  1. Raises your maximum cluster tier to the next lowest tier that can accommodate the new storage capacity.

  2. Scales your cluster tier to that new maximum tier.

Note

When Atlas overrides your maximum cluster tier, it also disables your cluster from automatically scaling down. To re-enable downward auto-scaling, configure it in Cluster Settings. See also Considerations for Downward Auto-Scaling of Cluster Tier and Storage.

If Atlas attempts to scale your cluster tier down and the target tier can't support your current disk capacity, provisioned IOPS, or both, Atlas doesn't scale your cluster down. In this scenario, Atlas updates your auto-scaling settings based on the relationship between your current cluster tier and the configured maximum cluster tier:

  • If the cluster is currently at the configured maximum cluster tier, Atlas disables the cluster from automatically scaling down because all smaller tiers wouldn't be able to accommodate the necessary storage settings. If you want to re-enable downward auto-scaling, you must do so manually from your Cluster Settings.

  • If the cluster isn't currently at the configured maximum cluster tier, Atlas raises the minimum cluster tier to the current cluster tier. In this case, Atlas doesn't disable downward auto-scaling.

This auto-scaling logic reduces the downtime in cases when your storage settings don't match your workload.
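
Restated as a hypothetical Python sketch (the names are illustrative, not Atlas internals):

    def on_blocked_scale_down(current_tier, max_tier, settings):
        """What Atlas does, per the rules above, when the next lower tier can't
        support the current disk capacity or provisioned IOPS."""
        if current_tier == max_tier:
            # Every smaller tier would be too small: disable automatic scale-down.
            # Re-enable it manually from Cluster Settings if needed.
            settings["scale_down_enabled"] = False
        else:
            # Keep scale-down enabled, but raise the minimum to the current tier.
            settings["min_tier"] = current_tier
        return settings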

Depending on whether you choose to use storage auto-scaling, Atlas manages the oplog entries based on either the minimum oplog retention window, or the oplog size. To learn more, see Oplog Size Behavior. Atlas enables storage auto-scaling by default.

  • Scaling down your cluster's storage capacity can take longer than expanding it due to the mechanics of the scaling process.

  • Estimate your deployment's range of workloads and then set the Minimum Cluster Size value to the cluster tier that has enough capacity to handle your deployment's workload. Account for any possible spikes or dips in cluster activity.

  • You can't scale to a cluster tier smaller than M10.

  • You can't select a minimum cluster tier that is below the current disk configuration of your cluster. If Atlas increases your cluster's storage configuration beyond what your minimum cluster tier supports, Atlas automatically adjusts your minimum cluster tier to a tier that supports the current storage requirements of your cluster.

    Example

    You have set your auto-scaling bounds to M20 - M60 and your current cluster tier is M40 with a disk capacity of 200GB. Atlas triggers a disk auto-scaling event to increase capacity to 320GB because current disk usage exceeds 180GB, which is more than 90% of the 200GB capacity.

    Atlas takes the following actions:

    1. Raises your minimum cluster tier to the next lowest tier that can accommodate the new storage capacity, M30. M20 supports a maximum storage capacity of 256GB, so it is no longer a valid auto-scaling bound.

    2. Determines that the current instance size, M40, supports the new disk configuration. The disk auto-scaling event succeeds.

You can configure auto-scaling options when you create or modify a cluster. For new clusters, Atlas automatically enables cluster tier auto-scaling and storage auto-scaling.

You can do one of the following:

  • Review and adjust the upper and lower cluster tiers that Atlas should use when auto-scaling your cluster, or

  • Opt out of using auto-scaling.

Atlas displays auto-scaling options in the Auto-scale section of the cluster builder for General and Low-CPU tier clusters.

When you create a new cluster, Atlas enables auto-scaling (predictive and reactive) for cluster tier and cluster storage. (Predictive auto-scaling only affects cluster tier and doesn't affect storage.) You don't need to explicitly enable auto-scaling. If you prefer, you can opt out for cluster tier and cluster storage.

Note

Atlas enables cluster tier auto-scaling by default when you create clusters in the Atlas UI. If you create clusters with the API, cluster auto-scaling isn't enabled by default; you must explicitly enable it using the options in the autoScaling object of the Update One Cluster in One Project endpoint.
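
For reference, the compute and storage options in that autoScaling object look roughly like the following sketch; as with the earlier API example, the field names are assumptions to verify against the API reference.

    # Sketch of the per-region autoScaling object sent with the Update One
    # Cluster in One Project endpoint; verify field names in the API reference.
    auto_scaling = {
        "compute": {
            "enabled": True,               # allow scaling up to a higher tier
            "scaleDownEnabled": True,      # allow scaling down to a lower tier
            "minInstanceSize": "M30",
            "maxInstanceSize": "M50",
        },
        "diskGB": {"enabled": True},       # storage auto-scaling
    }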

With auto-scaling enabled, your cluster can automatically:

  • Scale up to increase capability with a higher cluster tier, using either reactive or predictive auto-scaling, depending on cluster and workload eligibility.

  • Decrease the current cluster tier to a lower cluster tier using reactive auto-scaling.

In the Cluster tier section of the Auto-scale options, you can specify the Maximum Cluster Size and Minimum Cluster Size values that your cluster can automatically scale to. Atlas sets these values as follows:

  • The Maximum Cluster Size is set to one tier above your current cluster tier.

  • The Minimum Cluster Size is set to the current cluster tier.

In addition, Atlas might use predictive auto-scaling if your cluster is eligible and its workload is cyclical and predictable.

To review the enabled auto-scaling options for cluster tier and storage:

  1. With the Auto-scale checkbox selected, review the Maximum Cluster Size and Minimum Cluster Size values, and adjust them if needed.

  2. Review the Allow cluster to be scaled down option that is checked by default when you create a new cluster.

  3. Review the options under the Storage Scaling checkbox that is checked by default.

To opt out of cluster auto-scaling (increasing the cluster tier), when creating a new cluster, navigate to the Cluster Tier menu, and un-check the Cluster Tier Scaling checkbox in the Auto-scale section.

To opt out of cluster auto-scaling (decreasing the cluster tier), when creating a new cluster, navigate to the Cluster Tier menu, and un-check the Allow cluster to be scaled down checkbox in the Auto-scale section.

You can view Activity Feed to review the events for each Atlas project. When any auto-scaling event occurs, Atlas logs the event in the project Activity Feed.

Atlas records auto-scaling actions as audit events.

To view or download only auto-scaling events:

  1. In Atlas, go to the Project Activity Feed page.

    1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

    2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

    3. Click the Activity Feed & Alerts icon in the navigation bar.

    4. Click Activity Feed under the Project header.

      The Project Activity Feed page displays.

  2. In the Activity Feed, click the Filter by event(s) menu and check Atlas.

  3. In the search box above the list, start typing auto-scaling.

    On the right-hand side of the menu, all auto-scaling events display. Deselect any that you don't want to see. The feed list automatically updates with each change you make.

Important

In early August 2024, Atlas replaced legacy auto-scaling notification emails with configurable auto-scaling events. By default, Atlas continues to send all alert notifications to the project owners. You can customize your auto-scaling alert distribution to change alert recipients or a distribution method.

Auto-scaling activities are a subset of Atlas alerts.

Each time Atlas triggers any of the auto-scaling events, you receive default Atlas alerts.

You can opt out of or change alert configuration for some or all auto-scaling events at a project level.

To modify an alert configuration, in the Category section, select Atlas Auto Scaling and then select the Condition/Metric from the list. You can then modify roles for alert recipients, change a notification method, such as email or SMS, and add a notifier, such as Slack. To learn more, see Configure an Auto-Scaling Alert.

If you have any questions or concerns, contact support.
