When DataWorks features such as Data Integration, DataService Studio, metadata acquisition, and DataAnalysis access a data source that uses a whitelist for access control, you must add the outbound IP addresses or CIDR blocks of these features to the data source's whitelist. This ensures that the features can run as expected.
Background information
DataWorks features use different network paths to access data sources. Therefore, if you enable whitelist control for a data source, you must add whitelist entries based on the feature that you use:
Resource group-related features, such as Data Integration and DataService Studio: These features use the resource group's network to access the data source. To enable task execution, add the vSwitch CIDR block or public IP address of the resource group to the data source's whitelist.
Platform service features, such as metadata acquisition and DataAnalysis: These features use DataWorks-maintained service nodes to initiate access requests. These nodes are separate from the resource group network. Therefore, you must add the platform's dedicated IP CIDR blocks to the data source's whitelist. This ensures that the whitelist includes all outbound nodes in the access path and prevents feature failures that are caused by missing whitelist authorization.
Prerequisites
A network connection is established between the data source and the resource group. For more information, see Overview of network connectivity solutions.
Obtain the whitelist
Whitelist for attaching computing resources and adding data sources
Serverless resource group
Get the internal IP CIDR block of the resource group
This section applies to scenarios where the data source and DataWorks are connected over an internal network. You must add the CIDR block of the vSwitch that is attached to the resource group to the data source's whitelist.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
In the Actions column of the target resource group, click Network Settings to open the VPC Binding page.
In the Data Scheduling & Data Integration section, view the VSwitch CIDR Block.
Add the vSwitch CIDR block that you found to the data source's whitelist.
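The following minimal sketch shows what a CIDR block entry covers once it is in a whitelist. It uses Python's standard ipaddress module to check whether a given source address matches any entry. The function name is_covered and the sample addresses are hypothetical and are not taken from DataWorks. A real data source may apply its own matching rules.

```python
import ipaddress
from typing import Iterable


def is_covered(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if source_ip matches any whitelist entry (single IP or CIDR block)."""
    addr = ipaddress.ip_address(source_ip)
    for entry in whitelist:
        # strict=False also accepts entries with host bits set, such as "192.168.0.5/24".
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if addr in network:
            return True
    return False


# Hypothetical values for illustration only.
sample_whitelist = ["192.168.0.0/24", "203.0.113.10"]
print(is_covered("192.168.0.37", sample_whitelist))  # True: inside the /24 block
print(is_covered("192.168.1.37", sample_whitelist))  # False: outside every entry
```

A check like this can help confirm that the vSwitch CIDR block you copied from the console actually covers the address from which a task connects.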
Get the public IP address of the resource group
This section applies to scenarios where the data source and DataWorks are connected over the Internet. You must add the elastic IP addresses (EIPs) of the resource group to the data source's whitelist.
By default, serverless resource groups do not provide public network access. To access data sources over the public network, you must configure an Internet NAT gateway and EIPs for the VPC that is attached to the resource group.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
In the Actions column for the target resource group, click Network Settings to open the VPC Binding page.
In the Data Scheduling & Data Integration section, find the attached VPC and click the icon next to it to open its Basic Information page.
Switch to the Resource Management tab. In the Public Network Access Service area, click the number under Internet NAT Gateway to view the list of Internet NAT gateways for the VPC.
On the Internet NAT gateways page, view the attached EIPs.
Add the EIPs that you found to the data source's whitelist.
Exclusive resource group for Data Integration (Legacy)
Get the internal IP CIDR block of the resource group
This section applies to scenarios where the data source and DataWorks are connected over an internal network. You must add the CIDR block of the vSwitch that is attached to the resource group to the data source's whitelist.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
In the Actions column for the target resource group, click Network Settings to open the VPC Binding page.
Find the VPC attached to the resource group and view its VSwitch CIDR Block.
Add the vSwitch CIDR block that you found to the data source's whitelist.
Get the public IP address of the resource group
This section applies to scenarios where the data source and DataWorks are connected over the Internet. You must add the elastic IP addresses (EIPs) of the resource group to the data source's whitelist.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
For the target resource group, click Details in the Actions column.
On the details page, view the EIP address of the resource group.
Add the EIP address that you found to the data source's whitelist.
Shared resource group for Data Integration
If you use the legacy shared resource group for Data Integration, you must add the IP addresses listed in Whitelist for the shared resource group for Data Integration to the data source's whitelist.
Whitelist for DataService Studio
Serverless resource group
Get the internal IP CIDR block of the resource group
This section applies to scenarios where the data source and DataWorks are connected over an internal network. You must add the CIDR block of the vSwitch that is attached to the resource group to the data source's whitelist.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
In the Actions column of the target resource group, click Network Settings to open the VPC Binding page.
Under DataService Studio, you can view the corresponding VSwitch CIDR Block.
Note: If no VPC and vSwitch are attached to DataService Studio, click Add Binding. After they are attached, obtain the vSwitch CIDR block.
Add the vSwitch CIDR block that you found to the data source's whitelist.
Get the public IP address of the resource group
This section applies to scenarios where the data source and DataWorks are connected over the Internet. You must add the elastic IP addresses (EIPs) of the resource group to the data source's whitelist.
By default, serverless resource groups do not provide public network access. To access data sources over the public network, you must configure an Internet NAT gateway and EIPs for the VPC that is attached to the resource group.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
In the Actions column for the target resource group, click Network Settings to open the VPC Binding page.
In the DataService Studio section, find the attached VPC and click the icon next to the VPC to open the Basic Information page.
Switch to the Resource Management tab. In the Public Network Access Service area, click the number under Internet NAT Gateway to view the list of Internet NAT gateways for the VPC.
On the Internet NAT gateways page, view the attached EIPs.
Add the EIPs that you found to the data source's whitelist.
Exclusive resource group for DataService Studio (Legacy)
Get the internal IP CIDR block of the resource group
This section applies to scenarios where the data source and DataWorks are connected over an internal network. You must add the CIDR block of the vSwitch that is attached to the resource group to the data source's whitelist.
Go to the Resource Groups page in the DataWorks console. In the top navigation bar, select the region where the target resource group is located, and then find the target resource group in the list.
For the target resource group, click Details in the Actions column.
Identify the VSwitch that is attached to the resource group. Then, go to the Virtual Private Cloud (VPC) console, search for the vSwitch, and obtain its IPv4 CIDR Block.
Add the vSwitch CIDR block that you found to the data source's whitelist.
Shared resource group for DataService Studio
If you use the shared resource group for DataService Studio, you must add the IP addresses listed in Whitelist for the shared resource group for DataService Studio to the data source's whitelist.
Whitelist for metadata acquisition
If a data source used for metadata acquisition has a whitelist for access control, you must add the IP addresses listed in the metadata acquisition whitelist to that data source's whitelist.
Whitelist for DataAnalysis
If whitelist control is enabled for the target MaxCompute project used for DataAnalysis, you must add the IP addresses listed in the DataAnalysis whitelist to the MaxCompute project's whitelist.
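After you collect entries from the sections above, it can help to validate and consolidate them before you paste them into a data source's whitelist. The following sketch, again based on Python's standard ipaddress module, rejects malformed entries, removes duplicates, and merges adjacent CIDR blocks. The function name build_whitelist, the sample addresses, and the comma-separated output format are assumptions for illustration. Check the format that your data source actually expects.

```python
import ipaddress
from typing import Iterable


def build_whitelist(entries: Iterable[str]) -> str:
    """Validate, de-duplicate, and merge entries; return a comma-separated string."""
    networks = []
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue
        # ip_network raises ValueError for malformed input, so bad entries fail early.
        networks.append(ipaddress.ip_network(entry, strict=False))

    # collapse_addresses cannot mix IPv4 and IPv6, so handle each version separately.
    v4 = [n for n in networks if n.version == 4]
    v6 = [n for n in networks if n.version == 6]
    merged = list(ipaddress.collapse_addresses(v4)) + list(ipaddress.collapse_addresses(v6))

    # Write single hosts as plain IP addresses and everything else as CIDR blocks.
    parts = [
        str(n.network_address) if n.prefixlen == n.max_prefixlen else str(n)
        for n in merged
    ]
    return ",".join(parts)


# Hypothetical values for illustration only.
collected = ["192.168.0.0/25", "192.168.0.128/25", "203.0.113.10", "203.0.113.10"]
print(build_whitelist(collected))  # 192.168.0.0/24,203.0.113.10
```

Merging is optional. Keeping one entry per resource group or feature can make the whitelist easier to audit later.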
Add a whitelist
If your data source is an Alibaba Cloud product, see the relevant document to add the IP addresses that you found to the data source's whitelist. The following documents describe how to configure whitelists for some common Alibaba Cloud products. For data sources that are not listed, see their official documentation.
Configure a whitelist for Cloud-native Database PolarDB Distributed Edition
Configure an Internet whitelist for OpenSearch Vector Search Edition
If your data source is not an Alibaba Cloud product, see its official documentation to learn how to configure a whitelist.
References
For answers to frequently asked questions (FAQ) about network connectivity, see Resource group operations and network connectivity.
For answers to frequently asked questions about adding a whitelist, see Frequently asked questions about adding a whitelist.