To run tasks in DataWorks, such as data synchronization and scheduling, you must ensure network connectivity between the virtual private cloud (VPC) attached to your resource group and your data source. A data source can be a database, a DataService Studio service, or other data. This topic describes network connectivity solutions for different network environments.
Background information
Most DataWorks features, such as creating data sources, data synchronization, DataAnalysis, data collection, and DataService Studio, operate on connected data sources or computing resources. If the data source that you need to access is not in the VPC attached to the current DataWorks resource group, for example because it is in another VPC or in an on-premises data center (IDC), you must select a suitable network connectivity solution to connect the network of the data source to that VPC.
For example, during data synchronization, the VPC attached to the resource group must be connected to both the source and destination networks.
Prerequisites
You have purchased a resource group with the appropriate specifications. For more information, see Use a Serverless resource group.
For more information about resource groups, see Overview of DataWorks resource groups.
The network connectivity solutions in this topic apply only to Serverless resource groups and the following types of legacy resource groups: exclusive resource groups for Data Integration, exclusive resource groups for scheduling, and exclusive resource groups for DataService Studio.
Precautions
You can attach a VPC to a Serverless resource group to access data sources or addresses in complex network environments over an internal network. Note that Serverless resource groups do not have public network access by default. To access a data source or network over the Internet, you must configure an Internet NAT gateway and elastic IP addresses (EIPs) for the VPC attached to the Serverless resource group. For more information, see Connect to a data source over the Internet.
The speed and stability of tasks are not guaranteed over the Internet. For data synchronization, use an internal network or Cloud Enterprise Network (CEN).
Connectivity between the resource group and the data source is a prerequisite for successful task execution.
Data interaction between resource groups and classic network environments is not supported. You must migrate your data sources or services from a classic network to a VPC environment.
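Because connectivity is a prerequisite for task execution, it can help to probe the data source endpoint before you schedule a task. The following is a minimal sketch, not a DataWorks feature or API. It assumes that you run it from a host in the same VPC that is attached to the resource group, and the function names can_reach and check_sync_endpoints are hypothetical. It only tests whether a TCP connection can be opened; it does not validate credentials or access control settings.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_sync_endpoints(source, destination, timeout: float = 3.0) -> dict:
    """Probe both ends of a synchronization task.

    `source` and `destination` are (host, port) tuples. A synchronization
    task needs both to be reachable from the VPC attached to the resource group.
    """
    return {
        "source": can_reach(source[0], source[1], timeout=timeout),
        "destination": can_reach(destination[0], destination[1], timeout=timeout),
    }
```

For example, calling check_sync_endpoints(("10.0.1.15", 3306), ("10.0.2.20", 5432)) with your own internal addresses and ports (the values here are placeholders) returns a dictionary that shows which side, if any, is unreachable.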
Network connectivity solutions
The network connectivity solution that you choose depends on the network environments of your data source and your DataWorks workspace resource group. Select a solution based on your needs:
Solution 1: Connect to an Alibaba Cloud data source (same account and region)
Scenarios
Use this solution if your data source and DataWorks workspace meet the following conditions:
The data source is an Alibaba Cloud product.
The data source and the DataWorks workspace belong to the same Alibaba Cloud account.
The data source and the DataWorks workspace are in the same region.
Solution description
In a scenario with the same account and region, you can use a VPC connection. You can deploy the DataWorks workspace resource group and the data source in the same VPC to enable network communication.
Network connectivity diagram
Configure network connectivity
For more information about the solution and procedure, see Connect to a data source in the same account and region.
Solution 2: Connect to an Alibaba Cloud data source (same account, different regions)
Scenarios
Use this solution if your data source and DataWorks workspace meet the following conditions:
The data source is an Alibaba Cloud product.
The data source and the DataWorks workspace belong to the same Alibaba Cloud account.
The data source and the DataWorks workspace are in different regions.
Solution description
In a scenario with the same account but different regions, you can use a VPC connection. You can use a network connectivity tool, such as Cloud Enterprise Network (CEN) or a VPC peering connection, to connect the VPC of the DataWorks workspace resource group to the VPC of the data source. This enables network communication.
Network connectivity diagram
Configure network connectivity
For more information about the solution and procedure, see Connect to a data source in the same account but different regions.
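When you connect two VPCs through CEN or a VPC peering connection, overlapping CIDR blocks are a common cause of routing conflicts. This topic does not state that requirement, so treat the following as a planning aid under that assumption rather than as part of the documented procedure. The sketch uses the Python standard library ipaddress module; the function name cidrs_overlap and the sample CIDR blocks are hypothetical.

```python
import ipaddress


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share at least one address."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return net_a.overlaps(net_b)


# Hypothetical example: resource group VPC versus data source VPC.
resource_group_vpc = "192.168.0.0/16"
data_source_vpc = "172.16.0.0/12"
if cidrs_overlap(resource_group_vpc, data_source_vpc):
    print("CIDR blocks overlap; review the address plan before connecting the VPCs.")
else:
    print("CIDR blocks do not overlap.")
```

The same check applies when the two VPCs belong to different accounts.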
Solution 3: Connect to an Alibaba Cloud data source (different accounts)
Scenarios
Use this solution if your data source and DataWorks workspace meet the following conditions:
The data source is an Alibaba Cloud product.
The data source and the DataWorks workspace belong to different Alibaba Cloud accounts.
Solution description
In a scenario that involves different accounts, you can use a VPC connection. You can use a network connectivity tool, such as Cloud Enterprise Network (CEN) or a VPC peering connection, to connect the VPC of the data source in Account A to the VPC of the DataWorks workspace resource group in Account B. This enables network communication.
Network connectivity diagram
Configure network connectivity
For more information about the solution and procedure, see Connect to a data source in a different account.
Solution 4: Connect to a data source deployed on an ECS instance
Scenarios
Use this solution if your data source meets the following condition:
The data source is deployed on an Alibaba Cloud ECS instance.
Solution description
If the ECS instance with the data source and the DataWorks workspace are in the same account and region, you can use a VPC connection. You can deploy the DataWorks workspace resource group and the ECS instance in the same VPC to enable network communication.
If the ECS instance with the data source and the DataWorks workspace are in different accounts, or in the same account but different regions, you can use a VPC connection. You can use a network connectivity tool, such as Cloud Enterprise Network (CEN) or a VPC peering connection, to connect the VPC of the DataWorks workspace resource group to the VPC of the ECS instance. This enables network communication.
Network connectivity diagram
Same account and region
Same account, different regions
Different accounts
Configure network connectivity
For more information about the solution and procedure, see Connect to a self-managed data source on an ECS instance.
Solution 5: Connect to an IDC data source
Scenarios
Use this solution if your data source meets the following condition:
The data source is deployed in an on-premises data center (IDC).
Solution description
If your data source is in an on-premises data center (IDC), you can use a VPC connection. You can use a network connectivity tool, such as Express Connect, to connect the on-premises network of the data source to the VPC of the DataWorks workspace resource group. This enables network communication.
Network connectivity diagram
Configure network connectivity
For more information about the solution and procedure, see Connect to an on-premises IDC data source.
Solution 6: Connect to a data source over the Internet
Scenarios
Use this solution if your data source meets the following condition:
The data source has a public endpoint.
Solution description
Serverless resource groups do not have public network access by default. To access a data source over the Internet, you must configure an Internet NAT gateway and EIPs for the VPC attached to the resource group.
Legacy resource groups have public network access and can connect directly.
Note: Legacy resource groups are being phased out. We recommend that you use Serverless resource groups.
Network connectivity diagram
The diagram applies only to Serverless resource groups. Legacy resource groups have EIPs attached by default and can connect directly.
Configure network connectivity
For more information about the solution and procedure, see Connect to a data source over the Internet.
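The six solutions above can be summarized as a small decision sketch. The code below only restates the scenario conditions from this topic; the names Location, DataSource, and choose_solution are hypothetical and do not correspond to any DataWorks API.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    ALIBABA_CLOUD_PRODUCT = "alibaba_cloud_product"  # Solutions 1 to 3
    ECS_SELF_MANAGED = "ecs_self_managed"            # Solution 4
    IDC = "idc"                                      # Solution 5
    PUBLIC_ENDPOINT = "public_endpoint"              # Solution 6


@dataclass
class DataSource:
    location: Location
    same_account: bool = True  # same Alibaba Cloud account as the workspace
    same_region: bool = True   # same region as the workspace


def choose_solution(ds: DataSource) -> str:
    """Map the scenario conditions in this topic to a solution name."""
    if ds.location is Location.IDC:
        return "Solution 5: connect the IDC to the VPC, for example with Express Connect"
    if ds.location is Location.PUBLIC_ENDPOINT:
        return "Solution 6: Internet access (Serverless resource groups need an Internet NAT gateway and EIPs)"
    if ds.location is Location.ECS_SELF_MANAGED:
        if ds.same_account and ds.same_region:
            return "Solution 4: deploy the resource group and the ECS instance in the same VPC"
        return "Solution 4: connect the VPCs with CEN or a VPC peering connection"
    # Remaining case: the data source is an Alibaba Cloud product.
    if not ds.same_account:
        return "Solution 3: different accounts, connect the VPCs with CEN or a VPC peering connection"
    if not ds.same_region:
        return "Solution 2: same account, different regions, connect the VPCs with CEN or a VPC peering connection"
    return "Solution 1: same account and region, use the same VPC"
```

For example, choose_solution(DataSource(Location.ALIBABA_CLOUD_PRODUCT, same_region=False)) returns the Solution 2 entry.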
References
For more information about resource groups, see Overview of DataWorks resource groups.
To create and use a resource group, see Use a Serverless resource group.
To attach a VPC to a resource group, see Attach a virtual private cloud.
You can configure an Internet NAT gateway for the VPC and vSwitch attached to the resource group. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
For answers to frequently asked questions about network connectivity, see Resource group operations and network connectivity.