DataWorks: Solution 6: Connect to a data source over the Internet

Last Updated: Sep 15, 2025

This topic uses a MySQL instance with a public endpoint as an example to describe how to connect a data source to DataWorks over the Internet.

Scenarios

If your data source meets the following condition, we recommend that you use this solution:

  • A public endpoint is configured for the data source.

Solution description

  • By default, serverless resource groups cannot access the Internet. If you want to use a serverless resource group to access a data source that is deployed on the Internet, you must configure an Internet NAT gateway for the VPC with which the resource group is associated and associate an EIP with the Internet NAT gateway.

  • Legacy exclusive resource groups can access the Internet. If you use a legacy exclusive resource group, you can directly establish a network connection between the resource group and the data source.

    Note

    Legacy exclusive resource groups are being phased out. We recommend that you use serverless resource groups.

Network connectivity diagram

[Figure: network connectivity diagram]

Prerequisites

Billing

To use a serverless resource group, you must configure an Internet NAT gateway for its VPC and attach an Elastic IP Address (EIP). For more information about the billing of Internet NAT gateways and EIPs, see NAT Gateway billing and EIP billing overview.

Configure network connectivity

Note

This section describes the general procedure for configuring network connectivity between a data source and DataWorks to help you quickly understand the core logic. For a detailed walkthrough, see the Configuration example section in this topic.

Step 1: Obtain basic information

Data source side

  • Public IP address of the server where the data source resides

    Connect to the server where the data source resides to obtain its public IP address. Alternatively, you can contact the network administrator to obtain the public IP address.
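Before you continue, it can help to confirm that the address you obtained is really a public address and not a private or loopback address. The following is a minimal sketch that uses only the Python standard library. The function name is_public_ip and the sample addresses are illustrative assumptions and are not part of DataWorks.

```python
import ipaddress


def is_public_ip(address: str) -> bool:
    """Return True if `address` is a globally routable IP address."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        # Not a valid IPv4/IPv6 literal (for example, a hostname or a typo).
        return False


if __name__ == "__main__":
    # Hypothetical sample values; replace them with the address you obtained.
    for candidate in ("47.117.0.1", "192.168.1.10", "10.0.0.5", "not-an-ip"):
        print(candidate, is_public_ip(candidate))
```

If the function returns False for the address of your data source, the address is likely a private address, and this solution does not apply as described.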

DataWorks side

Serverless resource group

Information about the VPC and vSwitch bound to the resource group

  1. Go to the Resource Group page in the DataWorks console. Find the resource group that you want to manage and click Network Settings in the Actions column.

  2. In the corresponding feature module, you can view the attached VPC and vSwitch.

    For example, to synchronize data from a MySQL database that has a public endpoint to DataWorks, you can view the corresponding VPC and vSwitch information under Data Scheduling & Data Integration.

Legacy exclusive resource group

EIP address of the resource group

  1. Go to the Resource Group page in the DataWorks console. Find the target resource group and click Details in the Actions column to open the resource group details page.

  2. Obtain the EIP address.

Step 2: Establish a network connection

  • Serverless resource group: By default, serverless resource groups cannot access the public network. To enable access to data sources on the public network, you must configure an Internet NAT gateway and an EIP for the VPC that is attached to the resource group.

  • Legacy exclusive resource group: Legacy exclusive resource groups can access the Internet directly.

Note

If errors occur when you configure network connectivity, submit a ticket to contact technical support of the related Alibaba Cloud service.

Step 3: (Optional) Add an IP address to a whitelist

If the data source is controlled by a whitelist, add the public IP address of the resource group to the whitelist to allow access.

The following steps use a MySQL database as an example to show how to restrict an account so that it can access the database only from the public IP address that is bound to the resource group.

  1. Log on to the database as an administrator.

  2. Create an account to access the data source from DataWorks and grant permissions to the account.

    -- "dataworks_user" is the username. You can customize it.
    -- "StrongPassword123!" is the password. You can customize it.
    CREATE USER 'dataworks_user'@'<Public IP address bound to the resource group>' IDENTIFIED BY 'StrongPassword123!';

    -- Grant the user permissions to access a specific database, such as mydatabase, from the public IP address bound to the resource group.
    GRANT ALL PRIVILEGES ON mydatabase.* TO 'dataworks_user'@'<Public IP address bound to the resource group>' WITH GRANT OPTION;

  3. Run the FLUSH PRIVILEGES; command to refresh the permissions and then exit the database (exit).
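If you manage several resource groups or databases, you might prefer to generate these statements from a script. The following is a minimal sketch under stated assumptions: the helper build_whitelist_statements is hypothetical, the password is left as a placeholder, and the statements match the ones shown in the preceding steps. The helper rejects anything that is not a literal IP address, so the host part of the account cannot accidentally become a wildcard such as '%'.

```python
import ipaddress
import re

# Conservative rule for user and database names (an assumption of this sketch).
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_whitelist_statements(user: str, source_ip: str, database: str) -> list[str]:
    """Build the CREATE USER / GRANT statements for one allowed source IP."""
    # Raises ValueError if source_ip is not a literal IPv4 or IPv6 address.
    ipaddress.ip_address(source_ip)
    for name in (user, database):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported characters in identifier: {name!r}")
    account = f"'{user}'@'{source_ip}'"
    return [
        # '<password>' is a placeholder; supply the real password when you run the SQL.
        f"CREATE USER {account} IDENTIFIED BY '<password>';",
        f"GRANT ALL PRIVILEGES ON {database}.* TO {account} WITH GRANT OPTION;",
        "FLUSH PRIVILEGES;",
    ]


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address used here as a stand-in.
    for statement in build_whitelist_statements("dataworks_user", "203.0.113.10", "mydatabase"):
        print(statement)
```

Review the generated statements before you run them as the database administrator.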

Test the network connectivity

  1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Integration. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.

  2. In the navigation pane on the left, click Data Source. On the Data Sources page, click Add Data Source. Select a data source type and configure the connection parameters as required.

  3. In the resource group list at the bottom, select the resource group that is connected to the data source and click Test Connectivity.

    Note

    If the connectivity test result is Failed, you can use the Connectivity Diagnosis Tool to troubleshoot the issue. If you still cannot connect the resource group to the data source, please submit a ticket.
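When a connectivity test fails, a quick independent check is to see whether the public endpoint of the data source accepts TCP connections at all. The following is a minimal sketch that uses only the Python standard library. The function name can_reach is an illustrative assumption. The check runs from the machine on which you execute it, not from the resource group, so it only indicates whether the endpoint is listening and does not replace Test Connectivity in DataWorks.

```python
import socket


def can_reach(host: str, port: int = 3306, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace it with the public IP address of your data source.
    print(can_reach("203.0.113.10", 3306, timeout=3.0))
```

If this check succeeds from your machine but the test in DataWorks fails, the cause is more likely the NAT gateway, the SNAT entry, or the whitelist than the data source itself.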

Configuration example

This section provides an example of how to configure network connectivity. In this example, a MySQL instance with a public endpoint and a DataWorks workspace in the China (Shanghai) region are used.

1. Basic information

Parameter | Data source (RDS MySQL) | DataWorks resource group
Region | - | China (Shanghai)
Network information | Public IP address: 47.117.XX.XX | VPC name: vpc-shanghai; vSwitch: sh-l

2. Establish a network connection

This step applies only to serverless resource groups. Use an Internet NAT gateway to enable public network access for the VPC that is bound to the resource group. Legacy resource groups are bound to EIPs by default and do not require this configuration.

Note

If errors occur when you configure network connectivity, submit a ticket to contact technical support of the related Alibaba Cloud service.

  1. In the DataWorks console, go to the Resource Group page. Find the target resource group and click Network Settings in the Actions column.

  2. In the corresponding feature module, find the bound VPC and click the icon next to its name to open the Basic Information page.

    For example, to connect a MySQL instance with a public endpoint to DataWorks for data synchronization, go to the Data Scheduling & Data Integration section, find the VPC, and click the icon next to its name.

  3. Switch to the Resource Management tab. In the Public Network Access Service area, under Internet NAT Gateway, click Create Now to enable public network access for the VPC bound to the resource group.

    Configure the following key parameters:

    Parameter | Value
    VPC and Associated VSwitch | Select the same VPC and vSwitch that are bound to the resource group.
    Access Mode | Select SNAT-enabled Mode.
    Elastic IP Address Instance | Select Purchase Elastic IP.
    Create Service-linked Role | If you are creating a NAT gateway for the first time, you must create a service-linked role. Click Create Service-linked Role.

  4. Click Buy Now and complete the payment to create the NAT Gateway instance.

  5. After the NAT Gateway instance is created, click Return to Console to create an SNAT entry for the new NAT Gateway instance.

    Note

    Resources in the VPC can access the Internet only after an SNAT entry is configured.

    1. Click Manage in the Actions column for the new instance. On the management page of the NAT Gateway instance, click the SNAT Management tab.

    2. In the SNAT Entry List section, click Create SNAT Entry. The key parameters are described below:

      Parameter | Value
      SNAT Entry Granularity | Select VPC Granularity. This ensures that all resource groups in the VPC of the NAT gateway can access the Internet through the configured EIP.
      Select Elastic IP Address | Select the EIP that is bound to the current NAT Gateway instance.

      After you configure the parameters for the SNAT entry, click OK to create the entry.

    In the SNAT Entry List, when the Status of the new SNAT entry changes to Available, the VPC bound to the resource group can access the Internet.
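To make the role of the SNAT entry concrete, the following conceptual sketch models how a private address in the VPC maps to the EIP that the data source sees. This is not an Alibaba Cloud API. The function egress_ip, the CIDR block, and the addresses are hypothetical values chosen for illustration, and the model ignores details such as multiple entries with different priorities.

```python
import ipaddress
from typing import Optional


def egress_ip(private_ip: str, snat_entries: dict[str, str]) -> Optional[str]:
    """Return the EIP that a private address would use for outbound traffic.

    `snat_entries` maps a source CIDR block to the EIP of its SNAT entry.
    Returns None when no entry covers the address, which models the rule
    that resources cannot access the Internet without an SNAT entry.
    """
    source = ipaddress.ip_address(private_ip)
    for cidr, eip in snat_entries.items():
        if source in ipaddress.ip_network(cidr):
            return eip
    return None


# Hypothetical values: a VPC-granularity entry covers the whole VPC CIDR block.
entries = {"192.168.0.0/16": "203.0.113.10"}
print(egress_ip("192.168.12.34", entries))  # 203.0.113.10
print(egress_ip("172.16.0.8", entries))     # None
```

The address returned for the resource group is the one that the data source observes as the source of the connection, which is why the EIP of the NAT gateway is the address to add to the whitelist.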

3. Add an IP address to a whitelist

  1. Obtain the public IP address that is bound to the resource group.

    Serverless resource group

    1. Go to the VPC console. In the navigation pane on the left, choose NAT Gateway > Internet NAT Gateway.

    2. Find the Internet NAT gateway that you created. In the Elastic IP Address column, you can find the Elastic IP Address.

    Legacy resource group

    1. Go to the Resource Group page in the DataWorks console. Find the target resource group and click Details in the Actions column to go to the details page of the resource group.

    2. Obtain the EIP address.

  2. Log on to the database as an administrator.

  3. Create an account to access the data source from DataWorks and grant permissions to the account.

    -- "dataworks_user" is the username. You can customize it.
    -- "StrongPassword123!" is the password. You can customize it.
    CREATE USER 'dataworks_user'@'<Public IP address bound to the resource group>' IDENTIFIED BY 'StrongPassword123!';

    -- Grant the user permissions to access a specific database, such as mydatabase, from the public IP address bound to the resource group.
    GRANT ALL PRIVILEGES ON mydatabase.* TO 'dataworks_user'@'<Public IP address bound to the resource group>' WITH GRANT OPTION;

  4. Run the FLUSH PRIVILEGES; command to refresh the permissions and then exit the database (exit).

4. Test the network connectivity

  1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Integration. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.

  2. In the navigation pane on the left, click Data Source. On the Data Source List page, click Add Data Source.

  3. Select the MySQL data source type and configure its parameters.

    • Set Configuration Mode to Connection String Mode.

    • Set Host Address ID to the public IP address of the MySQL server. In this example, the IP address is 47.117.XX.XX.

    • Set Port Number to 3306.

    • For Database Name, specify the name of an existing database.

    • Username and Password: Enter the username and password of the dataworks_user account that you created in step 3, Add an IP address to a whitelist.

  4. In the Connection Configuration section, click Test Connectivity for the resource group bound to the workspace. Verify that the result is Connected.

    Note

    If the test fails, you can click Self-service Troubleshoot to resolve the issue. If the test still fails after troubleshooting, submit a ticket.