When you use Data Integration to synchronize data from an Alibaba Cloud instance such as RDS, Hive, or Kafka, you must configure cross-account authorization if you select Alibaba Cloud Instance Mode as the data source type and the instance belongs to a different Alibaba Cloud account than the one that runs the sync task. The authorization grants the account that runs the sync task read permissions on the instance so that data can be synchronized.
Background information
When you add a data source, you can select Alibaba Cloud Instance Mode as the data source type. If the instance and the DataWorks workspace belong to different Alibaba Cloud accounts, you must configure cross-account authorization as described in this topic.
Prerequisites
A network connection solution, such as Cloud Enterprise Network (CEN), is configured to allow communication between the VPC of the data source instance and the VPC of the DataWorks resource group. For more information, see Network connection solutions.
Cross-account configuration flow
The cross-account authorization configuration flow for an RDS, Hive, or Kafka data source is as follows:
Operations in the Alibaba Cloud account that owns the data source instance
Log on to the Resource Access Management (RAM) console and go to the Roles page to create a RAM role. For more information, see Create a RAM role for a trusted Alibaba Cloud account.
Key parameters:
Set Select Trusted Entity to Alibaba Cloud Account.
Enter a custom RAM Role Name.
Set Select Account to Other Alibaba Cloud Account and enter the UID of the Alibaba Cloud account that owns the DataWorks workspace.
Grant permissions to the RAM role. For more information, see Grant permissions to a RAM role.
Key parameters:
For Authorization Policy, select System Policy.
For Policy Name, see the following table.
Instance type | Policy Name
RDS (MySQL, SQL Server, PostgreSQL, MariaDB) | AliyunDataWorksAccessingRdsReadOnlyPolicy
Hive | AliyunDataWorksAccessingDLFReadOnlyPolicy, AliyunDataWorksAccessingEMRReadOnlyPolicy
Kafka | AliyunDataWorksAccessingAlikafkaPolicy
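The policy table above can be expressed as a lookup. The following is a minimal sketch, not part of the documented console procedure: it assumes you prefer to attach the system policies with the Alibaba Cloud CLI (`aliyun ram AttachPolicyToRole`) instead of the RAM console. The dictionary keys, the function name, and the role name `dataworks-cross-account-role` are hypothetical. The code only builds the commands and does not run them.

```python
# Policy names are copied from the table above; the keys are illustrative.
POLICIES_BY_INSTANCE_TYPE = {
    "rds": ["AliyunDataWorksAccessingRdsReadOnlyPolicy"],
    "hive": [
        "AliyunDataWorksAccessingDLFReadOnlyPolicy",
        "AliyunDataWorksAccessingEMRReadOnlyPolicy",
    ],
    "kafka": ["AliyunDataWorksAccessingAlikafkaPolicy"],
}


def attach_policy_commands(instance_type: str, role_name: str) -> list[list[str]]:
    """Return one CLI argument list per system policy required for the instance type."""
    policies = POLICIES_BY_INSTANCE_TYPE[instance_type.lower()]
    return [
        [
            "aliyun", "ram", "AttachPolicyToRole",
            "--PolicyType", "System",
            "--PolicyName", policy,
            "--RoleName", role_name,
        ]
        for policy in policies
    ]


if __name__ == "__main__":
    # Hypothetical role name; Hive needs two policies, so two commands are printed.
    for argv in attach_policy_commands("hive", "dataworks-cross-account-role"):
        print(" ".join(argv))
```

Hive is the only instance type in the table that needs two policies, which is why the helper returns a list of commands rather than a single command.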
Modify the trust policy of the RAM role. For more information, see Modify the trusted entity of a RAM role.
Trust policy:
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "<UID of the Alibaba Cloud account that owns the DataWorks workspace>@cdp.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
Note: Replace <UID of the Alibaba Cloud account that owns the DataWorks workspace> with the UID of the Alibaba Cloud account that owns your DataWorks workspace.
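To avoid hand-editing the placeholder, the trust policy can be generated from the UID. The following is a minimal sketch that reproduces the policy structure shown above. The function name `build_trust_policy` is hypothetical, and the UID in the usage example is a made-up placeholder, not a real account.

```python
import json


def build_trust_policy(dataworks_owner_uid: str) -> str:
    """Return the trust policy JSON for the account that owns the DataWorks workspace."""
    uid = dataworks_owner_uid.strip()
    if not uid:
        raise ValueError("UID must not be empty")
    policy = {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                # Service principal format taken from the trust policy in this topic.
                "Principal": {"Service": [f"{uid}@cdp.aliyuncs.com"]},
            }
        ],
        "Version": "1",
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder UID for illustration only.
    print(build_trust_policy("1234567890123456"))
```

Paste the printed JSON into the trust policy editor of the RAM role.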
Operations in the Alibaba Cloud account that owns DataWorks
Go to the Data Integration page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, open the Data Integration page. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.
Add an RDS, Hive, or Kafka data source.
Key parameters:
Parameter | Description
Data Source Type | Select Alibaba Cloud Instance Mode.
Account Of Instance | Select Other Cloud Account or Other Alibaba Cloud Account. Note: Select an option based on your data source configuration.
UID of Other Alibaba Cloud Account | Enter the UID of the Alibaba Cloud account that owns the RDS, Hive, or Kafka instance.
RAM Role for Authorization | Enter the name of the RAM role that you created.
Test the network connectivity.
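Before you fill in the data source form, it can help to collect the key parameters in one place and confirm that none are blank. The following is a minimal sketch of such a checklist. The class and field names are hypothetical and only mirror the parameter labels in this topic; they are not a DataWorks API. The example values are placeholders.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CrossAccountDataSource:
    """Key parameters for a cross-account data source (names are illustrative)."""

    data_source_type: str       # "Alibaba Cloud Instance Mode"
    account_of_instance: str    # the "other account" option shown for your data source
    uid_of_other_account: str   # UID of the account that owns the instance
    ram_role_name: str          # name of the RAM role created in that account

    def missing_fields(self) -> list[str]:
        """Return the names of parameters that are empty or whitespace only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


if __name__ == "__main__":
    source = CrossAccountDataSource(
        data_source_type="Alibaba Cloud Instance Mode",
        account_of_instance="Other Alibaba Cloud Account",
        uid_of_other_account="1234567890123456",  # placeholder UID
        ram_role_name="",                         # not filled in yet
    )
    print(source.missing_fields())
```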