To develop and manage MaxCompute tasks in DataWorks, you must first bind a MaxCompute project as a computing resource to your DataWorks workspace. After the project is bound, you can use the computing resource in DataWorks to connect to the MaxCompute project and perform operations such as data synchronization, data development, and data analysis.
Limits
Regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia).
You can bind a MaxCompute project as a computing resource only if the project and the DataWorks workspace are in the same region and belong to the same Alibaba Cloud account.
Permissions:
Product
Operator
Required permissions
DataWorks
Alibaba Cloud account
No additional permissions are required.
RAM user/RAM role
Only workspace members who have the O&M and Workspace Administrator roles or the AliyunDataWorksFullAccess permission can create computing resources. You can grant permissions to members in the workspace.
MaxCompute
RAM user/RAM role
When you bind a computing resource: The operator must have the odps:ListProjects permission for MaxCompute and the Super_Administrator permission for the target MaxCompute project.
When used as the default access identity: The identity must have the Admin or Super_Administrator permission for the MaxCompute project. After the computing resource is bound, this identity is added to the MaxCompute production project and granted the Role_Project_Scheduler role.
Production data in the current workspace is owned by the default access identity for the production environment that is specified when the computing resource is created. To perform operations on or access production tables, other accounts must apply for the required permissions in Security Center.
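The region and account limits above can be expressed as a small pre-check. The following is a minimal sketch, not a DataWorks API: the `Resource` record, its fields, and the `binding_problems` function are hypothetical names introduced for illustration, and the region list is copied from the Limits section.

```python
from dataclasses import dataclass
from typing import List

# Regions copied from the Limits section of this topic.
SUPPORTED_REGIONS = {
    "China (Hangzhou)", "China (Shanghai)", "China (Beijing)",
    "China (Shenzhen)", "China (Chengdu)", "China (Hong Kong)",
    "Japan (Tokyo)", "Singapore", "Malaysia (Kuala Lumpur)",
    "Indonesia (Jakarta)", "Germany (Frankfurt)", "UK (London)",
    "US (Silicon Valley)", "US (Virginia)",
}


@dataclass(frozen=True)
class Resource:
    """Hypothetical record for a MaxCompute project or a DataWorks workspace."""
    name: str
    region: str
    account_id: str


def binding_problems(project: Resource, workspace: Resource) -> List[str]:
    """Return the reasons why the project cannot be bound (empty if none)."""
    problems = []
    if workspace.region not in SUPPORTED_REGIONS:
        problems.append(f"region {workspace.region!r} is not supported")
    if project.region != workspace.region:
        problems.append("project and workspace are in different regions")
    if project.account_id != workspace.account_id:
        problems.append("project and workspace belong to different accounts")
    return problems
```

An empty list means that the region and account limits are satisfied. The permission requirements in the table above still need to be checked separately.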
Prerequisites
MaxCompute is activated in the same region as the DataWorks workspace, and a MaxCompute project is created.
A workspace is created in DataWorks. The Resource Access Management (RAM) user that you use to perform these operations has been added to the workspace and granted the Workspace Administrator role.
Note: DataWorks provides basic mode and standard mode workspaces. When you create a workspace, be aware of the differences between basic mode and standard mode.
A resource group is attached to the workspace, and network connectivity is established.
If you use a Serverless resource group, make sure that the MaxCompute computing resource can connect to the Serverless resource group.
If you use a legacy exclusive resource group, make sure that the MaxCompute computing resource can connect to the exclusive resource group for scheduling for the specific scenario.
New Data Development: Bind a MaxCompute computing resource
This section describes how to bind a MaxCompute computing resource to a workspace that is part of the public preview of the new Data Development (DataStudio).
Go to the computing resources page
Log on to the DataWorks console. In the top navigation bar, select the region where your workspace resides. In the navigation pane on the left, click . From the drop-down list, select the workspace that you want to manage and click Go To Management Center.
In the navigation pane on the left, click Computing Resources to go to the Computing Resources page.
Bind the MaxCompute computing resource
On the Computing Resources page, configure the parameters to bind the MaxCompute computing resource.
Select the computing resource type to bind.
Click Bind Computing Resource or Create Computing Resource to go to the Bind Computing Resource page.
On the Bind Computing Resource page, set the computing resource type to MaxCompute. You are redirected to the Bind MaxCompute Computing Resource configuration page.
Configure the MaxCompute computing resource.
On the Bind MaxCompute Computing Resource configuration page, configure the parameters as described in the following table.
Parameter
Description
MaxCompute Project
Select the MaxCompute project that you want to bind. You can create an internal MaxCompute project or create an external MaxCompute project. After the project is created, select the new project.
Note: If you create a standard mode workspace, you must select different MaxCompute projects for the development and production environments.
For more information, see Billable items and billing methods of MaxCompute.
If you cannot select the target MaxCompute project, grant the Super_Administrator permission for the project to the current logon account.
Default Access Identity
Specify the identity that is used to access the MaxCompute project in the current workspace.
Development environment: Currently, access is supported only with the Executor identity.
Production environment: Access is supported with an Alibaba Cloud account, a RAM user, or a RAM role.
Note: Only Alibaba Cloud accounts and users or roles with the AdministratorAccess permission can select all identity types.
Production data in the current workspace belongs to the default access identity for the production environment that is specified when the computing resource is created. To operate on or access production tables, other accounts must apply for the required permissions in Security Center. For more information, see Control access to MaxCompute data and Overview of Approval Center.
Endpoint
Specify the endpoint that DataWorks uses to access the MaxCompute project through this computing resource. The endpoint includes the endpoint of the MaxCompute service and the endpoint of the Tunnel service that is used to upload or download local data or cloud computing resource data. The following options are supported:
Automatic Adaptation: DataWorks automatically adapts to the actual situation. We recommend that you select this option.
Custom Configuration: If you select this option, you must manually configure the MaxCompute endpoint and Tunnel endpoint. Endpoints vary based on the region.
Computing Resource Instance Name
The name that is used to identify the computing resource. At runtime, the computing resource instance name is used to select the computing resource on which a task runs.
Test the connectivity.
In the Connection Configuration section, select the resource group that DataWorks uses to run MaxCompute tasks and click Test Connectivity to verify that the resource group can access your MaxCompute project. For more information, see Network connection solutions.
Note: If no resource group is available, you can add and bind a Serverless resource group to the workspace. Then, go to the Computing Resources page of the workspace to test the connectivity to the computing resource.
Click Create to complete the configuration.
Note: After the binding is complete, a MaxCompute data source with the same name is automatically created on the Data Source page of the current workspace.
After the computing resource is bound, the platform grants permissions to the specified access identity. This access identity is added to the MaxCompute project and mapped to the corresponding MaxCompute permissions. Before the authorization is complete, the connectivity test may fail and report a permission error. If this occurs, save the computing resource configuration and try again after a few moments.
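The rules in the parameter table can be collected into a checklist before you open the console. The sketch below is illustrative only and assumes nothing about how DataWorks stores the configuration: the `MaxComputeBinding` class, its field names, and the identity strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Identity types that the parameter table lists for the production environment.
# The string values are hypothetical labels, not DataWorks constants.
PROD_IDENTITIES = {"alibaba_cloud_account", "ram_user", "ram_role"}


@dataclass
class MaxComputeBinding:
    """Hypothetical container for the values entered on the configuration page."""
    instance_name: str
    workspace_mode: str                      # "basic" or "standard"
    prod_project: str
    dev_project: Optional[str] = None        # used in standard mode only
    prod_identity: str = "alibaba_cloud_account"
    endpoint_mode: str = "automatic"         # "automatic" or "custom"
    maxcompute_endpoint: Optional[str] = None
    tunnel_endpoint: Optional[str] = None


def validate(binding: MaxComputeBinding) -> List[str]:
    """Return a list of rule violations; an empty list means the values look consistent."""
    errors = []
    if not binding.instance_name:
        errors.append("a computing resource instance name is required")
    if binding.workspace_mode == "standard":
        if not binding.dev_project:
            errors.append("standard mode needs a development project")
        elif binding.dev_project == binding.prod_project:
            errors.append("development and production projects must differ")
    if binding.prod_identity not in PROD_IDENTITIES:
        errors.append(f"unsupported production identity: {binding.prod_identity!r}")
    if binding.endpoint_mode == "custom" and not (
        binding.maxcompute_endpoint and binding.tunnel_endpoint
    ):
        errors.append("custom configuration needs both MaxCompute and Tunnel endpoints")
    return errors
```

For example, `validate(MaxComputeBinding("mc_prod", "standard", "proj", dev_project="proj"))` reports that the development and production projects must differ.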
Legacy Data Development: Bind a MaxCompute computing resource
This section describes how to bind a MaxCompute computing resource to a workspace that is not part of the public preview of the new Data Development (DataStudio).
Go to the computing resources page
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
In the navigation pane on the left, click the icon to go to the Computing Resources page.
Bind the MaxCompute computing resource
On the Computing Resources page, configure the parameters to bind the MaxCompute computing resource.
Select the computing resource type to bind.
Click Create Computing Resource to go to the Create Computing Resource page.
On the Create Computing Resource page, set the computing resource type to MaxCompute. You are redirected to the Create Computing Resource configuration page.
Configure the MaxCompute computing resource.
On the Create Computing Resource configuration page, configure the parameters as described in the following table.
Parameter
Description
Authentication Method
New computing resources can be authenticated only using an Alibaba Cloud account or a RAM role.
Alibaba Cloud Account
Only MaxCompute projects that belong to the Current Alibaba Cloud Account can be used as computing resources for the current workspace.
MaxCompute Project Name
Select the MaxCompute project that you want to bind. If the target project does not exist, create a MaxCompute project.
Note: If you create a standard mode workspace, you must select different MaxCompute projects for the development and production environments.
For more information, see Billable items and billing methods of MaxCompute.
If you cannot select the target MaxCompute project, grant the Super_Administrator permission for the project to the current logon account.
Region
Select the region where the MaxCompute project resides. If the selected MaxCompute project is not in the same region as the current workspace, you cannot bind the MaxCompute project as a computing resource.
Default Access Identity
Specify the identity that is used to access the computing resource in the current workspace.
Development environment: Only the Executor identity is supported.
Production environment: The Alibaba Cloud account, RAM user, and RAM role identities are supported.
Note: Only Alibaba Cloud accounts and users or roles with the AdministratorAccess permission can select all identity types.
Production data in the current workspace belongs to the default access identity for the production environment that is specified when the computing resource is created. To operate on or access production tables, other accounts must apply for the required permissions in Security Center. For more information, see Control access to MaxCompute data and Overview of Approval Center.
Endpoint
Specify the endpoint that DataWorks uses to access the MaxCompute project through this computing resource. The endpoint includes the endpoint of the MaxCompute service and the endpoint of the Tunnel service that is used to upload or download local data or cloud computing resource data. The following options are supported:
Automatic Adaptation: DataWorks automatically adapts to the actual situation. We recommend that you select this option.
Custom Configuration: If you select this option, you must manually configure the MaxCompute endpoint and Tunnel endpoint. Endpoints vary based on the region.
Test the connectivity.
In the Connection Configuration section, select the resource group that DataWorks uses to run MaxCompute tasks and click Test Connectivity to verify that the resource group can access your MaxCompute project. For more information, see Network connection solutions.
Note: If no resource group is available, you can add and bind a Serverless resource group to the workspace. Then, go to the Computing Resources page of the workspace to test the connectivity to the computing resource.
Click Create Computing Resource And Bind To Data Development to complete the configuration.
Note: After the binding is complete, a MaxCompute data source with the same name is automatically created on the Data Source page of the current workspace.
After the computing resource is bound, the platform grants permissions to the specified access identity. This access identity is added to the MaxCompute project and mapped to the corresponding MaxCompute permissions. Before the authorization is complete, the connectivity test may fail and report a permission error. If this occurs, save the computing resource configuration and try again after a few moments.
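If you script your own checks, the advice to try again after a few moments amounts to a bounded retry. The following is a minimal sketch under stated assumptions: `test` is a hypothetical callable that you supply, returning `True` on success and raising `PermissionError` while the authorization is still being applied. It does not call any DataWorks API, and the attempt count and delay are arbitrary defaults.

```python
import time
from typing import Callable


def retry_connectivity_test(
    test: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `test` up to `attempts` times, waiting between failed attempts."""
    for attempt in range(1, attempts + 1):
        try:
            if test():
                return True
        except PermissionError:
            # Authorization may not have been applied yet; wait and retry.
            pass
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

The `sleep` parameter exists only so that the wait can be replaced in tests. If the function still returns `False` after all attempts, check the permissions of the access identity instead of retrying further.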
What to do next
After you bind a MaxCompute computing resource, a MaxCompute data source is automatically created for the workspace. You can use this data source in Data Integration, a database node (new Data Development), or a database node (legacy Data Development).
FAQ
Issue: When a MaxCompute computing resource is scheduled, the following error is reported:
connect timed out, the possible reason is that the endpoint `http://service.odps.aliyun.com/api` is wrong, please check your endpoint
Solution: Verify that the endpoint of the MaxCompute computing resource is correctly configured. You must specify the VPC endpoint for the region where the resource resides.
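To see whether a given endpoint is reachable at the network level, a plain TCP connection attempt is often enough. The sketch below uses only the Python standard library. The function names are illustrative, and the result is meaningful only when the script runs in the same network environment as the resource group that runs the task.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def endpoint_host_port(endpoint: str) -> Tuple[str, int]:
    """Split an endpoint URL into host and port, defaulting to 80 or 443."""
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"not an http(s) endpoint: {endpoint!r}")
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def endpoint_reachable(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint succeeds within `timeout` seconds."""
    host, port = endpoint_host_port(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False
```

A `False` result for the configured endpoint is consistent with the `connect timed out` error above and suggests that a different endpoint, such as the VPC endpoint for the region, is needed.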
Issue: When you test the connectivity of the computing resource, the following error is reported:
You have NO privilege 'odps:Read' on {acs:odps:*:projects/xxx}
Solution: Verify that the status of your MaxCompute project is Normal. If the project is in the Suspended state, go to the MaxCompute console and click Resume for the MaxCompute project that you want to bind.
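When this error appears in logs, it can help to extract the privilege and the resource so that you know which project to inspect. The following is a small sketch that assumes the message has exactly the shape shown above. The pattern and the function name are illustrative, not part of any MaxCompute SDK.

```python
import re
from typing import Optional, Tuple

# Matches messages shaped like the error shown above.
_PRIVILEGE_ERROR = re.compile(r"You have NO privilege '([^']+)' on \{([^}]+)\}")


def parse_privilege_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (privilege, resource) from the error text, or None if it does not match."""
    match = _PRIVILEGE_ERROR.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The resource string ends with the project name, which is the project whose status you should check in the MaxCompute console.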