This topic introduces the MaxCompute Query Accelerator 2.0 (MaxQA) feature and helps you understand its system architecture, application scenarios, limits, and usage methods.
Version guide
The MaxQA feature is in public preview. To participate in the public preview, submit a ticket to request activation. Additionally, join the official MaxQA user support DingTalk group (number: 87535025714), where the MaxCompute technical team will help you address related questions. For more information about the public preview and specific features, see Query Acceleration MaxQA operation guide.
Feature description
With the continuous growth of real-time and near real-time data analysis requirements, query response time has become increasingly important in modern data analysis and business applications. MaxQA (MaxCompute Query Accelerator 2.0, formerly MCQA2.0) is a query acceleration solution launched by Alibaba Cloud MaxCompute to better serve these needs. Based on dedicated query acceleration resource pools, it has been comprehensively optimized across multiple aspects including control paths, query optimizers, execution engines, storage engines, and caching mechanisms, significantly reducing query response time. It is particularly suitable for scenarios that require low and stable latency, such as Business Intelligence (BI), interactive analysis, and near real-time data warehousing.
The MaxCompute MaxQA (formerly MCQA2.0) feature provides the following capabilities:
Supports acceleration optimization for small to medium-sized data query jobs and data insertion jobs (within the TB scale), with the fastest execution time at the sub-second level.
Fully compatible with MaxCompute SQL features, including user-defined functions (UDFs), Delta Table, and Delta Live MV incremental materialized views.
Supports isolated query acceleration resource pools that provide dedicated services for a tenant and ensure high stability.
Supports customized time-sharing resource allocation rules for query acceleration resource pools and batch processing resource pools, along with automatic scaling of interactive quota groups and batch processing quota groups, improving overall resource utilization.
Features end-to-end caching, where jobs automatically cache intermediate and final results from multiple execution stages. Subsequent jobs can hit this cache at any stage, significantly accelerating query execution.
Supports multiple BI tools (FineBI, Tableau, QuickBI).
Service architecture
The core technical advantages of MaxQA are intelligent, dynamically isolated resource pools, an end-to-end caching mechanism, localized I/O, latency-optimized execution plans (QueryPlan), a simplified control path, and a more efficient execution engine.
Intelligent dynamically isolated resource pools: Each MaxQA instance is a completely isolated computing environment. A tenant can create multiple instances (corresponding to multiple interactive quota groups), avoiding interference issues common in multi-tenant environments and ensuring stable query latency.
End-to-end caching mechanism: Tables and metadata scanned by jobs, generated execution plans, intermediate results from multiple stages during execution, and query results are all automatically cached. Subsequent jobs may hit the cache at multiple stages throughout the process, accelerating execution speed. Because it is an instance-level isolated computing environment, the cache has a longer validity period and is not affected by jobs from other instances.
Localized I/O: Keeps as much I/O data as possible on local storage devices, including data from source table reads and from operations such as Shuffle and Spill. This reduces dependency on external systems and improves latency stability.
Latency-optimized execution plans: Prioritizes latency across multiple dimensions, including physical execution plan selection, concurrency calculation, and compression algorithm selection.
Simplified control path: The frontend directly connects to the coordinator, with optimized control path architecture and asynchronous transformation, improving interaction efficiency.
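The end-to-end caching idea described above can be illustrated with a toy sketch. It is not MaxQA's implementation: the StageCache class, the stage names, and the key scheme (a hash over the source identifier and the chain of stage names) are all assumptions made for illustration. It only shows how a later job can reuse an intermediate result and then compute just the stages that differ.

```python
import hashlib
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; the function consumes the previous output.
Stage = Tuple[str, Callable[[Any], Any]]


class StageCache:
    """Toy per-instance cache keyed by the chain of stages that produced a result."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def run(self, source_id: str, data: Any, stages: List[Stage]) -> Any:
        key, value = source_id, data
        for name, fn in stages:
            # The key covers the source and every stage applied so far.
            # Assumption: equal stage names always mean equal computations.
            key = hashlib.sha256(f"{key}|{name}".encode()).hexdigest()
            if key in self._store:
                value = self._store[key]  # reuse the cached intermediate result
                self.hits += 1
            else:
                value = fn(value)  # compute and cache this stage's output
                self._store[key] = value
                self.misses += 1
        return value


cache = StageCache()
rows = [3, 8, 15, 4]
keep_large = ("filter_gt_3", lambda xs: [x for x in xs if x > 3])

total = cache.run("orders@v1", rows, [keep_large, ("sum", sum)])
largest = cache.run("orders@v1", rows, [keep_large, ("max", max)])
print(total, largest, cache.hits, cache.misses)  # prints "27 15 1 3"
```

The second job shares the filter stage with the first, so it hits the cache once and computes only the final aggregation.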
The MaxQA technical architecture is shown in the following figure.
Scenarios
MaxQA covers application scenarios ranging from daily operational reports to advanced data analysis, and is particularly suitable for business scenarios with strict requirements for query response time and stability. Whether for short-term decision support or long-term strategic planning, MaxQA helps enterprises get more value from their data.
Scenario | Description | Characteristics | Examples |
Ad hoc query | Flexibly select query conditions based on actual needs, quickly obtain query results, and adjust query logic. This is suitable for data developers or data analysts who want to conduct query analysis using familiar client tools. | • Query latency requirements within seconds or tens of seconds. • Users are typically data developers or analysts with SQL skills. • Flexible selection of query conditions, quick response to changing business requirements. | • Data scientists performing exploratory data analysis. • Data engineers debugging temporary queries in ETL processes. |
Business intelligence (BI) | Use MaxCompute to build enterprise-level data warehouses, and process data through ETL into aggregate data that can be consumed by businesses. Leverage MaxQA's low latency, resource isolation, elastic concurrency, data caching, and other features to meet the requirements for highly concurrent, fast-response report generation, statistical analysis, and fixed report analysis. | • Queried data objects are typically aggregated result data. • Suitable for scenarios with small data volumes, multi-dimensional queries, fixed queries, and high-frequency queries. • Strict query latency requirements: results are returned within seconds (for example, most queries take no more than 5 seconds). | • Generating daily sales reports. • Real-time monitoring of key business metrics. • Regular generation of financial statements. |
Interactive data analysis | Self-service BI tools and interactive data exploration platforms make it easy for non-technical users to perform complex data analysis. These tools typically implement dynamic filtering, sorting, aggregation, and other functions through a series of short queries, providing a flexible and intuitive operational experience. | • Supports drag-and-drop operations without the need to write complex SQL statements. • Returns query results quickly so that users can iterate on their analysis. • Suitable for data analysts at various levels, from beginners to experts. | • Using Tableau or FineBI for visual analysis. • Data exploration on online data analysis platforms. |
Detailed queries and analysis of large amounts of data | MaxQA can automatically identify the characteristics of query jobs, quickly respond to and process small-scale jobs, and automatically match the resource requirements of large-scale jobs, meeting the needs of analysts to analyze queries of different scales and complexities. | • The amount of historical data to be explored is large, but the actual amount of effective data needed is not large. • Query latency requirements are moderate, between immediacy and batch processing. • Users are typically business analysts who need to explore business patterns from detailed data, discover business opportunities, and verify business hypotheses. | • User behavior path analysis. • Customer segmentation and profile building. • Product usage pattern mining. |
Limits
Only DDL, DML, and DQL statements can be executed in MaxQA. Other operations, such as permission management statements, Tunnel-related commands, and resource uploads or downloads, are not supported.
MaxQA supports user-defined functions (UDFs). To ensure security, each UDF is launched in an isolated environment. To prevent dramatic performance fluctuations, at most 50% of the resources in a MaxQA instance can be used to run UDFs.
For DQL statements, a maximum of 1 million rows are returned by default. You can raise this limit by setting the odps.sql.select.auto.limit parameter to a larger value. Set it carefully based on actual business needs, because returning too many rows may affect execution efficiency.
Jobs that require resident Workers in the execution plan, such as Distributed MapJoin, are not currently supported.
If a MaxQA job fails because of these limits, you must manually retry it or submit it to a batch processing quota group.
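Because MaxQA does not fall back automatically, the retry-then-resubmit step has to live in client code. The following is a minimal sketch of that pattern. The submit callable, the MaxQAJobError exception, and the quota group names are hypothetical placeholders, not part of any MaxCompute SDK. Passing odps.sql.select.auto.limit as a hint dictionary is also an assumption; how the parameter is actually delivered depends on the client you use.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class MaxQAJobError(Exception):
    """Hypothetical error raised by the submit callable when a job fails."""


# Hypothetical signature: submit(sql, quota_name, hints) -> result
SubmitFn = Callable[[str, str, Dict[str, str]], Any]


def run_with_fallback(
    submit: SubmitFn,
    sql: str,
    maxqa_quota: str,
    batch_quota: str,
    select_limit: Optional[int] = None,
    max_retries: int = 1,
) -> Tuple[Any, str]:
    """Try the interactive quota group first, then the batch quota group.

    Returns the job result and the name of the quota group that ran it.
    """
    hints: Dict[str, str] = {}
    if select_limit is not None:
        # Raises the default 1-million-row cap on DQL results.
        hints["odps.sql.select.auto.limit"] = str(select_limit)

    # One initial attempt plus max_retries manual retries on MaxQA.
    for _ in range(1 + max_retries):
        try:
            return submit(sql, maxqa_quota, hints), maxqa_quota
        except MaxQAJobError:
            continue

    # MaxQA has no automatic fallback, so resubmit to the batch quota group.
    return submit(sql, batch_quota, hints), batch_quota
```

A caller would wrap its real submission function in submit and inspect the returned quota group name to see whether the job ran interactively or in batch mode.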
System parameter description for different CU specifications
Number of CUs | Maximum number of parallel jobs | Job timeout | Maximum per-job concurrency |
32CU | 32 | 120 min | Number of CUs × 30 |
64CU | 48 | 120 min | Number of CUs × 30 |
96CU | 64 | 120 min | Number of CUs × 30 |
128CU | 80 | 120 min | Number of CUs × 30 |
160CU | 96 | 120 min | Number of CUs × 30 |
192CU | 112 | 120 min | Number of CUs × 30 |
224CU | 128 | 120 min | Number of CUs × 30 |
[256, 1024)CU | 144 | 120 min | Number of CUs × 30 |
[1024, 1536)CU | 288 | 120 min | Number of CUs × 30 |
[1536, 2048)CU | 432 | 180 min | Number of CUs × 30 |
[2048, 2560)CU | 576 | 240 min | Number of CUs × 30 |
[2560, 3072)CU | 720 | 300 min | Number of CUs × 30 |
[3072, 3584)CU | 864 | 360 min | Number of CUs × 30 |
[3584, 4096)CU | 1008 | 420 min | Number of CUs × 30 |
[4096, 4608)CU | 1152 | 480 min | Number of CUs × 30 |
[4608, 5120)CU | 1296 | 540 min | Number of CUs × 30 |
[5120, 5632)CU | 1440 | 600 min | Number of CUs × 30 |
[5632, 6144)CU | 1584 | 660 min | Number of CUs × 30 |
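The table can be read as a simple lookup. The sketch below encodes the rows above and derives the per-job concurrency as the number of CUs × 30. The MaxQALimits class and limits_for function are illustrative names, and rejecting CU counts that are not listed in the table (for example, 100) is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxQALimits:
    max_parallel_jobs: int
    job_timeout_min: int
    max_job_concurrency: int


# Fixed specifications below 256 CUs: CU count -> maximum parallel jobs.
# All of them share a 120-minute job timeout.
_SMALL_SPECS = {32: 32, 64: 48, 96: 64, 128: 80, 160: 96, 192: 112, 224: 128}

# Range specifications: (lower bound inclusive, upper bound exclusive,
# maximum parallel jobs, job timeout in minutes).
_RANGE_SPECS = [
    (256, 1024, 144, 120),
    (1024, 1536, 288, 120),
    (1536, 2048, 432, 180),
    (2048, 2560, 576, 240),
    (2560, 3072, 720, 300),
    (3072, 3584, 864, 360),
    (3584, 4096, 1008, 420),
    (4096, 4608, 1152, 480),
    (4608, 5120, 1296, 540),
    (5120, 5632, 1440, 600),
    (5632, 6144, 1584, 660),
]


def limits_for(cus: int) -> MaxQALimits:
    """Return the system limits for a MaxQA instance with the given CU count."""
    if cus in _SMALL_SPECS:
        return MaxQALimits(_SMALL_SPECS[cus], 120, cus * 30)
    for low, high, jobs, timeout in _RANGE_SPECS:
        if low <= cus < high:
            return MaxQALimits(jobs, timeout, cus * 30)
    raise ValueError(f"{cus} CUs is not a specification listed in the table")


print(limits_for(64))
print(limits_for(2048))
```

For example, a 64 CU instance allows 48 parallel jobs with a per-job concurrency of 1920, and a 2048 CU instance allows 576 parallel jobs with a 240-minute timeout.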
TPC-DS performance test results
Results may vary slightly by region. Treat the results of your own tests as authoritative.
Specification | 10GB | 100GB | 1TB |
64CU | 468s | 672s | 1978s |
128CU | 319s | 418s | 1001s |
The above performance test report was obtained from the test environment in the China (Beijing) region.
For detailed test plans and content, see TPC-DS Performance Testing.
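To see how the results scale with instance size, the figures in the table can be turned into speedup ratios. This is a small worked calculation, assuming each cell is the total elapsed time of the test run at that data scale; the variable and function names are illustrative.

```python
# Elapsed time in seconds, copied from the TPC-DS results table.
runtime_s = {
    "64CU": {"10GB": 468, "100GB": 672, "1TB": 1978},
    "128CU": {"10GB": 319, "100GB": 418, "1TB": 1001},
}


def speedup(scale: str) -> float:
    """Speedup obtained by doubling the instance from 64 CUs to 128 CUs."""
    return runtime_s["64CU"][scale] / runtime_s["128CU"][scale]


for scale in ("10GB", "100GB", "1TB"):
    print(f"{scale}: {speedup(scale):.2f}x")
# 10GB: 1.47x
# 100GB: 1.61x
# 1TB: 1.98x
```

Under that assumption, doubling the CUs brings the 1TB run close to a 2x speedup, while the smaller data sets gain less.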
MaxQA vs. MCQA
Comparison item | MCQA | MaxQA (MCQA2.0) |
Architecture | Based on Serverless resource pools. | Single-tenant isolated computing environment. |
Latency stability | Average. | Good. |
Computing performance | Significantly better than offline mode, but stability is insufficient. | Incorporates multiple optimizations and delivers better performance. |
Supported job types | Only supports DQL. | All types of SQL capabilities, including DDL, DQL, and DML. |
Usage method | Enable interactive mode. | Specify the name of the interactive quota group when submitting a job. For more information, see MaxQA feature connection methods. |
Quota routing | Supported. | Not currently supported. |
Pay-as-you-go | Supported. | Not currently supported. |
Session concept | Yes. Jobs submitted from the same client in adjacent time periods may belong to one session, with each session corresponding to an Instance ID. | No. Each SQL job corresponds to an Instance ID. |
Fallback mechanism | Has the ability to automatically fall back to batch processing mode. | Does not support automatic fallback. |
Usage method
For specific usage methods of MaxQA, see Query Acceleration MaxQA operation guide.