Create and drop tables - MaxCompute - Alibaba Cloud Documentation Center

This topic describes how to use SQL statements to create and drop tables.

Type

Feature

Role

Execution environments

Create a table

Creates a non-partitioned table, a partitioned table, an external table, or a clustered table.

Users with the CreateTable permission for the project.

You can execute the statements on the following platforms:

Drop a table

Drops a partitioned table or a non-partitioned table.

Users who have the DROP permission on tables

Create a table

Creates a non-partitioned table, a partitioned table, an external table, or a clustered table.

Limits

A partitioned table can have a maximum of six levels of partitions. For example, if a table uses date as a partition key column, the partition levels can be year/month/week/day/hour/minute.
By default, a table can have a maximum of 60,000 partitions. You can adjust the maximum number of partitions in a table as needed.

For more information about the limits on tables, see SQL limits.

Syntax

Create an internal table (including a non-partitioned table and a partitioned table).

CREATE [OR REPLACE] TABLE [IF NOT EXISTS] <table_name> ( <col_name> <data_type>, ... ) [comment <table_comment>] [PARTITIONED BY (<col_name> <data_type> [comment <col_comment>], ...)] [AUTO PARTITIONED BY (<auto_partition_expression> [AS <auto_partition_column_name>]) [TBLPROPERTIES('ingestion_time_partition'='true')] ];

Create a clustered table.

CREATE TABLE [IF NOT EXISTS] <table_name> ( <col_name> <data_type>, ... ) [CLUSTERED BY | RANGE CLUSTERED BY (<col_name> [, <col_name>, ...]) [SORTED BY (<col_name> [ASC | DESC] [, <col_name> [ASC | DESC] ...])] INTO <number_of_buckets> buckets];

Create an external table.

The following example shows how to create an OSS external table using the built-in text data parser. For more information, see ORC external tables.

CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name> ( <col_name> <data_type>, ... ) STORED AS '<file_format>' [WITH SERDEPROPERTIES (options)] LOCATION '<oss_location>';

Create a table and specify the table type.

Create a transactional table. You can update or delete the data of the table after the table is created. However, specific limits are imposed on the transactional table. We recommend that you create a transactional table as needed.
```
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> ( <col_name <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ... [comment <table_comment>] [TBLPROPERTIES ("transactional"="true")];
```

Create a Delta table. You can use the primary key to perform operations on the table. For example, you can perform upserts, incremental queries, and time travel queries on the table.

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> ( <col_name <data_type> [NOT NULL] [DEFAULT <default_value>] [comment <col_comment>], ... [PRIMARY KEY (<pk_col_name>[, <pk_col_name2>, ...] )]) [comment <table_comment>] [TBLPROPERTIES ("transactional"="true" [, "write.bucket.num" = "N", "acid.data.retain.hours"="hours"...])] [LIFECYCLE <days>];

Create a table based on an existing table.
- Create a table based on an existing table and replicate data from the existing table to the new table. Partition properties are not replicated. You can create a table based on an existing external table or an existing table of an external project that is used to implement the data lakehouse solution.
```
CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] AS <select_statement>;
```
- Create a table that has the same schema as an existing table. Data in the existing table is not replicated. You can create a table based on an existing external table or an existing table of an external project that is used to implement the data lakehouse solution.
```
CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] LIKE <existing_table_name>;
```

Parameters

Common parameters

Parameter	Required	Description	Remarks
OR REPLACE	No	If the table specified by <table_name> already exists, you can execute the DROP TABLE statement for the table, and then create a table with the same name for replacement.	You can use this parameter instead of the following statements: `DROP TABLE IF EXISTS <tableName>; -- If the destination table exists, drop it first. CREATE TABLE <tableName> ...; -- Create a table.` Note Limits: You cannot use CREATE OR REPLACE TABLE together with the following syntaxes: CREATE TABLE ... IF NOT EXISTS. CREATE TABLE ... AS SELECT. CREATE TABLE ... LIKE.
EXTERNAL	No	Specifies that the table to be created is an external table.	None
IF NOT EXISTS	No	Prevents an error from being thrown if a table with the same name already exists. If the table exists, the statement is ignored.	If you do not specify the IF NOT EXISTS parameter and a table with the same name exists, an error is returned. If you specify the IF NOT EXISTS parameter, a success message is returned regardless of whether the schema of the existing table is the same as that of the table to be created. The metadata of the existing table remains unchanged.
table_name	Yes	The name of the table.	The name is case-insensitive and can only contain letters, digits, and underscores (_). It is recommended that the name starts with a letter. The name cannot exceed 128 bytes in length; otherwise, an error is returned.
PRIMARY KEY(pk)	No	The primary key of the table.	You can specify one or more columns as the primary key. This indicates that the combination of these columns must be unique in the table. You must comply with the standard SQL syntax for primary keys. The columns that are defined as the primary key must be set to not null and cannot be modified. Important This parameter is applicable only to Delta tables.
col_name	Yes	The name of a table column.	The name is case-insensitive and can only contain letters, digits, and underscores (_). It is recommended that the name starts with a letter. The name cannot exceed 128 bytes in length; otherwise, an error is returned.
col_comment	No	The comment of the column.	The comment must be a valid string that does not exceed 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.
data_type	Yes	The data type of the column.	The data type can be BIGINT, DOUBLE, BOOLEAN, DATETIME, DECIMAL, STRING, or another data type. For more information, see Data type editions.
NOT NULL	No	If you specify this parameter for a column, the values of the column cannot be NULL.	For more information about how to modify the NOT NULL attribute, see Partition operations.
default_value	No	The default value for the column.	If the `insert` operation does not specify a value for the column, the default value is written to the column. Note Functions such as `GETDATE()` and `NOW()` are not supported as default values.
table_comment	No	The comment of the table.	The comment must be a valid string that does not exceed 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.
LIFECYCLE	No	The lifecycle of the table.	The value must be a positive integer. Unit: days. Non-partitioned tables: The lifecycle is calculated from the last time when the table data is modified. If the table data remains unchanged after the specified number of days, MaxCompute automatically reclaims the table (similar to the DROP TABLE operation). Partitioned tables: MaxCompute determines whether to reclaim a partition based on the value of LastModifiedTime. Unlike non-partitioned tables, a partitioned table is not deleted even if all of its partitions have been reclaimed. You can configure lifecycles for tables, but not for partitions.

Parameters for partitioned tables

MaxCompute supports two types of partitioned tables: regular partitioned tables and auto-partitioned tables (AUTO PARTITION). You can select a type based on how you want partition key columns to be generated in different scenarios.

Parameters for regular partitioned tables

Parameter	Required	Description	Remarks
PARTITIONED BY	Yes	Specifies the partition of a regular partitioned table.	You can specify a partition using only one of the following methods: PARTITIONED BY or AUTO PARTITIONED BY.
col_name	Yes	The name of the partition key column.	The name is case-insensitive and can only contain letters, digits, and underscores (_). It is recommended that the name starts with a letter. The name cannot exceed 128 bytes in length; otherwise, an error is returned.
data_type	Yes	The data type of the partition key column.	In the MaxCompute V1.0 data type edition, partition key columns must be of the STRING type. In the MaxCompute V2.0 data type edition, partition key columns can be of the TINYINT, SMALLINT, INT, BIGINT, VARCHAR, or STRING type. For more information, see Data type editions. If you use a partition field to partition a table, a full table scan is not required when you add partitions, update partition data, or read partition data. This improves the efficiency of data processing.
col_comment	No	The comment of the partition key column.	The comment must be a valid string that does not exceed 1,024 bytes in length. If the value of this parameter does not meet the requirements, an error is returned.

Note

The value of a partition key column cannot contain double-byte characters, such as Chinese characters. The value of a partition key column must start with a letter and can contain letters, digits, and supported special characters. The length cannot exceed 255 bytes. Supported special characters include spaces, colons (:), underscores (_), dollar signs ($), number signs (#), periods (.), exclamation points (!), and at signs (@). The behavior of other characters is undefined, such as escape characters \t, \n, and /.

Parameters for AUTO PARTITION tables

Auto-partitioned tables (AUTO PARTITION) support automatic generation of partition key columns in a specific manner.

Parameter	Required	Description	Remarks
AUTO PARTITIONED BY	Yes	Specifies the partition of an AUTO PARTITION table.	You can specify a partition using only one of the following methods: PARTITIONED BY or AUTO PARTITIONED BY.
auto_partition_expression	Yes	Defines how to calculate partition key columns. Currently, only the TRUNC_TIME function is supported to generate partition key columns, and only one partition key column is supported.	The TRUNC_TIME function supports truncation of time or date columns in a table based on a specified time unit to generate partition key columns.
auto_partition_column_name	No	The name of the generated partition key column. If you do not specify a name for the partition key column, the system uses `_pt_col_0_` as the default name. The system checks whether the name exists in the table. If the name exists, the system automatically increments the suffix (such as `_pt_col_1_` or `_pt_col_2_`) until an unused name is found.	After the partition is calculated using the partition expression, a partition key column of the STRING type is generated based on the calculation result. You can explicitly specify the column name. The column type and column values cannot be directly operated.
TBLPROPERTIES('ingestion_time_partition'='true')	No	Specifies whether to generate partition key columns based on the data write time.	For more information about how to use auto-partitioned tables based on the data write time, see Auto-partitioned tables based on the data write time.

Parameters for clustered tables

Clustered tables are classified into hash-clustered tables and range-clustered tables.

Parameters for hash-clustered tables

Parameter	Required	Description	Remarks
CLUSTERED BY	Yes	The hash key.	MaxCompute performs a hash operation on the specified columns and distributes data to each bucket based on the hash values. To prevent data skew and hot spots and to achieve better parallel execution performance, for the `CLUSTERED BY` clause, select columns that have a wide range of values and few duplicate key values. Additionally, to optimize `join` operations, you should also consider selecting commonly used Join or Aggregation Keys, which are similar to primary keys in traditional databases.
SORTED BY	Yes	Specifies how to sort fields in a bucket.	We recommend that you set SORTED BY to the same value as CLUSTERED BY to achieve better performance. In addition, when you specify the SORTED BY clause, MaxCompute automatically generates an index and uses the index to accelerate execution during queries.
number_of_buckets	Yes	The number of hash buckets.	This parameter is required and the value of this parameter varies based on the amount of data. By default, MaxCompute supports a maximum of 1,111 reducers. This means that MaxCompute supports a maximum of 1,111 hash buckets. You can use `set odps.stage.reducer.num =<concurrency>;` to increase this limit. However, the maximum value is 4,000. Otherwise, performance is affected.

Note

To maintain optimal performance, we recommend that you take note of the following rules when you specify the number of hash buckets:

Keep the size of each hash bucket around 500 MB. For example, if the partition size is 500 GB, we recommend that you specify 1,000 buckets. This way, the size of each bucket is 500 MB on average. If a table contains a large amount of data, you can increase the size of each bucket from 500 MB to a size in the range of 2 GB to 3 GB. You can also use set odps.stage.reducer.num=<concurrency>; to exceed the limit of 1,111 buckets.
For join optimization scenarios, removing the shuffle and sort steps can significantly improve performance. Therefore, the number of hash buckets of a table must be a multiple of the number of hash buckets of the other table. For example, one table has 256 hash buckets and the other table has 512 hash buckets. We recommend that you use 2ⁿ as the number of hash buckets, such as 512, 1,024, 2,048, or 4,096. This way, the system can automatically split and merge hash buckets and remove the shuffle and sort steps to improve execution efficiency.

Parameters for range-clustered tables

Parameter	Required	Description	Remarks
RANGE CLUSTERED BY	Yes	The range-clustered columns.	MaxCompute performs the bucket operation on the specified columns and distributes data to each bucket based on the bucket ID.
SORTED BY	Yes	Specifies the sort order of columns within each bucket.	You can use this parameter in the same way as you use it for a hash-clustered table.
number_of_buckets	Yes	The number of hash buckets.	The number of buckets in a range-clustered table does not need to follow the 2ⁿ best practice for hash-clustered tables. If the data distribution is reasonable, any number of buckets can be used. If you do not specify the number of buckets in a range-clustered table, MaxCompute automatically determines the optimal number based on the amount of data.

If JOIN and AGGREGATE operations are performed on range-clustered tables and the join key or group key is the range-clustered key or the prefix of the range-clustered key, you can manage flags to disable shuffling. This improves execution efficiency. You can use set odps.optimizer.enable.range.partial.repartitioning=true/false; to configure this feature. This feature is disabled by default.

Note

Clustered tables help optimize the following aspects:
- Bucket pruning
- Aggregation optimization.
- Storage optimization.
Clustered tables have the following limits:
- INSERT INTO is not supported. You can use only INSERT OVERWRITE to add data.
- The data that is imported using Tunnel commands is not arranged in order. Therefore, you cannot import data to a range-clustered table using Tunnel commands.
- The data backup and restoration feature is not supported.

Parameters for external tables

The following example shows the parameters for creating an OSS external table. For more information about how to create external tables, see External tables.

Parameter	Required	Description
`STORED AS '<file_format>'`	Yes	Specifies file_format based on the data format of the external table.
`WITH SERDEPROPERTIES(options)`	No	The parameters related to the authorization, compression, and character parsing of the external table.
oss_location	Yes	The OSS storage location of the external table data. For more information, see OSS external tables.

Parameters for Transaction Table and Delta Table

Parameters for Delta tables

Delta tables support capabilities such as near real-time reads and writes, incremental reads and writes, incremental storage, and real-time updates. Only Delta tables that have a primary key are supported.

Parameter	Required	Description	Remarks
PRIMARY KEY(PK)	Yes	This parameter is required when you create a Delta table. You can specify multiple columns as the primary key.	You must comply with the standard SQL syntax for primary keys. The columns that are defined as the primary key must be set to not null and cannot be modified. After you specify a primary key for a Delta table, duplicate data is removed from the table based on the primary key. The uniqueness constraint for the primary key column is valid in a single partition or in a non-partitioned table.
transactional	Yes	This parameter is required when you create a Delta table. You must set this parameter to true.	Setting this property to `true` indicates that the table supports ACID (atomicity, consistency, isolation, and durability) transactions. It uses the Multi-Version Concurrency Control (MVCC) model to provide snapshot isolation.
write.bucket.num	No	Default value: 16. Valid values: `(0, 4096]`.	This parameter indicates the number of buckets in a partition of a partitioned table or in a non-partition table. This parameter also specifies the number of concurrent nodes that are used to write data. You can change the value of this parameter for a partitioned table. If new partitions are added to a partitioned table, the configuration of this parameter takes effect on the new partitions by default. You cannot change the value of this parameter for a non-partitioned table. Take note of the following points: If data is written using a MaxCompute tunnel, the value of this parameter specifies the number of concurrent nodes that are used to write data. The setting of the parameter affects the import traffic and is also subject to the maximum number of concurrent nodes in the tunnel. If data is written using an SQL statement, the value of this parameter specifies the concurrency of the reducers that are used to write data. The setting is subject to the maximum number of concurrent reducer nodes. We recommend that you write approximately 500 MB of data to each bucket. For example, if the partition size is 500 GB, we recommend that you add 1,000 hash buckets. This way, the size of each hash bucket is 500 MB on average. If a table contains a large amount of data, you can increase the size of each hash bucket from 500 MB to a size in the range of 2 GB to 3 GB.
acid.data.retain.hours	No	Default value: 24. Valid values: `[24, 168]`.	The time range during which the historical data status can be queried using the time travel feature. Unit: hours. To query the historical data status for a period more than 168 hours (7 days), contact MaxCompute technical support. If you set this parameter to 0, the historical data status is not retained, and time travel queries are not supported. If the historical data status is retained for a period of time that is not in the range that is specified by this parameter, the data can be deleted. You can use the compact method to reclaim the space that is occupied by the data. If you perform an SQL time travel query on data that is generated earlier than the time range specified by this parameter, an error is returned. For example, if the value of this parameter is 72, and the time travel query is performed to query the historical data status 72 hours ago, an error is returned.
acid.incremental.query.out.of.time.range.enabled	No	Default value: `false`.	If you set this parameter to true, the value of the endTimestamp property specified by an incremental query can be a point in time that is later than the maximum commit time of data in a table. If the value of the endTimestamp property is greater than the current time, new data may be inserted into a Delta table, and you may obtain different results for multiple queries. You can change the value of this parameter for a table.
acid.write.precombine.field	No	You can use this parameter to specify the name of only one column.	If you specify a column name, the system performs data deduplication based on the primary key column in the file that contains the SQL statement committed together with this parameter. This ensures data uniqueness and consistency. Note If the size of data that is committed at a time exceeds 128 MB, multiple files are generated. This parameter cannot be used for data deduplication of multiple files.
acid.partial.fields.update.enable	No	If you set this parameter to `true`, you can use SQL or Tunnel to perform partial column updates on a Delta table.	You can set this parameter when you create a table. After the table is created, you cannot change the value of this parameter.

Requirements for other general parameters of Delta tables:
- LIFECYCLE: The lifecycle of the table must be greater than or equal to the lifecycle during which time travel queries can be performed. The formula is lifecycle >= acid.data.retain.hours / 24. When you create a table, MaxCompute checks the lifecycle of the table that you specified. If the specified lifecycle does not meet the requirements, an error is returned.
- Other unsupported features: Delta tables do not support the CLUSTER BY and CREATE TABLE AS statements and cannot be used as external tables.
Other limits:
- Only MaxCompute SQL can directly perform operations on Delta tables.
- You cannot change existing common tables to Delta tables.
- You cannot change the schema of the primary key column of a Delta table.

Parameters for Transaction Table

Parameter	Required	Description
TBLPROPERTIES(transactional"="true")	Yes	This parameter is used to set the table to a transactional table. You can perform `update` and `delete` operations on the transactional table to update or delete data at the row level. For more information, see Update or delete data (UPDATE \| DELETE).

A transactional table has the following limits:

MaxCompute lets you set a table to a transactional table only when you create the table. You cannot use the ALTER TABLE statement to change an existing table to a transactional table. If you execute the following statement, an error is returned:
```
ALTER TABLE not_txn_tbl SET TBLPROPERTIES("transactional"="true"); --An error is returned. FAILED: Catalog Service Failed, ErrorCode: 151, Error Message: Set transactional is not supported
```
When you create a clustered table or an external table, you cannot set it to a transactional table.
You cannot convert between transactional tables and MaxCompute internal tables, external tables, or clustered tables.
Transactional table files cannot be automatically merged. You must manually execute the merge operation. For more information, see Merge transactional table files.
The merge partition operation is not supported.
Specific limits are imposed on access to transactional tables from jobs of other systems. For example, if your job is a Graph job, you cannot use the job to read data from or write data to a transactional table. If your job is a Spark job or a Platform for AI (PAI) job, you can use the job to only read data from a transactional table and you cannot use the job to write data to a transactional table.
Before you perform update, delete, or insert overwrite operations on important data in a transactional table, you must manually back up the data to another table using the select+insert operations.

Parameters for creating tables based on existing tables

You can use the CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] AS <select_statement>; statement to create a table and replicate data to the new table.
- However, partition properties and the lifecycle property of the source table are not replicated to the created table. The partition key columns of the source table are considered common columns in the created table.
- You can also use the lifecycle parameter to reclaim the table. You can also execute this statement to create an internal table and replicate data of an external table to the internal table.
You can use the CREATE TABLE [IF NOT EXISTS] <table_name> [LIFECYCLE <days>] LIKE <existing_table_name>; statement to create a table that uses the same schema as an existing table.
- However, tables created using this statement do not replicate data or the lifecycle property of the source table.
- You can also use the lifecycle parameter to reclaim the table. You can also execute this statement to create an internal table that has the same schema as an existing external table.

Examples

Create a non-partitioned table

Create a non-partitioned table
```
CREATE TABLE test1 (key STRING);
```

Create a non-partitioned table and configure default values for table fields.

CREATE TABLE test_default( tinyint_name tinyint NOT NULL default 1Y, smallint_name SMALLINT NOT NULL DEFAULT 1S, int_name INT NOT NULL DEFAULT 1, bigint_name BIGINT NOT NULL DEFAULT 1, binary_name BINARY , float_name FLOAT , double_name DOUBLE NOT NULL DEFAULT 0.1, decimal_name DECIMAL(2, 1) NOT NULL DEFAULT 0.0BD, varchar_name VARCHAR(10) , char_name CHAR(2) , string_name STRING NOT NULL DEFAULT 'N', boolean_name BOOLEAN NOT NULL DEFAULT TRUE );

Create a partitioned table

Create a regular partitioned table.

CREATE TABLE IF NOT EXISTS sale_detail( shop_name STRING, customer_id STRING, total_price DOUBLE) PARTITIONED BY (sale_date STRING, region STRING);

Create an AUTO PARTITION table and use a time calculation function to generate partitions based on the time-type data column.

-- Truncate the sale_date column by month to generate a partition column named sale_month, and partition the table based on this partition column. CREATE TABLE IF NOT EXISTS auto_sale_detail( shop_name STRING, customer_id STRING, total_price DOUBLE, sale_date DATE ) AUTO PARTITIONED BY (TRUNC_TIME(sale_date, 'month') AS sale_month);

Create an AUTO PARTITION table and specify that partitions are generated based on the data write time. In this case, the system automatically obtains the time when data is written to MaxCompute and generates partitions using the time calculation function.

-- After the table is created, when data is written to the table, the system automatically obtains the data write time _partitiontime, truncates it by day, generates a partition column named sale_date, and partitions the table based on this partition column. CREATE TABLE IF NOT EXISTS auto_sale_detail2( shop_name STRING, customer_id STRING, total_price DOUBLE, _partitiontime TIMESTAMP_NTZ) AUTO PARTITIONED BY (TRUNC_TIME(_partitiontime, 'day') AS sale_date) TBLPROPERTIES('ingestion_time_partition'='true');

Create a new table

Create a table to replace an existing table.

Create the original table mytable and write data to the table.

CREATE OR REPLACE TABLE my_table(a BIGINT); INSERT INTO my_table(a) VALUES (1),(2),(3);

Use OR REPLACE to create a table with the same name and modify the table fields.
```
CREATE OR REPLACE TABLE my_table(b STRING);
```
Query the my_table table. The following code shows the returned result:
```
+------------+ | b | +------------+ +------------+
```

The following SQL statements are invalid:

CREATE OR REPLACE TABLE IF NOT EXISTS my_table(b STRING); CREATE OR REPLACE TABLE my_table AS SELECT; CREATE OR REPLACE TABLE my_table LIKE newtable;

Create a table, replicate data from an existing table to the new table, and then configure the lifecycle for the new table.
```
-- Create a table named sale_detail_ctas1, replicate data from the sale_detail table to the sale_detail_ctas1 table, and configure the lifecycle. SET odps.sql.allow.fullscan=true; CREATE TABLE sale_detail_ctas1 LIFECYCLE 10 AS SELECT * FROM sale_detail;
```
You can execute the DESC EXTENDED sale_detail_ctas1; command to view the table structure, lifecycle, and other details.
Note
The source table sale_detail is a partitioned table. However, when the new sale_detail_ctas1 table is created using the CREATE TABLE ... AS select_statement ... statement, partition properties are not replicated. The partition key columns of the source table are treated as common columns. Therefore, sale_detail_ctas1 is a non-partitioned table with five columns.
Create a table and use constants as column values in the SELECT clause.
Note
If you use constants as column values in the SELECT clause, we recommend that you specify column names. The fourth and fifth columns of the created sale_detail_ctas3 table are similar to _c4 and _c5.
- Column names are specified.
```
SET odps.sql.allow.fullscan=true; CREATE TABLE sale_detail_ctas2 AS SELECT shop_name, customer_id, total_price, '2013' AS sale_date, 'China' AS region FROM sale_detail;
```
- Column names are not specified.
```
SET odps.sql.allow.fullscan=true; CREATE TABLE sale_detail_ctas3 AS SELECT shop_name, customer_id, total_price, '2013', 'China' FROM sale_detail;
```
Create a table that uses the same schema as an existing table and configure the lifecycle for the new table.
```
CREATE TABLE sale_detail_like LIKE sale_detail LIFECYCLE 10;
```
You can execute the DESC EXTENDED sale_detail_like; command to view the table structure, lifecycle, and other details.
Note
The schema of the sale_detail_like table is the same as that of the sale_detail table. The two tables have the same properties, such as column names, column comments, and table comments, aside from the lifecycle. However, data in the sale_detail table is not replicated to the sale_detail_like table.

Create a table that has the same schema as an external table.

-- Create a table named mc_oss_extable_orc_like that has the same schema as the external table mc_oss_extable_orc. CREATE TABLE mc_oss_extable_orc_like LIKE mc_oss_extable_orc;

You can execute the DESC mc_oss_extable_orc_like; command to view the table structure and other details.

+------------------------------------------------------------------------------------+ | Owner: ALIYUN$****@***.aliyunid.com | Project: max_compute_7u************yoq | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2022-08-11 11:10:47 | | LastDDLTime: 2022-08-11 11:10:47 | | LastModifiedTime: 2022-08-11 11:10:47 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | id | string | | | | name | string | | | +------------------------------------------------------------------------------------+

Create a table that uses new data types

SET odps.sql.type.system.odps2=true; CREATE TABLE test_newtype ( c1 TINYINT, c2 SMALLINT, c3 INT, c4 BIGINT, c5 FLOAT, c6 DOUBLE, c7 DECIMAL, c8 BINARY, c9 TIMESTAMP, c10 ARRAY<MAP<BIGINT,BIGINT>>, c11 MAP<STRING,ARRAY<BIGINT>>, c12 STRUCT<s1:STRING,s2:BIGINT>, c13 VARCHAR(20)) LIFECYCLE 1;

Create a hash-clustered or range-clustered table

Create a hash-clustered non-partitioned table.

CREATE TABLE t1 (a STRING, b STRING, c BIGINT) CLUSTERED BY (c) SORTED BY (c) INTO 1024 buckets;

Create a hash-clustered partitioned table.

CREATE TABLE t2 (a STRING, b STRING, c BIGINT) PARTITIONED BY (dt STRING) CLUSTERED BY (c) SORTED BY (c) INTO 1024 buckets;

Create a range-clustered non-partitioned table.

CREATE TABLE t3 (a STRING, b STRING, c BIGINT) RANGE CLUSTERED BY (c) SORTED BY (c) INTO 1024 buckets;

Create a range-clustered partitioned table.

CREATE TABLE t4 (a STRING, b STRING, c BIGINT) PARTITIONED BY (dt STRING) RANGE CLUSTERED BY (c) SORTED BY (c);

Create a transactional table

Creates a transactional non-partitioned table.

CREATE TABLE t5(id BIGINT) TBLPROPERTIES ("transactional"="true");

Creates a transactional partitioned table.

CREATE TABLE IF NOT EXISTS t6(id BIGINT) PARTITIONED BY (ds STRING) TBLPROPERTIES ("transactional"="true");

Create an internal table

Create an internal table and replicate data of a partitioned external table to the internal table. Partition properties are not replicated to the internal table.

Create an OSS external table and a MaxCompute internal table.

-- Create an OSS external table and insert data. CREATE EXTERNAL TABLE max_oss_test(a INT, b INT, c INT) STORED AS TEXTFILE LOCATION "oss://oss-cn-hangzhou-internal.aliyuncs.com/<bucket_name>"; INSERT INTO max_oss_test VALUES (101, 1, 20241108), (102, 2, 20241109), (103, 3, 20241110); SELECT * FROM max_oss_test; -- Result a b c 101 1 20241108 102 2 20241109 103 3 20241110 -- Execute the CREATE TABLE AS statement to create an internal table. CREATE TABLE from_exetbl_oss AS SELECT * FROM max_oss_test; -- Query the internal table. SELECT * FROM from_exetbl_oss; -- All data in the internal table is returned. a b c 101 1 20241108 102 2 20241109 103 3 20241110

Execute DESC from_exetbl_as_par； to query the structure of the internal table. The following information is returned:

+------------------------------------------------------------------------------------+ | Owner: ALIYUN$*********** | | Project: ***_*****_*** | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2023-01-10 15:16:33 | | LastDDLTime: 2023-01-10 15:16:33 | | LastModifiedTime: 2023-01-10 15:16:33 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 919 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | a | string | | | | b | string | | | | c | string | | | +------------------------------------------------------------------------------------+

Create an internal table and replicate data of a partitioned external table to the internal table. Partition properties are replicated to the internal table.

Create an internal table named from_exetbl_like.

-- Query the data lakehouse external table from MaxCompute. SELECT * FROM max_oss_test; -- Result a b c 101 1 20241108 102 2 20241109 103 3 20241110 -- Execute the CREATE TABLE LIKE statement to create an internal table. CREATE TABLE from_exetbl_like LIKE max_oss_test; -- Query the internal table. SELECT * FROM from_exetbl_like; -- Only the schema of the internal table is returned. a b c

Execute DESC from_exetbl_like； to query the structure of the internal table. The following information is returned:

+------------------------------------------------------------------------------------+ | Owner: ALIYUN$************ | | Project: ***_*****_*** | | TableComment: | +------------------------------------------------------------------------------------+ | CreateTime: 2023-01-10 15:09:47 | | LastDDLTime: 2023-01-10 15:09:47 | | LastModifiedTime: 2023-01-10 15:09:47 | +------------------------------------------------------------------------------------+ | InternalTable: YES | Size: 0 | +------------------------------------------------------------------------------------+ | Native Columns: | +------------------------------------------------------------------------------------+ | Field | Type | Label | Comment | +------------------------------------------------------------------------------------+ | a | string | | | | b | string | | | +------------------------------------------------------------------------------------+ | Partition Columns: | +------------------------------------------------------------------------------------+ | c | string | | +------------------------------------------------------------------------------------+

Create a Delta table

Create a Delta table.

CREATE TABLE mf_tt (pk BIGINT NOT NULL PRIMARY KEY, val BIGINT) TBLPROPERTIES ("transactional"="true");

Create a Delta table and configure the main table properties.

CREATE TABLE mf_tt2 ( pk BIGINT NOT NULL, pk2 BIGINT NOT NULL, val BIGINT, val2 BIGINT, PRIMARY KEY (pk, pk2) ) TBLPROPERTIES ( "transactional"="true", "write.bucket.num" = "64", "acid.data.retain.hours"="120" ) LIFECYCLE 7;

Drop a table

Drops a non-partitioned table or a partitioned table.

Precautions

Before you drop a table, confirm that the table can be dropped. Proceed with caution. If you accidentally drop a table, you can restore the table if the backup and restoration feature is enabled for the project and the table is dropped within the backup data retention period specified for the project. For more information about backup and restoration, see Backup.
After you drop a table, the volume of stored data in a MaxCompute project decreases.

Syntax

DROP TABLE [IF EXISTS] <table_name>;

Parameters

Parameter	Required	Description
IF EXISTS	No	If you do not specify the IF EXISTS parameter and the table does not exist, an error is returned. If you specify the IF EXISTS parameter, a success message is returned regardless of whether the table exists.
table_name	Yes	The name of the table that you want to drop.

Examples

--Drop the sale_detail table. A success message is returned regardless of whether the sale_detail table exists. DROP TABLE IF EXISTS sale_detail;

References

For more information about how to replicate table data to another table, see CLONE TABLE.
For more information about how to modify and view table information, see Modify and view tables.
For more information about how to perform operations on partitions of an existing table, see Partition operations.
For more information about how to perform operations on columns of an existing table, see Column operations.