SQL subqueries enable reuse of the results from a selection query within another query.
CockroachDB supports two kinds of subqueries:
- Relational subqueries, which appear as operands in selection queries and table expressions.
- Scalar subqueries, which appear as operands in a scalar expression.
Data writes in subqueries
When a subquery contains a data-modifying statement (INSERT,DELETE, etc.), the data modification is always executed to completion even if the surrounding query only uses a subset of the result rows.
This is true for subqueries defined using the (...) notation and those defined using WITH.
For example:
> WITH t AS (INSERT INTO t(x) VALUES (1), (2), (3) RETURNING x) SELECT * FROM t LIMIT 1; This query inserts 3 rows into t, even though the surrounding query only observes 1 row using LIMIT.
Correlated subqueries
A subquery is said to be correlated when it uses table or column names defined in the surrounding query.
For example, to find every customer with at least one order, run:
> SELECT c.name FROM customers AS c WHERE EXISTS (SELECT * FROM orders AS o WHERE o.customer_id = c.id); The subquery is correlated because it uses c defined in the surrounding query.
The cost-based optimizer supports most correlated subqueries, with the exception of correlated subqueries that generate side effects inside a CASE statement.
LATERAL subqueries
A LATERAL subquery is a correlated subquery that references another query or subquery in its SELECT statement, usually in the context of a LEFT join or an INNER join. Unlike other correlated subqueries, LATERAL subqueries iterate through each row in the referenced query for each row in the inner subquery, like a for loop.
To create a LATERAL subquery, use the LATERAL keyword directly before the inner subquery's SELECT statement.
For example, the following statement performs an INNER join of the users table and a subquery of the rides table that filters on values in the users table:
> SELECT name, address FROM users, LATERAL (SELECT * FROM rides WHERE rides.start_address = users.address AND city = 'new york'); name | address +------------------+-----------------------------+ Robert Murphy | 99176 Anderson Mills James Hamilton | 73488 Sydney Ports Suite 57 Judy White | 18580 Rosario Ville Apt. 61 Devin Jordan | 81127 Angela Ferry Apt. 8 Catherine Nelson | 1149 Lee Alley Nicole Mcmahon | 11540 Patton Extensions (6 rows) LATERAL subquery joins are especially useful when the join table includes a computed column.
For example, the following query joins a subquery of the rides table with a computed column (adjusted_revenue), and a subquery of the vehicles table that references columns in the rides subquery:
> SELECT ride_id, vehicle_id, type, adjusted_revenue FROM ( SELECT id AS ride_id, vehicle_id, revenue - 0.25*revenue AS adjusted_revenue FROM rides ) AS r JOIN LATERAL ( SELECT type FROM vehicles WHERE city = 'new york' AND vehicles.id = r.vehicle_id AND r.adjusted_revenue > 65 ) ON true; ride_id | vehicle_id | type | adjusted_revenue +--------------------------------------+--------------------------------------+------------+------------------+ 049ba5e3-53f7-4ec0-8000-000000000009 | 11111111-1111-4100-8000-000000000001 | scooter | 71.2500 0624dd2f-1a9f-4e80-8000-00000000000c | 00000000-0000-4000-8000-000000000000 | skateboard | 70.5000 08b43958-1062-4e00-8000-000000000011 | 11111111-1111-4100-8000-000000000001 | scooter | 70.5000 0bc6a7ef-9db2-4d00-8000-000000000017 | 00000000-0000-4000-8000-000000000000 | skateboard | 68.2500 0d4fdf3b-645a-4c80-8000-00000000001a | 00000000-0000-4000-8000-000000000000 | skateboard | 67.5000 1ba5e353-f7ce-4900-8000-000000000036 | 11111111-1111-4100-8000-000000000001 | scooter | 70.5000 (6 rows) In a LATERAL subquery join, the rows returned by the inner subquery are added to the result of the join with the outer query. Without the LATERAL keyword, each subquery is evaluated independently and cannot refer to objects defined in separate queries.
Performance best practices
The results of scalar subqueries are loaded entirely into memory when the execution of the surrounding query starts. To prevent execution errors due to memory exhaustion, ensure that subqueries return as few results as possible.