DEV Community

Cover image for ๐Ÿง  ClickHouse LEFT JOINs: Why join_use_nulls Matters
Shahab Ranjbary
Shahab Ranjbary

Posted on

๐Ÿง  ClickHouse LEFT JOINs: Why join_use_nulls Matters

๐Ÿง  Understanding join_use_nulls in ClickHouse

ClickHouse is famous for being blazing fast โ€” but sometimes its SQL semantics can surprise you, especially around JOINs.

Hereโ€™s a simple example that shows how the join_use_nulls setting can completely change your results.

๐Ÿงฉ The Setup

Letโ€™s create two tiny tables:

CREATE TABLE test.id_val(`id` UInt32, `val` UInt32) ENGINE = TinyLog; INSERT INTO test.id_val VALUES (1,11),(2,12),(3,13); CREATE TABLE test.id_val2(`id` UInt32, `val` UInt32) ENGINE = TinyLog; INSERT INTO test.id_val2 VALUES (1,21),(1,22),(3,23); 
Enter fullscreen mode Exit fullscreen mode

Weโ€™ve got:

id_val: three rows with IDs 1, 2, and 3
id_val2: three rows, but ID 2 is missing and ID 1 appears twice

๐Ÿงฎ The Default JOIN

Letโ€™s run a LEFT JOIN:

SELECT * FROM test.id_val LEFT JOIN test.id_val2 USING (id); 
Enter fullscreen mode Exit fullscreen mode

Output:

โ”Œโ”€idโ”€โ”ฌโ”€valโ”€โ”ฌโ”€val_1โ”€โ” โ”‚ 1 โ”‚ 11 โ”‚ 21 โ”‚ โ”‚ 1 โ”‚ 11 โ”‚ 22 โ”‚ โ”‚ 2 โ”‚ 12 โ”‚ 0 โ”‚ โ”‚ 3 โ”‚ 13 โ”‚ 23 โ”‚ โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ 
Enter fullscreen mode Exit fullscreen mode

Notice something interesting?
When thereโ€™s no matching row for id=2, ClickHouse fills it with a default value โ€” in this case, 0.

For numeric columns, the default is 0.
For strings, it would be an empty string ''.

๐Ÿงฉ Enter join_use_nulls

If you want standard SQL behavior, where missing values become NULL, use:

SELECT * FROM test.id_val LEFT JOIN test.id_val2 USING (id) SETTINGS join_use_nulls = 1; 
Enter fullscreen mode Exit fullscreen mode

Now the output looks like this:

โ”Œโ”€idโ”€โ”ฌโ”€valโ”€โ”ฌโ”€val_1โ”€โ” โ”‚ 1 โ”‚ 11 โ”‚ 21 โ”‚ โ”‚ 1 โ”‚ 11 โ”‚ 22 โ”‚ โ”‚ 2 โ”‚ 12 โ”‚ NULL โ”‚ โ”‚ 3 โ”‚ 13 โ”‚ 23 โ”‚ โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ 
Enter fullscreen mode Exit fullscreen mode

Thatโ€™s closer to what most SQL engines (like PostgreSQL or MySQL) would do.

โš™๏ธ Why This Matters

This setting affects how ClickHouse represents non-matching rows in joins โ€” which can have downstream effects on:

  • Aggregations
    SELECT count(val_1) FROM ...
    will count 0s differently than NULLs.

  • Exports & BI tools
    Some tools treat 0 as a valid number, but NULL as missing data.

  • Data semantics
    A missing match isnโ€™t always โ€œzeroโ€ โ€” it might mean โ€œunknownโ€.

๐Ÿš€ Performance Note

The default behavior (without join_use_nulls) is slightly faster and more memory-efficient because ClickHouse doesnโ€™t have to track NULL bitmaps.

If youโ€™re doing high-volume analytical joins and donโ€™t care about SQL-standard nulls, keep it off.
If you need correctness and consistency with other systems, turn it on.

๐Ÿ” Check Your Settings

You can check the current value of this setting in your ClickHouse session:

SELECT name, value FROM system.settings WHERE name LIKE '%join_use_nulls%'; 
Enter fullscreen mode Exit fullscreen mode

Or set it globally in your configuration file or session:

SET join_use_nulls = 1; 
Enter fullscreen mode Exit fullscreen mode

๐Ÿ“š Reference

For more detail on ClickHouseโ€™s Join engine and its settings (including join_use_nulls), see: ClickHouse docs โ€” Join table engine

๐Ÿ—จ๏ธ Discussion

Have you ever been surprised by ClickHouse JOIN behavior?
Do you leave join_use_nulls off for performance, or always turn it on for clarity?

Top comments (0)