Window Functions
Window functions perform calculations across sets of table rows that are related to the current row. Unlike aggregate functions that return a single result for a group of rows, window functions return a value for every row while considering a window of rows defined by the OVER clause.
For details about window functions syntax and components, please visit the OVER Keyword reference
avg()
In the context of window functions, avg(value) calculates the average of value over the set of rows defined by the window frame.
Arguments:
value: The column of numeric values to calculate the average of.
Return value:
- The average of
valuefor the rows in the window frame.
Description
When used as a window function, avg() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.
The avg() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate average for each row, based on the corresponding window of rows.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
avg(value) OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
avg(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS moving_avg
FROM trades
WHERE timestamp in today();
count()
Counts rows or non-null values over the window frame.
Syntax:
count(*) OVER (window_definition)
count(value) OVER (window_definition)
Arguments:
*: Counts all rowsvalue: Counts non-null values
Example:
SELECT
symbol,
count(*) OVER (
PARTITION BY symbol
ORDER BY timestamp
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW
) AS trades_last_second
FROM trades;
dense_rank()
In the context of window functions, dense_rank() assigns a unique rank to each row within the window frame. Rows with equal values may have the same rank, but there are no gaps in the rank numbers - it increases sequentially.
Arguments:
dense_rank()does not require arguments.
Return value:
- The increasing consecutive rank numbers of each row within the window frame. Return value type is
long.
Description
When used as a window function, dense_rank() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY and ORDER BY components of the OVER clause.
The dense_rank() function assigns a unique rank to each row within its window, with the same rank for the same values in the ORDER BY clause of the OVER clause. However, there are no gaps in the counter, unlike with rank() - it is guaranteed to be sequential. It ignores the frame clause, meaning it considers all rows in each partition, regardless of the frame specification.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
dense_rank() OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
dense_rank() OVER (
PARTITION BY symbol
ORDER BY price DESC
) AS price_rank
FROM trades
WHERE timestamp in today();
first_value()
In the context of window functions, first_value(value) calculates the first value in the set of rows defined by the window frame.
Arguments:
value: Any numeric value.
Return value:
- The first occurrence of
value(including null) for the rows in the window frame.
Description
first_value() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.
The first_value() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate value for each row, based on the corresponding window of rows.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
first_value(value) OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS first_price
FROM trades
WHERE timestamp in today();
max()
In the context of window functions, max(value) calculates the maximum value within the set of rows defined by the window frame.
Arguments:
value: Any numeric value.
Return value:
- The maximum value (excluding null) for the rows in the window frame.
Description
When used as a window function, max() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.
The max() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate value for each row, based on the corresponding window of rows.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
max(value) OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
max(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS highest_price
FROM trades
WHERE timestamp in today();
min()
In the context of window functions, min(value) calculates the minimum value within the set of rows defined by the window frame.
Arguments:
value: Any numeric value.
Return value:
- The minimum value (excluding null) for the rows in the window frame.
Description
When used as a window function, min() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.
The min() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate value for each row, based on the corresponding window of rows.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
min(value) OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
min(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS lowest_price
FROM trades
WHERE timestamp in today();
lag()
In the context of window functions, lag() accesses data from previous rows in the result set without using a self-join. For each row, lag() returns the value from a row at a specified offset before the current row within the partition.
The lag() function provides access to a row at a given physical offset that precedes the current row, returning NULL if the offset goes beyond the bounds of the window or partition (unless a default is specified).
- When
offsetis 0, returns the current row value IGNORE NULLSmakes the function act as if NULL value rows don't existRESPECT NULLS(default) includes NULL values in offset counting- Does not support ROWS/RANGE frame clauses (silently ignored if present)
- When ORDER BY is not provided, uses table scan order
Arguments:
value: The column or expression to get the value fromoffset(optional): The number of rows backward from the current row. Default is 1default(optional): The value to return when the offset goes beyond the partition bounds. Default is NULL[IGNORE | RESPECT] NULLS(optional): Determines whether NULL values should be ignored. Default is RESPECT NULLS
Return value:
- The value from the row at the specified offset before the current row
Syntax:
lag(value [, offset [, default]]) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression])
Example:
SELECT
timestamp,
price,
lag(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS previous_price,
lag(price, 2, 0.0) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS price_two_rows_back
FROM trades
WHERE timestamp in today();
This example:
- Gets the previous price for each symbol (
previous_price) - Gets the price from 2 rows back (
price_two_rows_back) - Uses 0.0 as default when looking 2 rows back reaches the partition start
last_value()
In the context of window functions, last_value() returns the last value in a window frame. The function supports both regular and NULL-aware processing through the IGNORE NULLS clause.
The last_value() function provides access to the last value within a window frame. The behavior depends on:
- Window frame definition (
ROWS/RANGE) - Presence of
ORDER BYandPARTITION BYclauses IGNORE/RESPECT NULLSsetting
In addition, note the following:
- When no
ORDER BYis provided, uses table scan order - Supports both
ROWSandRANGEframe specifications - When neither
ORDER BYnorROWS/RANGEis specified, the default frame becomesROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING - When
ORDER BYis provided butROWS/RANGEis not, the default frame becomesROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Arguments:
value: The column or expression to get the value from[IGNORE | RESPECT] NULLS(optional): Determines whether NULL values should be ignored. Default isRESPECT NULLS
Return value:
- The last non-NULL value in the window frame when using
IGNORE NULLS - The last value (including NULL) in the window frame when using
RESPECT NULLS
Syntax:
last_value(value) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression]
[ORDER BY sort_expression]
[frame_clause])
Example:
SELECT
timestamp,
price,
last_value(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS last_price,
last_value(price) IGNORE NULLS OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS last_non_null_price
FROM trades
WHERE timestamp in today();
This example:
- Gets the last price within a 3-row window for each symbol (
last_price) - Gets the last non-NULL price for each symbol (
last_non_null_price) - Demonstrates both
RESPECT NULLS(default) andIGNORE NULLSbehavior
lead()
In the context of window functions, lead() accesses data from subsequent rows in the result set without using a self-join. For each row, lead() returns the value from a row at a specified offset following the current row within the partition.
The lead() function provides access to a row at a given physical offset that follows the current row, returning NULL if the offset goes beyond the bounds of the window or partition (unless a default is specified).
- When
offsetis 0, returns the current row value IGNORE NULLSmakes the function act as ifNULLvalue rows don't existRESPECT NULLS(default) includesNULLvalues in offset counting- Does not support
ROWS/RANGEframe clauses (silently ignored if present) - When
ORDER BYis not provided, uses table scan order
Arguments:
value: The column or expression to get the value fromoffset(optional): The number of rows forward from the current row. Default is 1default(optional): The value to return when the offset goes beyond the partition bounds. Default isNULL[IGNORE | RESPECT] NULLS(optional): Determines whetherNULLvalues should be ignored. Default isRESPECT NULLS
Return value:
- The value from the row at the specified offset after the current row
Syntax:
lead(value [, offset [, default]]) [(IGNORE|RESPECT) NULLS]
OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression])
Example:
SELECT
timestamp,
price,
lead(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS next_price,
lead(price, 2, 0.0) OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS price_after_next
FROM trades
WHERE timestamp in today();
This example:
- Gets the next price for each symbol (
next_price) - Gets the price from 2 rows ahead (
price_after_next) - Uses 0.0 as default when looking 2 rows ahead reaches the partition end
rank()
In the context of window functions, rank() assigns a unique rank to each row within the window frame, with the same rank assigned to rows with the same values. Rows with equal values receive the same rank, and a gap appears in the sequence for the next distinct value; that is, the row_number of the first row in its peer group.
Arguments:
rank()does not require arguments.
Return value:
- The rank of each row within the window frame. Return value type is
long.
Description
When used as a window function, rank() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY and ORDER BY components of the OVER clause.
The rank() function assigns a unique rank to each row within its window, with the same rank assigned to rows with the same values in the ORDER BY clause of the OVER clause. It ignores the frame clause, meaning it considers all rows in each partition, regardless of the frame specification.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
rank() OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
rank() OVER (
PARTITION BY symbol
ORDER BY price DESC
) AS price_rank
FROM trades
WHERE timestamp in today();
row_number()
In the context of window functions, row_number() assigns a unique row number to each row within the window frame. For each partition, the row number starts with one and increments by one.
Arguments:
row_number()does not require arguments.
Return value:
- The row number of each row within the window frame. Return value type is
long.
Description
When used as a window function, row_number() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY and ORDER BY components of the OVER clause.
The row_number() function assigns a unique row number to each row within its window, starting at one for the first row in each partition and incrementing by one for each subsequent row. It ignores the frame clause, meaning it considers all rows in each partition, regardless of the frame specification.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
row_number() OVER (window_definition)
Example:
SELECT
symbol,
price,
timestamp,
row_number() OVER (
PARTITION BY symbol
ORDER BY timestamp
) AS trade_number
FROM trades
WHERE timestamp in today();
sum()
In the context of window functions, sum(value) calculates the cumulative sum of value in the set of rows defined by the window frame. Also known as "cumulative sum".
Arguments:
value: Any numeric value.
Return value:
- The sum of
valuefor the rows in the window frame.
Description
When used as a window function, sum() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.
The sum() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate value for each row, based on the corresponding window of rows.
Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.
Syntax:
sum(value) OVER (window_definition)
Example:
SELECT
symbol,
amount,
timestamp,
sum(amount) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_amount
FROM trades
WHERE timestamp in today();
Common window function examples
Moving average of best bid price
DECLARE @best_bid := bids[1,1]
SELECT
timestamp,
symbol,
@best_bid as best_bid,
avg(@best_bid) OVER (
PARTITION BY symbol
ORDER BY timestamp
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS bid_moving_avg
FROM market_data
WHERE timestamp in today();
This example:
- Uses the best bid price (
bids[1,1]) - Calculates average over 4 rows (current + 3 preceding)
- Groups by symbol
Cumulative bid size
DECLARE
@best_bid := bids[1,1],
@volume_l1 := bids[2,1]
SELECT
timestamp, symbol,
@best_bid as bid_price_l1,
@volume_l1 as bid_volume_l1,
sum(@volume_l1) OVER (
PARTITION BY symbol ORDER BY timestamp
ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
) as bid_volume_l1_5secs
FROM market_data
WHERE timestamp in today();
This example:
- Shows best bid price and size
- Partitions BY symbol
- Calculates 5-row rolling volume at best bid
Moving sum of bid volume
DECLARE
@best_bid := bids[1,1],
@volume_l1 := bids[2,1]
SELECT
timestamp,
sum(@volume_l1) OVER (
ORDER BY timestamp
RANGE BETWEEN 1 minute PRECEDING AND CURRENT ROW
) as bid_volume_1min
FROM market_data
WHERE timestamp IN today() AND symbol='GBPUSD';
This example:
- Shows best bid price and size
- Filters only
GBPUSDrows seen today - Calculates rolling 1-minute volume at best bid
Side frequency analysis
SELECT
timestamp,
symbol,
COUNT(*) OVER (
ORDER BY timestamp
RANGE BETWEEN 60000000 PRECEDING AND CURRENT ROW
) as updates_per_min,
COUNT(CASE WHEN side = 'buy' THEN 1 END) OVER (
ORDER BY timestamp
RANGE BETWEEN 60000000 PRECEDING AND CURRENT ROW
) as bids_per_minute,
COUNT(CASE WHEN side = 'sell' THEN 1 END) OVER (
ORDER BY timestamp
RANGE BETWEEN 60000000 PRECEDING AND CURRENT ROW
) as asks_per_minute
FROM trades
WHERE timestamp IN today() AND symbol='BTC-USD';
This example:
- Filters only
GBPUSDrows seen today - Counts all trades in last minute (note we could also use the
1 minute PRECEDINGsyntax) - Specifically counts bids (
side='buy') per minute - Specifically counts bids (
side='ask') per minute