Summary: in this tutorial, you will learn how to calculate a cumulative distribution of a value in a set of values by using the Oracle CUME_DIST()
function.
Introduction to Oracle CUME_DIST() function #
Sometimes, you want to pull the top or bottom x%
values from a data set e.g., top 5%
salesman by volume. To do this, you can use the CUME_DIST()
function.
The CUME_DIST()
function is an analytic function that calculates the cumulative distribution of a value in a set of values.
Here’s the basic syntax of CUME_DIST()
function:
CUME_DIST() OVER ( [ partition_clause ] order_by_clause )
Code language: SQL (Structured Query Language) (sql)
The ORDER BY
clause specifies the order of rows in each partition or result set. Since CUME_DIST()
function is order sensitive, the order_by_clause
is mandatory.
The order_by_clause
has the following syntax:
ORDER BY expression1 [ASC | DESC ] [NULLS FIRST | LAST] [, expression2 [ASC | DESC ] [NULLS FIRST | LAST],... ]
Code language: SQL (Structured Query Language) (sql)
The PARTITION BY
clause divides the result set into multiple partitions. The partition_clause
has the following form:
PARTITION BY expression1 [,expression2,..]
Code language: SQL (Structured Query Language) (sql)
The partition by clause is optional. If you omit this clause, the CUME_DIST()
function will treat the whole result set as a single partition.
The CUME_DIST
function uses the following formula for calculation:
CUME_DIST = (number of rows with value <= current value) / total rows
Code language: SQL (Structured Query Language) (sql)
It returns a result that is between 0.0
(0%
) and 1.0
(100%
). The tied values receive the same cumulative distribution.
Basic Oracle CUME_DIST function example #
First, create a new table called scores
with two columns id
and score
:
CREATE TABLE scores ( id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY, score NUMBER NOT NULL );
Code language: SQL (Structured Query Language) (sql)
Second, insert some rows into the scores
table:
INSERT INTO scores(score) VALUES(80); INSERT INTO scores(score) VALUES(90); INSERT INTO scores(score) VALUES(90); INSERT INTO scores(score) VALUES(100);
Code language: SQL (Structured Query Language) (sql)
Third, retrieve data from the scores
table:
SELECT * FROM scores;
Code language: SQL (Structured Query Language) (sql)
Output:
ID SCORE ---------- ---------- 1 80 2 90 3 90 4 100
Code language: SQL (Structured Query Language) (sql)
Finally, calculate the cumulative distribution of a score within a set of scores:
SELECT id, score, CUME_DIST() OVER ( ORDER BY score ) AS cume_dist FROM scores;
Code language: SQL (Structured Query Language) (sql)
Output:
ID SCORE CUME_DIST ---------- ---------- ---------- 1 80 .25 2 90 .75 3 90 .75 4 100 1
Code language: SQL (Structured Query Language) (sql)
How it works:
In this example, the result set has 4 rows. The ORDER BY
clause in the CUME_DIST
function sorts the scores from low to high.
Row | Score | CUME_DIST | Value |
---|---|---|---|
1 | 80 | 1/4 (only 1 row ≤ 80) | 0.25 |
2 | 90 | 3/4 (3 rows ≤ 90) | 0.75 |
3 | 90 | 3/4 (same as above, tied value) | 0.75 |
4 | 100 | 4/4 (all rows ≤ 100) | 1.00 |
Using Oracle CUME_DIST() function over a result set #
The following statement uses CUME_DIST()
function to calculate the sales percentile for each salesman in 2017:
SELECT salesman_id, sales, ROUND(CUME_DIST() OVER (ORDER BY sales DESC) * 100,2) || '%' cume_dist FROM salesman_performance WHERE year = 2017;
Code language: SQL (Structured Query Language) (sql)
Here is the result:

As shown in the output, 33.33 % of salesman have sales amounts greater than 1.99 million.
Using Oracle CUME_DIST() function over partitions #
The following statement uses the CUME_DIST
function to calculate the sales percentile for each salesman in 2016
and 2017
.
SELECT salesman_id, year, sales, ROUND(CUME_DIST() OVER ( PARTITION BY year ORDER BY sales DESC ) * 100,2) || '%' cume_dist FROM salesman_performance WHERE year in (2016, 2017);
Code language: SQL (Structured Query Language) (sql)
Here is the output:

In this example:
- The
PARTITION BY
clause divides the result set into two partitions by year, 2016 and 2017. - The
ORDER BY
clause sorts the rows in each partition by sales amount in descending order to which theCUME_DIST()
function applies.
To get the top 30% of the salesman by sales revenue in 2016 and 2017, you use the following query:
WITH cte_sales AS ( SELECT salesman_id, year, sales, ROUND(CUME_DIST() OVER ( PARTITION BY year ORDER BY sales DESC ),2) cume_dist FROM salesman_performance WHERE year in (2016,2017) ) SELECT * FROM cte_sales WHERE cume_dist <= 0.30;
Code language: SQL (Structured Query Language) (sql)
Output:

Summary #
- Use the Oracle
CUME_DIST()
function to calculate the cumulative distribution of a value in a set of values.