Top N rows from each group
Sample data
| id | name | dep | salary |
|---|---|---|---|
| 1 | Mike | Management | 5000 |
| 2 | Devon | Accounting | 1000 |
| 3 | Jane | Accounting | 1500 |
| 4 | Chiara | IT | 2000 |
| 5 | Alex | IT | 3000 |
| 6 | Tim | IT | 3000 |
Problem
For each department, find the N employee(s) with the highest salary.
Solutions
select distinct on (dep)
*
from
employees
order by
dep
, salary desc
id | name | dep | salary
----+------+------------+--------
3 | Jane | Accounting | 1500
5 | Alex | IT | 3000
1 | Mike | Management | 5000
Note
With DISTINCT ON we are limited to exactly the first row from each group.
If there are multiple rows with the same salary in a group,
provide additional ORDER BY columns to resolve the ambiguity.
Note
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).
ORDER BY will usually contain additional expression(s)
that determine the desired precedence of rows within each group.
DISTINCT ON accepts multiple columns.
select distinct on (department, sub_department)
...
DISTINCT ON is a PostgreSQL extension to the SQL standard.
select
e.*
from
employees e
, lateral (
select
*
from
employees
where
dep = e.dep
order by salary desc
limit 1 -- or limit N
) sq
where
e.id = sq.id
order by
dep, salary desc
id | name | dep | salary
----+------+------------+--------
3 | Jane | Accounting | 1500
5 | Alex | IT | 3000
1 | Mike | Management | 5000
Note
If there are multiple rows with the same salary in a group,
provide additional ORDER BY columns to the lateral subquery
to resolve the ambiguity.
Note
The , in the FROM clause is shorthand for CROSS JOIN.
select
*
from
(
select
*
, rank() over (partition by dep order by salary desc) rank
from
employees
) sq
where
rank = 1 -- or rank <= N
id | name | dep | salary | rank
----+------+------------+--------+------
3 | Jane | Accounting | 1500 | 1
5 | Alex | IT | 3000 | 1
6 | Tim | IT | 3000 | 1
1 | Mike | Management | 5000 | 1
select
*
from
employees e
where
id = (
select
id
from
employees
where
dep = e.dep
order by salary desc
limit 1 -- or limit N
)
)
id | name | dep | salary
----+------+------------+--------
1 | Mike | Management | 5000
3 | Jane | Accounting | 1500
5 | Alex | IT | 3000
Note
If there are multiple rows with the same salary in a group,
provide additional ORDER BY columns to the subquery
to resolve the ambiguity.