Chapter 15 of 18

Window Functions

Perform advanced analytics without collapsing rows — RANK, ROW_NUMBER, DENSE_RANK, LAG, LEAD, running totals, and moving averages.

Meritshot10 min read
SQLWindow FunctionsRANKROW_NUMBERLAGLEADOVERPARTITION BY
All SQL Chapters

What Are Window Functions?

Window functions perform calculations across a "window" of rows related to the current row — without collapsing rows into groups the way GROUP BY does. Each row in the result retains its identity; the window function adds a computed column alongside it.

-- GROUP BY collapses rows (one output per group)
SELECT department, AVG(salary) FROM employees GROUP BY department;
-- Returns 3 rows (one per department)

-- Window function keeps all rows + adds a computed column
SELECT name, department, salary,
       AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;
-- Returns 8 rows (one per employee) with dept_avg added

Window functions are one of the most powerful features in modern SQL — they enable rankings, running totals, moving averages, and period-over-period comparisons without subqueries or self-joins.

Syntax

function_name() OVER (
    [PARTITION BY col1, col2, ...]
    [ORDER BY col3 [ASC|DESC]]
    [ROWS/RANGE BETWEEN ... AND ...]
)
  • OVER () — marks it as a window function; empty = entire table is the window
  • PARTITION BY — divide rows into groups (windows); like GROUP BY but without collapsing
  • ORDER BY — order within each partition (required for ranking and frame functions)
  • ROWS/RANGE BETWEEN — define the frame (subset of the window) for each row

Sample Data

CREATE TABLE employees (
    id         INTEGER PRIMARY KEY,
    name       TEXT,
    department TEXT,
    salary     REAL,
    hire_date  TEXT
);

INSERT INTO employees VALUES
(1,  'Priya Sharma',  'Finance',    75000, '2022-03-15'),
(2,  'Raj Patel',     'Technology', 92000, '2021-07-01'),
(3,  'Meera Singh',   'Marketing',  68000, '2023-01-10'),
(4,  'Arjun Nair',    'Finance',    81000, '2020-11-22'),
(5,  'Sunita Rao',    'Technology', 88000, '2022-09-05'),
(6,  'Dev Kumar',     'Marketing',  72000, '2021-04-18'),
(7,  'Kiran Mehta',   'Finance',    95000, '2019-06-30'),
(8,  'Ananya Das',    'Technology', 63000, '2024-02-14'),
(9,  'Vikram Joshi',  'Marketing',  77000, '2020-08-12'),
(10, 'Ritu Bansal',   'Finance',    69000, '2023-07-01');

Ranking Functions

ROW_NUMBER()

Assigns a unique sequential number to each row within the partition. No ties — even if two rows have equal values, they get different numbers:

SELECT
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees
ORDER BY department, row_num;

Result:

name         | department | salary | row_num
-------------+------------+--------+--------
Kiran Mehta  | Finance    | 95000  | 1
Arjun Nair   | Finance    | 81000  | 2
Priya Sharma | Finance    | 75000  | 3
Ritu Bansal  | Finance    | 69000  | 4
Raj Patel    | Technology | 92000  | 1
Sunita Rao   | Technology | 88000  | 2
Ananya Das   | Technology | 63000  | 3
Vikram Joshi | Marketing  | 77000  | 1
Dev Kumar    | Marketing  | 72000  | 2
Meera Singh  | Marketing  | 68000  | 3

Use case: Paginate results per group; assign unique IDs.

RANK()

Like ROW_NUMBER but gives the same rank to tied rows. Numbers skip after a tie:

SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;
-- If two people have the same salary, they both get rank 2, and next is rank 4

DENSE_RANK()

Like RANK but numbers don't skip:

SELECT name, salary,
       RANK()       OVER (ORDER BY salary DESC) AS rank,
       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
salaryRANKDENSE_RANK
9500011
9200022
9200022
8800043

NTILE(n)

Divides rows into n buckets of roughly equal size:

-- Divide employees into 4 salary quartiles
SELECT name, salary,
       NTILE(4) OVER (ORDER BY salary) AS quartile
FROM employees;

Employees in quartile 1 are the bottom 25%, quartile 4 are the top 25%.

PERCENT_RANK() and CUME_DIST()

SELECT
    name,
    salary,
    ROUND(PERCENT_RANK() OVER (ORDER BY salary), 2) AS pct_rank,
    ROUND(CUME_DIST()    OVER (ORDER BY salary), 2) AS cum_dist
FROM employees;
  • PERCENT_RANK: percentage of rows with strictly lower rank (0 to 1)
  • CUME_DIST: percentage of rows with value ≤ current row (0 to 1)

Top-N Per Group

Classic use of ROW_NUMBER — find the top N records in each group:

-- Top earner in each department
SELECT name, department, salary
FROM (
    SELECT
        name, department, salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
) ranked
WHERE rn = 1;
name        | department | salary
------------+------------+-------
Kiran Mehta | Finance    | 95000
Raj Patel   | Technology | 92000
Vikram Joshi| Marketing  | 77000
-- Top 2 earners per department
WHERE rn <= 2

Aggregate Window Functions

Any aggregate function can be used as a window function by adding OVER:

SELECT
    name,
    department,
    salary,
    AVG(salary)   OVER (PARTITION BY department) AS dept_avg,
    MAX(salary)   OVER (PARTITION BY department) AS dept_max,
    MIN(salary)   OVER (PARTITION BY department) AS dept_min,
    COUNT(*)      OVER (PARTITION BY department) AS dept_headcount,
    SUM(salary)   OVER (PARTITION BY department) AS dept_payroll,
    ROUND(salary / SUM(salary) OVER (PARTITION BY department) * 100, 1) AS pct_of_dept
FROM employees
ORDER BY department, salary DESC;
name         | dept       | salary | dept_avg | dept_max | pct_of_dept
-------------+------------+--------+----------+----------+------------
Kiran Mehta  | Finance    | 95000  | 80000    | 95000    | 29.8
Arjun Nair   | Finance    | 81000  | 80000    | 95000    | 25.4
Priya Sharma | Finance    | 75000  | 80000    | 95000    | 23.5
Ritu Bansal  | Finance    | 69000  | 80000    | 95000    | 21.7
...

LAG and LEAD

Access a value from a previous (LAG) or subsequent (LEAD) row within the partition:

-- LAG: value from previous row
LAG(column, offset, default) OVER (PARTITION BY ... ORDER BY ...)

-- LEAD: value from next row
LEAD(column, offset, default) OVER (PARTITION BY ... ORDER BY ...)
-- Period-over-period comparison (month by month)
CREATE TABLE monthly_revenue (
    month   TEXT,
    revenue REAL
);
INSERT INTO monthly_revenue VALUES
('2026-01', 120000), ('2026-02', 135000), ('2026-03', 118000),
('2026-04', 142000), ('2026-05', 158000), ('2026-06', 145000);

SELECT
    month,
    revenue,
    LAG(revenue, 1, 0)  OVER (ORDER BY month) AS prev_month_revenue,
    revenue - LAG(revenue, 1, 0) OVER (ORDER BY month) AS mom_change,
    ROUND(
        (revenue - LAG(revenue, 1) OVER (ORDER BY month))
        / LAG(revenue, 1) OVER (ORDER BY month) * 100,
    1) AS mom_pct_change
FROM monthly_revenue;

Result:

month   | revenue | prev_month | mom_change | mom_pct_change
--------+---------+------------+------------+---------------
2026-01 | 120000  | 0          | 120000     | NULL
2026-02 | 135000  | 120000     | 15000      | 12.5
2026-03 | 118000  | 135000     | -17000     | -12.6
2026-04 | 142000  | 118000     | 24000      | 20.3
2026-05 | 158000  | 142000     | 16000      | 11.3
2026-06 | 145000  | 158000     | -13000     | -8.2

Running Totals and Moving Averages (ROWS BETWEEN)

The ROWS BETWEEN clause defines a frame — the subset of rows to include in the aggregate:

SELECT
    month,
    revenue,
    -- Running total (cumulative sum up to current row)
    SUM(revenue) OVER (ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
        AS running_total,
    -- 3-month moving average
    ROUND(AVG(revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 0)
        AS moving_avg_3m
FROM monthly_revenue;

Result:

month   | revenue | running_total | moving_avg_3m
--------+---------+---------------+--------------
2026-01 | 120000  | 120000        | 120000
2026-02 | 135000  | 255000        | 127500
2026-03 | 118000  | 373000        | 124333
2026-04 | 142000  | 515000        | 131667
2026-05 | 158000  | 673000        | 139333
2026-06 | 145000  | 818000        | 148333

Frame Boundaries

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW   -- from start to current row (cumulative)
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW           -- current + 2 previous rows
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING   -- current row to end
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING           -- current + one on each side (centered average)
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING  -- entire partition (same as no frame)

FIRST_VALUE and LAST_VALUE

SELECT
    name,
    department,
    salary,
    FIRST_VALUE(name) OVER (PARTITION BY department ORDER BY salary DESC) AS top_earner,
    LAST_VALUE(name)  OVER (
        PARTITION BY department ORDER BY salary DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS lowest_earner
FROM employees
ORDER BY department, salary DESC;

Note: LAST_VALUE requires expanding the frame to UNBOUNDED FOLLOWING, otherwise it defaults to the current row and returns the current name.

Practical Examples

Example 1: Employee Salary Ranking Within Company

SELECT
    name,
    department,
    salary,
    RANK()       OVER (ORDER BY salary DESC) AS company_rank,
    RANK()       OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
    ROUND(salary / SUM(salary) OVER () * 100, 1) AS pct_of_payroll
FROM employees
ORDER BY company_rank;

Example 2: Identify Who Earns More Than Their Manager

WITH emp AS (
    SELECT
        e.id,
        e.name,
        e.salary,
        m.name   AS manager_name,
        m.salary AS manager_salary
    FROM employees e
    LEFT JOIN employees m ON e.manager_id = m.id
)
SELECT name, salary, manager_name, manager_salary
FROM emp
WHERE salary > manager_salary;

Example 3: Percentile Salary Bands

SELECT
    name,
    salary,
    NTILE(10) OVER (ORDER BY salary) AS decile,
    ROUND(PERCENT_RANK() OVER (ORDER BY salary) * 100, 1) AS percentile_rank
FROM employees
ORDER BY salary;

Common Mistakes

1. Trying to use window functions in WHERE

-- Wrong — window functions aren't allowed in WHERE
SELECT name, salary, RANK() OVER (ORDER BY salary) AS rnk
FROM employees
WHERE rnk <= 3;   -- Error!

-- Correct — wrap in a subquery or CTE
SELECT * FROM (
    SELECT name, salary, RANK() OVER (ORDER BY salary) AS rnk
    FROM employees
) r
WHERE rnk <= 3;

2. LAST_VALUE returning unexpected results

-- Without frame, LAST_VALUE only looks at rows up to current
-- which means it returns the current row's value
LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary)  -- WRONG

-- Fix: expand the frame
LAST_VALUE(salary) OVER (
    PARTITION BY department ORDER BY salary
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)

3. Confusing RANK and DENSE_RANK with tied values

Use RANK when "gaps" in numbering are acceptable; use DENSE_RANK when you want consecutive numbers regardless of ties.

Practice Exercises

  1. Assign each employee a rank within their department based on salary (highest = rank 1).
  2. Show each employee's salary alongside their department's average, minimum, and maximum salary.
  3. Calculate a running total of salaries when ordered by hire date (oldest hire first).
  4. Using LAG, show each employee's salary and the salary of the person hired just before them.
  5. Find the top 2 earners in each department using ROW_NUMBER.

Summary

In this chapter you learned:

  • Window functions add a computed column per row without collapsing rows
  • OVER () marks a function as a window function
  • PARTITION BY divides rows into groups (partitions)
  • ORDER BY within OVER defines row order within each partition
  • Ranking: ROW_NUMBER() (unique), RANK() (gaps on ties), DENSE_RANK() (no gaps), NTILE(n)
  • Offset: LAG(col, n) (n rows back), LEAD(col, n) (n rows ahead)
  • Aggregate windows: SUM(), AVG(), COUNT(), MIN(), MAX() with OVER
  • ROWS BETWEEN x AND y — define the frame for running/moving aggregates
  • Top-N per group: use ROW_NUMBER in a subquery, filter WHERE rn <= N
  • Window functions cannot appear in WHERE — wrap in a subquery or CTE

Next up: CTEs & Set Operations — write cleaner queries with WITH clauses and combine result sets with UNION, INTERSECT, EXCEPT.