Module 6 — Window Functions
Why window functions exist
Go back to Module 3. You already know how to get "total revenue per customer" — that's GROUP BY. It collapses many rows into one row per group. You lose the individual rows.
But now ask a different question: "show me every single order, and next to each one, what percentage of that customer's total revenue does it represent?"
You can't answer that with GROUP BY. The moment you group by customer, the individual order rows are gone — averaged, summed, or grouped away. You need the detail row and the aggregate, at the same time, on the same line.
That is the one-sentence reason window functions exist: they let you compute an aggregate (or a rank, or a "look at the next row") without collapsing the rows you're computing it over.
GROUP BY is the class average printed once at the bottom of the page — one number, all individual student rows are gone. A window function is a column added next to every student's own row that shows "class average" or "your rank in class" — the student's row survives, and the aggregate rides along beside it. The "window" is the group of rows (the whole class, or just the students in Section A) that the calculation is allowed to "look at" while computing that value for one row.
When to use it
- You need detail rows and an aggregate/rank/comparison in the same result set (rank, running total, % of total, comparison to previous row).
- Top-N-per-group problems ("top 3 highest paid employees per department").
- Time-series analysis: running totals, moving averages, period-over-period change.
- Deduplication: keep exactly one row per group based on some rule (latest record, highest value).
- Gap detection, streak detection, sessionization — anything where "compare this row to the row before/after it" matters.
When NOT to use it
- If you only need one aggregated row per group and don't need the detail rows — plain
GROUP BYis simpler and usually cheaper. - If you need to filter on the aggregate directly in the same query without a wrapping query — you cannot put a window function in
WHERE(more on this trap later). - Very old database versions without window function support (rare today, but some legacy MySQL < 8.0 setups still exist) — you'd fall back to correlated subqueries or variables.
How it executes internally
Conceptually, the engine runs window functions after WHERE, GROUP BY, and HAVING have already produced the row set, but before the final ORDER BY and SELECT list are applied. For each row, it:
- Figures out which other rows belong to the same "window" (the
PARTITION BYgroup, or the whole result set if there's no partition). - Within that window, sorts by
ORDER BYif one is given. - Applies the frame clause (which subset of the partition — e.g. "from the start up to this row") if one is given.
- Computes the function (rank, sum, lag, etc.) over exactly that frame, and attaches the result to the current row — without deleting any other rows.