Chapter 12 of 18

Indexes & Query Performance

Speed up queries with indexes — understand how they work, when to use them, and how to avoid common pitfalls.

Meritshot9 min read
SQLIndexesPerformanceQuery OptimizationEXPLAIN
All SQL Chapters

What Is an Index?

An index is a separate data structure that the database maintains alongside your table to make lookups faster. Think of a book's index at the back — instead of reading every page to find "foreign keys", you jump straight to the page number.

Without an index, a query WHERE email = 'priya@meritshot.com' must scan every row in the table (a full table scan). With an index on email, the database jumps directly to the matching rows in microseconds — even on a table with millions of rows.

Full table scan:  O(n)  — checks every row
Index lookup:     O(log n) — binary search on the index structure

Types of Indexes

TypeDescription
B-treeDefault; balanced tree structure; great for =, >, <, BETWEEN, ORDER BY
HashOptimised for exact-match (=) only; not useful for ranges
Full-textText search within documents (LIKE '%word%' is slow; FTS indexes fix this)
PartialIndex only rows that match a condition
CompositeIndex on multiple columns together
UniqueIndex that also enforces uniqueness

Most of the time you'll use the default B-tree index.

Creating Indexes

Basic Syntax

CREATE INDEX idx_name ON table_name (column1);
CREATE UNIQUE INDEX idx_name ON table_name (column1);  -- unique + fast lookup
CREATE INDEX IF NOT EXISTS idx_name ON table_name (column1);

Examples

-- Index on email for login lookups
CREATE INDEX idx_employees_email ON employees (email);

-- Index on hire_date for date range queries
CREATE INDEX idx_employees_hire_date ON employees (hire_date);

-- Unique index (same effect as UNIQUE constraint but created separately)
CREATE UNIQUE INDEX idx_employees_email_unique ON employees (email);

Composite Indexes

An index on multiple columns:

-- Index on (department, salary) together
CREATE INDEX idx_dept_salary ON employees (department, salary);

This index is useful for queries filtering on both columns:

-- Uses the composite index efficiently
SELECT name FROM employees
WHERE department = 'Finance'
  AND salary > 75000;

The order of columns in a composite index matters — a (A, B) index helps queries filtering on A alone or on (A, B) together. It does not help queries filtering on B alone.

When Indexes Are Used (and When They're Not)

Index IS Used

-- Equality on indexed column
WHERE email = 'priya@meritshot.com'

-- Range on indexed column
WHERE salary BETWEEN 70000 AND 90000

-- Leftmost column of composite index
WHERE department = 'Finance'

-- ORDER BY on indexed column (may avoid sorting step)
ORDER BY hire_date DESC LIMIT 10

Index IS NOT Used

-- Function on indexed column destroys index usage
WHERE LOWER(email) = 'priya@meritshot.com'    -- LOWER() breaks it
WHERE YEAR(hire_date) = 2022                   -- YEAR() breaks it

-- Leading wildcard in LIKE — must scan
WHERE name LIKE '%Sharma'    -- leading % means "can be anything before"

-- Type mismatch
WHERE id = '5'              -- implicit cast can prevent index use

-- Very low cardinality column
WHERE is_active = 1         -- boolean with 2 values; full scan is often faster

Workarounds

-- Instead of LOWER(email), store email in lowercase
-- Or use a functional index (PostgreSQL)
CREATE INDEX idx_email_lower ON employees (LOWER(email));
SELECT * FROM employees WHERE LOWER(email) = 'priya@meritshot.com';  -- now uses index

-- Instead of YEAR(hire_date):
WHERE hire_date BETWEEN '2022-01-01' AND '2022-12-31'  -- range on raw column

EXPLAIN — Understanding Query Execution

EXPLAIN (or EXPLAIN QUERY PLAN in SQLite) shows how the database plans to execute your query:

-- SQLite
EXPLAIN QUERY PLAN
SELECT name, salary FROM employees WHERE email = 'priya@meritshot.com';

-- PostgreSQL
EXPLAIN ANALYZE
SELECT name, salary FROM employees WHERE email = 'priya@meritshot.com';

-- MySQL
EXPLAIN
SELECT name, salary FROM employees WHERE email = 'priya@meritshot.com';

Without index:

SCAN TABLE employees    -- reads every row

With index on email:

SEARCH TABLE employees USING INDEX idx_employees_email (email=?)    -- fast!

Look for these terms in the output:

  • SCAN / Seq Scan — full table scan (slow on large tables)
  • SEARCH USING INDEX / Index Scan — using an index (fast)
  • USING COVERING INDEX — all needed columns are in the index (very fast, no table lookup needed)

Dropping Indexes

DROP INDEX idx_employees_email;
DROP INDEX IF EXISTS idx_employees_email;

Automatically Created Indexes

Databases automatically create indexes for:

  • PRIMARY KEY — always indexed
  • UNIQUE constraint — always indexed
  • FOREIGN KEY — MySQL creates one automatically; PostgreSQL does not
-- No need to manually create:
CREATE TABLE employees (
    id    INTEGER PRIMARY KEY,  -- implicit index
    email TEXT    UNIQUE        -- implicit unique index
);

Covering Indexes

A covering index is one that contains all the columns needed by the query — the database never has to touch the actual table rows:

-- Query needs name, salary, and filters on department
SELECT name, salary FROM employees WHERE department = 'Finance';

-- Covering index for this query
CREATE INDEX idx_dept_cover ON employees (department, name, salary);
-- Index contains department (filter), name and salary (output)
-- Query answered entirely from the index — no table rows needed

Partial Indexes

Index only a subset of rows (PostgreSQL, SQLite):

-- Only index active employees (not inactive ones taking up index space)
CREATE INDEX idx_active_employees
ON employees (name)
WHERE is_active = 1;

-- Only effective when your query also has the same WHERE:
SELECT name FROM employees WHERE is_active = 1 AND name LIKE 'P%';

Performance Guidelines

When to Add an Index

  • Columns frequently used in WHERE, JOIN ON, ORDER BY, or GROUP BY
  • High-cardinality columns (many unique values) — email, ID, timestamps
  • Foreign key columns (to speed up joins)
  • Columns used in range queries (BETWEEN, >, <)

When NOT to Add an Index

  • Very small tables (full scan is fine for <1,000 rows)
  • Columns with very few unique values (e.g., is_active with only 1 or 0) — low cardinality
  • Columns rarely used in WHERE or JOIN
  • Tables with very frequent writes (each index adds overhead to INSERT/UPDATE/DELETE)

The Write Overhead Trade-off

Every index speeds up reads but slows down writes. When you INSERT a row, the database must update every index on that table:

  • 5 indexes on a table → every INSERT writes to 6 places (table + 5 indexes)
  • This is fine for read-heavy tables; problematic for write-heavy tables

Rule of Thumb

Start with no indexes beyond the PK and FKs. Profile slow queries with EXPLAIN. Add targeted indexes only for proven slow queries.

Practical Examples

Example 1: Optimise a Login Query

-- Slow without index on email (full scan on millions of rows)
SELECT id, name FROM users WHERE email = 'user@example.com';

-- Add an index
CREATE UNIQUE INDEX idx_users_email ON users (email);

-- Now the query is a fast index lookup

Example 2: Optimise a Date Range Report

-- Monthly report — slow without index
SELECT COUNT(*), SUM(amount)
FROM orders
WHERE order_date BETWEEN '2026-01-01' AND '2026-01-31';

-- Index on order_date
CREATE INDEX idx_orders_date ON orders (order_date);

Example 3: Optimise a Join

-- Join often on employee_id in orders
SELECT e.name, COUNT(o.id) AS order_count
FROM employees e
JOIN orders o ON e.id = o.employee_id
GROUP BY e.name;

-- Index the foreign key column (PostgreSQL doesn't auto-create it)
CREATE INDEX idx_orders_employee ON orders (employee_id);

Common Mistakes

1. Indexing every column

Too many indexes hurt write performance significantly. Only index what's proven slow.

2. Using a function on an indexed column

-- Breaks the index
WHERE UPPER(name) = 'PRIYA SHARMA'

-- Use a functional index instead (PostgreSQL)
CREATE INDEX idx_name_upper ON employees (UPPER(name));

3. Assuming indexes are always used

The query planner may choose a full scan if the table is small or the selectivity is low. Use EXPLAIN to verify.

4. Forgetting the left-prefix rule for composite indexes

-- Index on (A, B, C)
WHERE A = 1             -- uses index
WHERE A = 1 AND B = 2   -- uses index
WHERE A = 1 AND B = 2 AND C = 3  -- uses index
WHERE B = 2             -- CANNOT use this index (A is not specified)
WHERE B = 2 AND C = 3   -- CANNOT use this index

Practice Exercises

  1. Create an index on the employees table for the hire_date column. Then write a query that would use it.
  2. What is the difference between a regular index and a unique index?
  3. Create a composite index on (department, hire_date). Which queries would use it and which wouldn't?
  4. Run EXPLAIN on a query that filters by department. Does it use an index?
  5. Why should you NOT add an index on a column like is_active that only has two possible values?

Summary

In this chapter you learned:

  • An index is a separate data structure that makes lookups fast (O(log n) vs O(n))
  • Default index type: B-tree — works for =, ranges, ORDER BY
  • CREATE INDEX idx_name ON table (column) — add an index
  • Composite indexes: column order matters; only left-prefix queries benefit
  • Functions on indexed columns (e.g., LOWER(col)) usually break index usage
  • Use EXPLAIN / EXPLAIN ANALYZE to see if your query uses an index
  • PRIMARY KEY and UNIQUE constraints automatically create indexes
  • Trade-off: indexes speed reads, slow writes — don't over-index
  • Covering indexes include all query columns — avoid table row lookups entirely
  • Partial indexes (PostgreSQL/SQLite) index only rows matching a condition

Next up: Views — create reusable, named query results.