Indexes speed up data retrieval by maintaining a separate, sorted data structure that maps column values to row locations. Good indexing is the fastest way to fix slow database queries without changing your architecture.
Without an index, a database performs a full table scan — reading every row to find matches. With a billion rows, that's fatal to performance. An index lets the database jump directly to matching rows in O(log N) time.
-- Without index: full table scan across billions of rows
SELECT * FROM users WHERE email = 'alice@example.com';
-- With index on email: O(log N) lookup
CREATE INDEX idx_users_email ON users(email);
Most databases default to B-Tree indexes. A balanced tree structure sorted by the indexed column.
Supports:
WHERE user_id = 123WHERE age BETWEEN 20 AND 30WHERE name LIKE 'Ali%'Doesn't support: Suffix matching (WHERE name LIKE '%son'), inequality on multiple columns simultaneously
Stores a hash of each key. Only useful for exact equality lookups — no range support.
Indexes on multiple columns together.
CREATE INDEX idx_user_created ON posts(user_id, created_at);
Crucial rule — leftmost prefix principle: The index is used only when the query filters on the leading columns of the index.
-- Uses index (user_id is leftmost)
SELECT * FROM posts WHERE user_id = 5 AND created_at > '2024-01-01';
-- Uses index (user_id alone)
SELECT * FROM posts WHERE user_id = 5;
-- CANNOT use index (skips user_id)
SELECT * FROM posts WHERE created_at > '2024-01-01';
A covering index includes all columns needed by a query, eliminating the need to look up the actual row.
-- If we frequently run this query:
SELECT user_id, created_at FROM posts WHERE user_id = 5;
-- A covering index on (user_id, created_at) serves the query entirely from the index
CREATE INDEX idx_covering ON posts(user_id, created_at);
Covering indexes can dramatically improve performance for read-heavy queries.
| Aspect | Effect | |---|---| | Read performance | Dramatically improved | | Write performance | Slightly degraded (index must be updated on every insert/update/delete) | | Storage | Additional disk space per index | | Too many indexes | Can hurt write-heavy workloads |
Rule: Only index columns you actually query on. Every index has a write penalty.
WHERE clauses with high selectivity (many distinct values)JOIN conditionsORDER BY or GROUP BY clausesMATCH AGAINST in MySQL, tsvector in PostgreSQL)WHERE active = true) — smaller and fasterLOWER(email))gender (2 values) is nearly useless; an index on email (billions of unique values) is very effectiveEXPLAIN ANALYZE is how you verify whether an index is being used