How Indexes Actually Speed Up Queries - Open Source Software News

What your database is really doing while you wait

The query looks innocent.

SELECT *
FROM users
WHERE email = '[email protected]';

You hit Enter.
The database pauses.
Your app feels… sluggish.

You add an index, rerun the query, and suddenly it’s instant.

Most engineers stop there.

But why did that happen?
What changed inside the database engine?

Let’s pop the hood.

The Naive Mental Model (and Why It’s Wrong)

A common explanation goes like this:

“An index helps the database jump directly to the row.”

That sounds reasonable — but it’s not what actually happens.

Databases don’t “jump” inside tables. Rows are scattered across disk pages. Without extra structures, the engine has no idea where a specific value lives.

So let’s start with what really happens.

Life Without an Index: The Full Table Scan

With no index on email, the database has exactly one option: read everything.

SELECT *
FROM users
WHERE email = '[email protected]';

Execution plan (PostgreSQL)

Seq Scan on users
  Filter: (email = '[email protected]')

What this actually means

Read page 1 → check every row
Read page 2 → check every row
Repeat until the table ends

This is a sequential scan.

Why it’s slow

Disk I/O dominates
CPU evaluates the condition for every row
Work grows linearly with table size

Time complexity:

O(n)

For large tables, this is unavoidable — and painful.

What an Index Really Is

An index is not a shortcut inside your table.

It is a separate physical data structure, stored independently on disk.

In most relational databases, that structure is a B-Tree.

Conceptually, each index entry looks like:

email_value → (page_id, row_offset)

Key points:

The index is sorted
The table remains unchanged
The index provides an alternative access path

Think of it as a highly optimized lookup table pointing to your data, not containing it.

The Same Query With an Index

Now let’s add an index:

CREATE INDEX idx_users_email ON users(email);

Execution plan

Index Scan using idx_users_email on users
  Index Cond: (email = '[email protected]')

This single line tells an entirely different story.

What the Database Actually Does With an Index

Step 1: Traverse the B-Tree

Start at the root node
Move through internal nodes
Reach the leaf node containing ‘[email protected]’

Each step is one page read.

Step 2: Fetch the row

Index returns a row pointer (CTID / RowID)
Database reads the table page
Row is returned

Why this is fast

B-Trees are shallow (usually 3–4 levels)
Each node stores many keys per page
Disk seeks are minimized

Time complexity:

O(log n)

Even with millions of rows, this often means single-digit page reads.

Comparing the Plans Side by Side

Without index

Seq Scan on users
  Filter: (email = '[email protected]')

Reads the entire table
CPU-heavy
Disk-bound

With index

Index Scan using idx_users_email on users
  Index Cond: (email = '[email protected]')

Reads a handful of index pages
One table lookup
Predictable performance

This is the real speedup, not magic, just less work.

Index-Only Scans: When the Table Isn’t Touched

Sometimes the database doesn’t even need the table.

Query

SELECT email
FROM users
WHERE email = '[email protected]';

Execution plan

Index Only Scan using idx_users_email on users
  Index Cond: (email = '[email protected]')

Because the index already contains email, the engine can answer the query without touching the table at all.

This is about as fast as it gets.

Why Indexes Make ORDER BY and JOIN Faster

ORDER BY

Indexes are already sorted.

SELECT *
FROM users
ORDER BY email;

If email is indexed:

No sorting step
Rows are returned in order

The execution plan simply walks the index.

JOIN

Indexes shine in joins.

SELECT *
FROM orders o
JOIN users u ON o.user_id = u.id;

With an index on users(id):

The database finds matching rows quickly
Avoids nested full scans
Enables efficient join strategies

Indexes don’t just speed up queries — they change the shape of execution plans.

The Hidden Cost: Why Writes Get Slower

Every index comes with a price.

On INSERT, UPDATE, or DELETE, the database must:

Modify the table
Update each index
Potentially rebalance B-Trees

Result

Reads → faster
Writes → slower
Too many indexes → write amplification

This is why high-write systems are selective about indexing.

When Indexes Don’t Help (and Are Ignored)

Databases are smart enough not to use indexes when they’re pointless.

Common cases:

Low selectivity
(e.g., a status column where 90% of rows match)
Functions on indexed columns

WHERE LOWER(email) = '[email protected]'

Wrong column order in composite indexes
Planner estimates a sequential scan is cheaper

An index is only useful if it meaningfully reduces work.

The Mental Model to Keep

Indexes don’t make queries fast by coincidence.

They:

Replace linear scans with logarithmic lookups
Minimize disk I/O
Enable better execution strategies

Next time you add an index, ask yourself:

Which work am I removing — and what new work am I adding?

That’s the difference between using indexes and understanding them.

If you want to go deeper into this , especially for interviews, Grokking the SQL Interview by javinpaul is one of the best resources I’ve found. It trains you to reason about queries, indexes, and performance the way real database engines do, not just pass syntax checks.

link: https://gumroad.com/a/1036063859/fhmehw

How Indexes Actually Speed Up Queries was originally published in Javarevisited on Medium, where people are continuing the conversation by highlighting and responding to this story.

This post first appeared on Read More

Life Without an Index: The Full Table Scan

Execution plan (PostgreSQL)

What this actually means

Why it’s slow

What an Index Really Is

Conceptually, each index entry looks like:

The Same Query With an Index

Execution plan

What the Database Actually Does With an Index

Step 1: Traverse the B-Tree

Step 2: Fetch the row

Why this is fast

Comparing the Plans Side by Side

Without index

With index

Index-Only Scans: When the Table Isn’t Touched

Query

Execution plan

Why Indexes Make ORDER BY and JOIN Faster

ORDER BY

JOIN

The Hidden Cost: Why Writes Get Slower

Result

When Indexes Don’t Help (and Are Ignored)

Common cases:

The Mental Model to Keep

Related Posts