Understanding HashMap Internal Working

Deep Dive into HashMap Structure, Bucket Management and Collision Handling

A HashMap is a data structure that implements the Map interface and allows us to store key-value pairs. Let’s break down how it works internally.

Basic Structure

At its core, a HashMap uses an array of “buckets” to store data. Each bucket can hold multiple key-value pairs through a linked list or tree structure.

Basic Structure — HashMap

What Does a Bucket Contain in a HashMap?

A bucket in a HashMap is simply a slot in an array where entries are stored.

Each bucket can contain one or more entries. An entry is an object that holds the key-value pair and sometimes additional information like the hash code and pointer to the next entry (in case of collisions).

Inside illustration of the Entry — HashMap
  1. Hash Code — The value that the hash function generates from the key to determine where to store it in the bucket.
  2. Key — The unique identifier for the data you want to store.
  3. Value — The actual data associated with the key.
  4. Next (if collision occurs) — If another entry with the same hash code is added, a linked list is formed, and the next entry points to the next one.

How It Works

Put Method Implementation

  1. Calculate hash code of the key using key.hashCode()
  2. Apply hash function to determine bucket index: index = hash & (n-1)
  3. If bucket is empty:
  • Create new Node/Entry and place it there

4. If bucket is not empty:

  • Check if key already exists by comparing hash codes and using equals()
  • If key exists, replace old value with new value
  • If key doesn’t exist, add new Node/Entry to the chain
PUT Method — HashMap

Get Method Implementation

  1. Calculate hash code of the key using key.hashCode()
  2. Find bucket using same hash function: index = hash & (n-1)
  3. If bucket is empty:
  • Return null

4. If bucket is not empty:

  • Traverse the chain (linked list or tree)
  • Compare hash codes and use equals() to find matching key
  • Return value if key found, null otherwise
Get Method — HashMap

⚡ Performance Tip: Both get() and put() methods have O(1) average time complexity, but can degrade to O(log n) or O(n) in case of many collisions.

Collision Handling

A collision happens when two different keys hash to the same index (bucket) in a HashMap. Since a HashMap uses a hash function to determine where to store values, sometimes different keys can end up in the same spot.

In older versions of Java, the HashMap handles collisions by storing the entries as a LinkedList in that bucket. In newer versions (Java 8 and later), if a bucket has too many entries, the HashMap converts the linked list to a Red Black Tree for better performance. This helps maintain efficient operations, even when many keys collisions in the same bucket.

Why a Tree Structure Instead of a Linked List?

  • Linked List: In traditional chaining, buckets with collisions store entries in a linked list. If many keys hash to the same bucket, the linked list can grow long, causing O(n) search times for operations like get() or put(), where nis the number of entries.
  • Tree Structure (Red-Black Tree): In Java 8 and later, when the linked list in a bucket exceeds a threshold (typically 8 entries), the HashMap converts it into a red-black tree. This tree is a balanced binary search tree, maintaining a logarithmic height (O(log n)), which allows for much faster search times compared to a linked list, especially as the number of entries grows.

This improvement optimizes the performance of HashMap in scenarios with many collisions.

Important Characteristics

  • Initial Capacity: Default size of the underlying array (16 in Java)
  • Load Factor: Threshold that determines when to resize (0.75 by default)
  • Rehashing: Process of creating a new larger array when load factor is exceeded

Performance

Average case time complexity:

  • Put operation: O(1)
  • Get operation: O(1)
  • Remove operation: O(1)

Worst case (many collisions): O(log n) with tree structure, O(n) with linked list

💡 Remember: A good hashCode() implementation is crucial for HashMap performance, as it helps distribute entries more evenly across buckets, reducing the likelihood of collisions and improving retrieval times.


Understanding HashMap Internal Working was originally published in Javarevisited on Medium, where people are continuing the conversation by highlighting and responding to this story.

This post first appeared on Read More