Some Interesting Things About HashMap Indexing in Java

Understanding how (n — 1) & hash and h ^ (h >>> 16) affect collision resolution and performance in Java HashMaps.

When a HashMap calculates the bucket index, it does this:

index = (n - 1) & hash

This is the core of how HashMap calculates the bucket index for a given key. It uses a bitmask instead of modulo % for speed.

Why use (n — 1) & hash ?

In Java, HashMap tables are always sized to a power of two e.g., 16, 32, 64, 128, .. (not random sizes like 10 or 15).

Why?

Because when n is a power of two, say n = 2^k, then

  • n in binary looks like 1 followed by k zeros
  • So n — 1 becomes k ones in the lowest bits

This makes (n — 1) a perfect bitmask for extracting the lowest k bits of any hash.

Example 1: n = 16

n        = 16        = 0001 0000  (binary)
n - 1 = 15 = 0000 1111 → last 4 bits = 1

This means:

index =  0b00001111 & hash

This will give a value between 0 and 15 (0 to n — 1).

Example 2: n = 64

n        = 64        = 0100 0000
n - 1 = 63 = 0011 1111

The mask 0b00111111 extracts the last 6 bits of the hash.

index = hash & 0b00111111  // gives value 0–63

Again, all values in range [0, n-1] are possible and evenly spread (if hash is good).

But What If the Low Bits Are Always Zero ?

That only works if your hashCode() provides enough randomness in the low bits used for bucket indexing. But in practice, many bad hashCode() implementations end up producing hashes where the low bits are all zero.

Let’s say your key class does this:

@override
public int hashCode() {
    return id << 4; // shift left by 4 = multiply by 16
}

For id = 1, 2, 3, 4

id   hash     binary (8-bit)      last 4 bits
-- ------ ----------------- ------------
1 16 0001 0000 0000
2 32 0010 0000 0000
3 48 0011 0000 0000
4 64 0100 0000 0000

No matter how many keys you add, the last 4 bits are always zero. So all keys collide in bucket 0.

This is Why Java Does h ^ (h >>> 16)

To avoid this low-bit disaster, Java adds a hash “spreader”:

int h = hashCode();
int mixed = h ^ (h >>> 16);
  • The upper 16 bits of the original h are shifted down into the lower bits
  • Then XOR’d into the original h
  • So even if the original low bits were just zeros, the new low bits will be more unique because they now include info from the upper bits.

Example 1: Well-Distributed Hash: 0x12345678

int h = 0x12345678;

Step 1: Binary representation of h

First, convert 0x12345678 to 32-bit binary:

h = 0x12345678
= 0001 0010 0011 0100 0101 0110 0111 1000

Split into:

  • High 16 bits: 0001 0010 0011 0100 → 0x1234
  • Low 16 bits: 0101 0110 0111 1000 → 0x5678

Step 2: h >>> 16

Logical right shift by 16 bits moves the high 16 bits down into the low 16:

h >>> 16 = 0x00001234
= 0000 0000 0000 0000 0001 0010 0011 0100

Now:

  • High 16 = all zeros
  • Low 16 = 0x1234 = 0001 0010 0011 0100

Step 3: XOR h ^ (h >>> 16)

XOR compares each bit:

  • If bits are same → result 0
  • If bits are different → result 1
h          = 0001 0010 0011 0100 0101 0110 0111 1000   (0x12345678)
h >>> 16 = 0000 0000 0000 0000 0001 0010 0011 0100 (0x00001234)
-----------------------------------------------------
XOR result = 0001 0010 0011 0100 0100 0100 0100 1100 (0x1234444C)

The high bits remain, but the low 16 bits are now altered.

Example 2 : Poorly Distributed Hash — 0xABCD0000

Now let’s walk through a hash with all-zero low bits, such as what you might get from id << 16.

int h = 0xABCD0000;

Step 1:

h = 0xABCD0000
= 1010 1011 1100 1101 0000 0000 0000 0000
  • High 16 bits = 1010 1011 1100 1101 → 0xABCD
  • Low 16 bits = 0000 0000 0000 0000 → all zeros

Without mixing, the low bits would result in repeated bucket index values usually 0.

Step 2: Logical Right Shift (>>> 16)

h        = 1010 1011 1100 1101 0000 0000 0000 0000
h >>> 16 = 0000 0000 0000 0000 1010 1011 1100 1101
---------------------------------------------------
mixed = 1010 1011 1100 1101 1010 1011 1100 1101
= 0xABCDABCD

Step 3: XOR

h        = 1010 1011 1100 1101 0000 0000 0000 0000
h >>> 16 = 0000 0000 0000 0000 1010 1011 1100 1101
---------------------------------------------------
mixed = 1010 1011 1100 1101 1010 1011 1100 1101
= 0xABCDABCD

Now the low bits are no longer zero.

low 16 = 1010 1011 1100 1101 → 0xABCD

Impact on HashMap Indexing

With tableSize = 16

  • Without mixing:
index = 0xABCD0000 & 0x0F = 0 → always bucket 0
  • With mixing:
index = 0xABCDABCD & 0x0F = 13 →  well-distributed

Why This Matters

Even if your original hashCode() generates predictable or zeroed low bits, Java’s h ^ (h >>> 16) mixing step ensures that the low bits used for indexing still vary, reducing collisions and improving performance.

Demonstration: What Happens When You Skip Mixing

Let’s simulate what happens if your hash values are poorly distributed specifically when the low bits are always zero, like from:

h = id << 16;

We’ll track which bucket each key lands in both with and without mixing.

 import java.util.Arrays;

public class HashMixingHistogram {
static int idxNoMix(int h, int cap) { return (cap - 1) & h; }
static int idxWithMix(int h, int cap) { h ^= (h >>> 16); return (cap - 1) & h; }

public static void main(String[] args) {
int capacity = 64; // power of two
int n = 10_000;

int[] bucketsNoMix = new int[capacity];
int[] bucketsWithMix = new int[capacity];

// Bad hashes: h = id << 16 (low bits zero)
for (int id = 1; id <= n; id++) {
int h = id << 16;

bucketsNoMix[idxNoMix(h, capacity)]++;
bucketsWithMix[idxWithMix(h, capacity)]++;
}

System.out.println("NoMix (expect heavy skew): " + Arrays.toString(bucketsNoMix));
System.out.println("WithMix (expect spread): " + Arrays.toString(bucketsWithMix));
}
}

Output

NoMix (expect heavy skew): [10000, 0, 0, 0, ..., 0]
WithMix (expect spread): [156, 157, 157, 157, ..., 156]

Even though the hashCode() was flawed, the XOR-based mixing still distributed keys across all 64 buckets just as intended.

TL;DR: HashMap relies on low bits for fast indexing via (n — 1) & hash, so it mixes in the upper bits with h ^ (h >>> 16) to ensure even distribution even when hashCode() is poorly implemented.


Some Interesting Things About HashMap Indexing in Java was originally published in Javarevisited on Medium, where people are continuing the conversation by highlighting and responding to this story.

This post first appeared on Read More