Most Spring Boot Developers Can’t Answer These Production Questions. Can You? (Part 2)

Senior interviews aren’t about annotations anymore. They’re about what happens when production gets interesting.

Most Spring Boot interviews start the same way.

What is dependency injection. What is auto-configuration. Difference between @Component and @Bean. What happens inside @Transactional.

Most senior engineers can answer these in seconds. Some can answer them in their sleep.

That’s not what separates senior developers anymore.

The real gap appears when Spring Boot starts behaving differently in production than it did locally.

When latency suddenly spikes with no obvious cause. When logs stop correlating across services. When requests randomly time out and the CPU looks fine. When retries, added to improve resilience, make the outage worse. When memory grows slowly for six days and then the pod restarts. When one slow downstream call starts exhausting the entire thread pool.

That’s where senior interviews become interesting.

Because production Spring Boot is no longer about annotations. It’s about system behavior under pressure.

The Questions That Actually Reveal Seniority

The best Spring Boot interview questions are no longer “what does this annotation do?”

The better questions are:

  • Why did your interceptor suddenly stop working in async flows?
  • Why are requests timing out even though CPU is low?
  • Why does @Transactional behave differently inside async calls?
  • Why did adding retries make the outage significantly worse?
  • Why is memory growing even though GC is running normally?

These questions reveal whether someone has actually operated Spring Boot systems under production load — or whether they’ve only built them.

Let’s go through five of them.

Question 1: Your Spring Boot Service Has High Latency. CPU Is Fine. What Do You Check First?

Observations:
- p99 latency: climbing steadily
- CPU utilization: 28%
- Error rate: near zero
- Thread pool size: 200 threads
- Downstream API latency: recently changed from 20ms → 2s

This is where most investigations go wrong.

Low CPU tricks teams into thinking the application is healthy. Meanwhile, threads are alive — they’re just not making progress.

Most developers immediately reach for: garbage collection analysis, database query optimization, Kubernetes scaling.

Senior engineers check thread states first.

In IO-heavy Spring Boot systems — which is most of them — threads spend the majority of their time waiting. Waiting for database connections. Waiting for HTTP responses. Waiting behind a lock. Waiting for an overloaded connection pool to hand out a slot.
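A quick way to see this from inside the JVM is to count threads by state. The sketch below uses only the JDK; the class name `ThreadStateSnapshot` is made up for illustration, and in a real incident a full thread dump (for example via `jstack`) gives more detail.

```java
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateSnapshot {

    // Counts live JVM threads by Thread.State (RUNNABLE, WAITING, TIMED_WAITING, BLOCKED, ...).
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            counts.merge(thread.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints a map of state to thread count for the current JVM.
        System.out.println(countByState());
    }
}
```

One caveat: a thread stuck in a blocking socket read is usually reported as RUNNABLE, not WAITING, so the counts are only a first hint. The stack frames in the dump are what confirm that a thread is parked on a downstream call or a connection pool.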

Here’s the production scenario that plays out:

The Tomcat thread pool is configured at 200 threads. The downstream API that was responding in 20ms starts taking 2 seconds. Now every request thread sits blocked for 2 seconds while waiting for that response. New requests arrive and join the queue. Within minutes, all 200 threads are blocked on downstream calls. The queue builds. Latency explodes. CPU stays comfortable because the threads aren’t computing; they’re waiting.

The service is failing without looking busy. That’s the trap.
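The arithmetic behind that scenario is worth spelling out. A rough sketch, assuming every request holds one thread for the full downstream call (Little's law); the 500 req/sec arrival rate is a made-up figure for illustration, not a number from the scenario.

```java
final class ThreadPoolCapacity {

    // Upper bound on sustained throughput when each request occupies one
    // thread for the whole downstream call: threads / latency.
    static double maxRequestsPerSecond(int threads, long latencyMillis) {
        return threads * 1000.0 / latencyMillis;
    }

    // Threads needed to keep up with an arrival rate (Little's law: L = lambda * W).
    static long threadsNeeded(double requestsPerSecond, long latencyMillis) {
        return (long) Math.ceil(requestsPerSecond * latencyMillis / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println(maxRequestsPerSecond(200, 20));   // ceiling at 20ms latency
        System.out.println(maxRequestsPerSecond(200, 2000)); // ceiling at 2s latency
        System.out.println(threadsNeeded(500, 2000));        // hypothetical 500 req/sec
    }
}
```

With 200 threads, the ceiling drops from 10,000 req/sec at 20ms to 100 req/sec at 2s. Any arrival rate above that second number fills the pool and then the queue, no matter how idle the CPU looks.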

This is where senior interview discussions move from “do you know Spring Boot?” to “do you understand backpressure?”

Strong answers don’t stop at “scale more pods.” They discuss: taking thread dumps to confirm thread state, identifying connection pool exhaustion, applying bulkhead patterns to isolate downstream calls, configuring explicit timeouts so slow calls fail fast instead of holding threads indefinitely, using circuit breakers to stop sending traffic to already-slow dependencies, and async boundaries to prevent blocking threads from exhausting the entire pool.

The question is not whether you can name Resilience4j. The question is whether you can reason through why your thread pool is draining and what to do about it before the incident escalates.
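To make the bulkhead idea concrete, here is a minimal hand-rolled sketch using a JDK `Semaphore`. It is a stand-in for illustration only, not Resilience4j's API; the class name `SimpleBulkhead` and its parameters are assumptions.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class SimpleBulkhead {
    private final Semaphore permits;
    private final long maxWaitMillis;

    SimpleBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Runs the call if a permit frees up within maxWaitMillis; otherwise returns the fallback.
    <T> T call(Supplier<T> downstreamCall, Supplier<T> fallback) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        }
        if (!acquired) {
            return fallback.get(); // fail fast instead of parking another request thread
        }
        try {
            return downstreamCall.get();
        } finally {
            permits.release();
        }
    }
}
```

The cap means a slow dependency can hold at most `maxConcurrentCalls` request threads; the rest get the fallback immediately. It does not bound how long an admitted call runs, so explicit connect and read timeouts on the HTTP client are still needed alongside it.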

Question 2: Why Did Your Interceptor Suddenly Stop Working?

The interceptor works perfectly locally. It works in synchronous flows. It passes every integration test.

Then in production, async flows break:

@Component
public class CorrelationInterceptor implements HandlerInterceptor {
    @Override
    public boolean preHandle(HttpServletRequest request, ...) {
        String correlationId = request.getHeader("X-Correlation-ID");
        MDC.put("correlationId", correlationId); // stored in ThreadLocal
        return true;
    }
}

@Service
public class ReportService {
    @Async
    public CompletableFuture<Report> generateReport() {
        // MDC.get("correlationId") returns null here
        log.info("Generating report..."); // correlation ID: missing
    }
}

Correlation IDs disappear from logs. Authentication context becomes null in async operations. Distributed tracing breaks for any flow that crosses a thread boundary. The interceptor appears to work — until it doesn’t.

The issue is usually not the interceptor. It’s ThreadLocal propagation.

MDC uses ThreadLocal storage. ThreadLocals are thread-scoped. An @Async method runs on a different thread, taken from Spring’s task executor pool, and that thread starts with a fresh, empty context. The correlation ID set by the interceptor never existed on that thread.

This is not a Spring Boot bug. It’s an expected consequence of how thread-local storage works — one that the framework cannot hide from you when you cross thread boundaries.

Senior engineers understand: frameworks abstract behavior. They do not eliminate concurrency boundaries.

Strong answers discuss: implementing a TaskDecorator to copy MDC context when submitting tasks to async executors, using distributed tracing libraries (Micrometer Tracing, OpenTelemetry) that handle context propagation explicitly, understanding the difference between ThreadLocal context (doesn’t propagate) and Reactor Context (designed for propagation in reactive systems), and configuring Spring’s DelegatingSecurityContextAsyncTaskExecutor for security context propagation.
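The decorator idea can be sketched without Spring. In the example below a plain `ThreadLocal<String>` stands in for MDC, and `withContext` plays the role a `TaskDecorator` would play on a Spring executor. All names (`ContextPropagation`, `CORRELATION_ID`, `idSeenByWorker`, the ID `req-123`) are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

final class ContextPropagation {
    // Stand-in for MDC: a plain ThreadLocal holding the correlation ID.
    static final ThreadLocal<String> CORRELATION_ID = new ThreadLocal<>();

    // Captures the caller's value at wrap time and restores it on the worker thread.
    static Runnable withContext(Runnable task) {
        String captured = CORRELATION_ID.get(); // read on the submitting thread
        return () -> {
            String previous = CORRELATION_ID.get();
            CORRELATION_ID.set(captured);
            try {
                task.run();
            } finally {
                // Put the worker back as it was so pooled threads do not leak context.
                if (previous == null) CORRELATION_ID.remove(); else CORRELATION_ID.set(previous);
            }
        };
    }

    // Demo helper: runs a task on a worker thread and reports the ID it saw there.
    static String idSeenByWorker(boolean propagate) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            CORRELATION_ID.set("req-123");
            Runnable task = () -> seen.set(CORRELATION_ID.get());
            pool.submit(propagate ? withContext(task) : task).get();
            return seen.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            CORRELATION_ID.remove();
            pool.shutdown();
        }
    }
}
```

Without the wrapper the worker sees nothing; with it, the worker sees the caller's ID. With real MDC the same shape would typically copy the whole context map rather than a single value.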

This is execution model knowledge. Not annotation knowledge.

Question 3: Why Did @Transactional Stop Working?

You’ve seen this code before. It looks correct. The annotation is present. No exceptions are thrown. And yet, in production, when something fails mid-operation, the database isn’t rolling back.

@Service
public class PaymentService {
    public void trigger() {
        processPayment(); // calling own method
    }

    @Transactional
    public void processPayment() {
        chargeCard();
        updateInventory(); // fails here
        sendConfirmation(); // never runs
        // but chargeCard already committed: no rollback happens
    }
}

The annotation exists. The transaction does not.

Spring transactions work through proxy objects. When another bean calls processPayment(), the call routes through Spring’s proxy, which starts a transaction. But when trigger() calls processPayment() inside the same class instance, it calls directly on this — bypassing the proxy entirely. No interception. No transaction.
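The bypass is easy to reproduce with a plain JDK dynamic proxy. This is a sketch of the mechanism only: the handler just records which calls it intercepts, which is where a transaction would begin. The names `SelfInvocationDemo`, `PaymentOps`, and `interceptedCalls` are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class SelfInvocationDemo {
    interface PaymentOps {
        void trigger();
        void processPayment();
    }

    static final class PaymentService implements PaymentOps {
        public void trigger() { processPayment(); } // this.processPayment(): no proxy involved
        public void processPayment() { /* business logic would go here */ }
    }

    // Returns the names of the methods the proxy intercepted for one external call.
    static List<String> interceptedCalls(boolean viaTrigger) {
        List<String> intercepted = new ArrayList<>();
        PaymentOps target = new PaymentService();
        InvocationHandler handler = (proxy, method, args) -> {
            intercepted.add(method.getName()); // a transaction would begin here
            return method.invoke(target, args);
        };
        PaymentOps proxied = (PaymentOps) Proxy.newProxyInstance(
                PaymentOps.class.getClassLoader(),
                new Class<?>[] {PaymentOps.class},
                handler);
        if (viaTrigger) proxied.trigger(); else proxied.processPayment();
        return intercepted;
    }
}
```

Calling `processPayment()` through the proxy is intercepted. Calling `trigger()` through the proxy intercepts only `trigger`; the inner `processPayment()` call runs on the target object and the handler never sees it.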

The dangerous part: the code compiles cleanly, no warnings appear, and under normal conditions (when operations succeed) everything behaves correctly. The transaction is always missing, but you only notice when you need it: during failure. You discover it in production when a partial write leaves inconsistent data with no exception to explain why.

Spring annotations are often proxy behavior disguised as magic. When the proxy is bypassed, the magic disappears silently.

Strong answers cover: why this happens (JDK dynamic proxy vs CGLIB — both are bypassed by self-invocation), the correct fixes (inject self, restructure into separate bean, use AopContext.currentProxy()), and related traps that follow the same pattern: @Async called from within the same class runs synchronously, @Cacheable called from within the same class never caches.

The production risk that makes this truly dangerous: the system appears transactional while silently not being transactional. Intermittent data corruption with no error logs is among the hardest production bugs to diagnose.

Question 4: You Added Retries. Eight Minutes Later the Entire System Went Down. Why?

This is one of the most realistic production questions in senior Spring Boot interviews. It has happened to real teams, it starts from a completely reasonable decision, and the failure mode is non-obvious until you think through the math.

Scenario:

A downstream API starts timing out. The team adds retry configuration — three retries, exponential backoff, sounds responsible. Traffic is running at 20,000 requests per second.

Eight minutes later, the downstream service collapses completely.

The retry storm:

Normal traffic:            20,000 req/sec to downstream
Each request retries 3x:   20,000 × 3 = 60,000 additional req/sec (worst case: every attempt fails)
Total downstream load:     80,000 req/sec

The downstream was already struggling at 20,000. It just received four times the load.
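The same math as a small sketch, including two cases the numbers above do not show: partial failure and retries stacked across layers. The independence assumption in `expectedAttempts` is a simplification, and all names are illustrative.

```java
final class RetryAmplification {

    // Worst case: every attempt fails, so each request is sent 1 + maxRetries times.
    static long worstCaseLoad(long baseRequestsPerSecond, int maxRetries) {
        return baseRequestsPerSecond * (1 + maxRetries);
    }

    // Expected attempts per request when each attempt fails independently with
    // probability p: 1 + p + p^2 + ... + p^maxRetries.
    static double expectedAttempts(double failureProbability, int maxRetries) {
        double attempts = 0.0;
        double term = 1.0;
        for (int i = 0; i <= maxRetries; i++) {
            attempts += term;
            term *= failureProbability;
        }
        return attempts;
    }

    // Retries configured independently at several call layers multiply each other.
    static long layeredWorstCase(long baseRequestsPerSecond, int maxRetriesPerLayer, int layers) {
        long load = baseRequestsPerSecond;
        for (int i = 0; i < layers; i++) {
            load *= (1 + maxRetriesPerLayer);
        }
        return load;
    }
}
```

`worstCaseLoad(20_000, 3)` gives the 80,000 req/sec above. Two layers that each retry three times turn the same traffic into 320,000 req/sec in the worst case. Backoff spreads those attempts over time but does not reduce how many there are, so under sustained traffic the multiplier still applies.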

The retry mechanism meant to improve resilience became a traffic amplifier. The service that was degraded is now down. The service that added retries is now also failing because its retries aren’t succeeding. Systems that depended on both are now failing too.

Retries are not resilience. Retries without backpressure are a threat multiplier.

This pattern appears constantly in Spring Boot systems through RestTemplate retry configuration, Feign client retries, Resilience4j retry policies, and custom retry wrappers — often added independently by different teams who each made a locally reasonable decision.

Strong answers go beyond “add exponential backoff” to discuss: retry budgets (global limit on what percentage of traffic can be retries), circuit breakers that open before retries compound the problem, jitter to prevent synchronized retry waves, bulkheads that prevent retry pressure from one downstream contaminating thread capacity for others, and coordination — making sure retries are visible across service boundaries, not just locally configured.
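Two of those ideas, jitter and a retry budget, fit in a short sketch. This is a simplified illustration, not any library's implementation: the budget uses lifetime counters where a production version would more likely use a sliding window, and the names `RetryPolicySketch`, `RetryBudget`, and `maxRetryPercent` are assumptions.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

final class RetryPolicySketch {

    // Full jitter: a random delay in [0, min(cap, base * 2^attempt)].
    // Assumes non-negative inputs and a modest baseMillis.
    static long backoffWithJitterMillis(int attempt, long baseMillis, long capMillis) {
        long exponential = baseMillis << Math.min(attempt, 20); // clamp the shift to avoid overflow
        long ceiling = Math.min(capMillis, exponential);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    // Retry budget: retries are allowed only while they stay at or below
    // maxRetryPercent of the first attempts recorded so far.
    static final class RetryBudget {
        private final int maxRetryPercent;
        private final AtomicLong requests = new AtomicLong();
        private final AtomicLong retries = new AtomicLong();

        RetryBudget(int maxRetryPercent) {
            this.maxRetryPercent = maxRetryPercent;
        }

        void recordRequest() {
            requests.incrementAndGet();
        }

        // Returns true and counts the retry if the budget allows it.
        // The check and increment are not atomic together, so the cap is approximate under contention.
        boolean tryAcquireRetry() {
            if ((retries.get() + 1) * 100 > requests.get() * maxRetryPercent) {
                return false;
            }
            retries.incrementAndGet();
            return true;
        }
    }
}
```

With a 10% budget, a caller seeing 100% failures adds at most about 10% extra load instead of 300%. The jitter keeps the retries that are allowed from arriving in synchronized waves.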

Senior engineers think about system recovery behavior under failure. Not just happy-path reliability.

Question 5: Why Is Memory Usage Growing Even Though GC Is Running Normally?

This is the slow-burn production problem that’s hardest to diagnose. No crash. No obvious leak. GC runs. Heap analysis looks normal-ish. And yet, every week, memory climbs until the pod is restarted.

Week 1: JVM heap usage ~40% - stable
Week 2: JVM heap usage ~55% - "probably nothing"
Week 3: JVM heap usage ~70% - "maybe we should look at this"
Week 4: JVM heap usage ~90% - OOMKilled

The problem is usually not a traditional memory leak. It’s object retention.

Common production causes in Spring Boot applications:

Unbounded caches: @Cacheable with no eviction policy. Every unique cache key adds an entry. In a system with high cardinality (user IDs, session IDs, query parameters), the cache grows without bound.

ThreadLocal leaks: Request-scoped data stored in ThreadLocal without cleanup. Tomcat reuses threads, so if a ThreadLocal isn’t cleared after the request, the next request on that thread inherits stale data, and the object stays reachable for as long as the pooled thread lives.

Micrometer tag cardinality explosion: A metric tagged with unbounded values (user ID, URL path with path variables, error messages) creates a new Meter object for each unique tag combination. At scale, this is thousands of retained objects per minute.

Hibernate first-level cache in long-running sessions: A batch process or scheduled job that runs a long transaction accumulates every loaded entity in the session cache. For large datasets, that cache can end up consuming most of the heap.

Large CompletableFuture chains: Long chains of async operations that hold references to intermediate results, preventing collection until the entire chain completes.
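For the unbounded-cache case, the fix is always some form of size or time bound. A minimal JDK-only sketch of a size-bounded LRU map follows; the class name `BoundedLruCache` is hypothetical, the map is not thread-safe, and in a Spring Boot service the bound would more likely come from the cache provider's configuration than from hand-written code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true keeps entries in least-recently-used order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the cap is exceeded.
        return size() > maxEntries;
    }
}
```

With a cap of 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`. However many distinct keys arrive, the map never holds more than `maxEntries` entries.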

The diagnosis shift: stop asking “is GC working?” and start asking “why are objects surviving GC?”

GC collects unreachable objects. If objects are reachable — even through a cache, a ThreadLocal, a static map, or a Micrometer registry — they survive collection. The JVM is doing its job. The application is holding references it should have released.

Senior engineers analyze object retention, not just collection rates. That means heap dumps with dominator tree analysis, reviewing cache eviction configuration, auditing ThreadLocal usage for cleanup, and understanding which Spring Boot components maintain long-lived references to application data.
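The ThreadLocal audit comes down to one question: is there a `remove()` in a `finally` block? The sketch below shows the difference on a one-thread pool standing in for a reused Tomcat worker. `ThreadLocalCleanup`, `CURRENT_USER`, and the value `alice` are made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCleanup {
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Handles one "request"; cleanup controls whether the ThreadLocal is cleared afterwards.
    static void handleRequest(String user, boolean cleanup) {
        CURRENT_USER.set(user);
        try {
            // request processing would read CURRENT_USER here
        } finally {
            if (cleanup) {
                CURRENT_USER.remove();
            }
        }
    }

    // Runs a request on a one-thread pool, then reports what the next task on that thread sees.
    static String valueSeenByNextTask(boolean cleanup) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable request = () -> handleRequest("alice", cleanup);
            Callable<String> nextTask = CURRENT_USER::get;
            pool.submit(request).get();
            return pool.submit(nextTask).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without cleanup, the next task on the same thread still sees `alice`, and the value stays reachable while the thread lives. With cleanup, it sees `null`.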

The Pattern Across All Five Questions

Notice what these questions have in common.

None of them test whether you can build a Spring Boot application. None of them care about starter dependencies or auto-configuration internals.

Every question tests the same thing: do you understand what your application is doing when production stops cooperating?

Mid-level interviews focus on: annotations, starters, dependency injection, configuration, happy-path behavior.

Senior interviews focus on: failure modes, runtime behavior under load, concurrency boundaries, observability when things go wrong, degradation patterns, recovery behavior.

Because production systems rarely fail in obvious ways. The dangerous failures are silent:

  • Threads blocked waiting, while CPU reports health
  • Transactions annotated but not applied
  • Context propagated in tests but dropped in production
  • Retries that amplify instead of recover
  • Memory retained instead of collected

The application technically works. Operationally, it’s quietly collapsing.

What This Means

Most developers learn Spring Boot as a framework — a collection of annotations and starters that make building APIs faster.

Senior engineers eventually learn it as a runtime system — something that has execution boundaries, thread models, proxy behaviors, and failure modes that become visible only when the system is under pressure.

The hard part is no longer “what annotation should I use?” The hard part is:

  • What happens under sustained load?
  • What happens when one dependency degrades?
  • What happens across thread boundaries?
  • What happens during retry storms?
  • What happens after six hours of continuous traffic?

Because production Spring Boot is not about features. It’s about behavior. And that’s the part most tutorials never teach.

Prepare for These Conversations

If you’re preparing for senior Spring Boot interviews — or want to practice being on either side of these questions with real engineers — PracHub is worth checking out. It’s built specifically to make technical interviews more transparent, with structured practice for exactly the kind of system behavior and production reasoning questions covered above.

Part 2 of a series on Spring Boot production behavior and senior interview preparation. Part 1 covers @Transactional proxy traps, circular dependencies, auto-configuration internals, and the difference between ApplicationContext and BeanFactory.


Most Spring Boot Developers Can’t Answer These Production Questions. Can You? (Part 2) was originally published in Javarevisited on Medium, where people are continuing the conversation by highlighting and responding to this story.
