Java Garbage Collectors: G1 vs ZGC vs Shenandoah

It is the nightmare scenario for any backend engineer: at 3:00 AM, the pager alerts you to a massive spike in API timeouts. The database is healthy and the network is stable, but CPU usage is erratic. The culprit? A "Stop-the-World" (STW) Garbage Collection pause that froze your application for five seconds to clean up the heap.

Automatic memory management is Java's greatest convenience and, historically, its most unpredictable bottleneck. For years, developers had to over-provision hardware or perform complex tuning gymnastics to keep the Parallel or CMS collectors in check. However, the landscape has shifted dramatically with the Long-Term Support (LTS) releases of JDK 17 and JDK 21. We now have distinct, production-ready choices—G1, ZGC, and Shenandoah—that solve specific architectural problems.

This article dismantles the trade-offs between throughput and latency to help you select the precise collector for your microservices, moving beyond default settings to architectural intent.

Figure 1: Modern Java Garbage Collectors

The Core Trade-off: Latency vs. Throughput

Before comparing the collectors, we must define the battlefield. In JVM performance tuning, there is an "Impossible Trinity" between three competing goals: low memory footprint, high throughput, and low latency. You generally get to pick two.

Throughput is the percentage of total time the JVM spends executing your application logic (mutator threads) versus time spent performing GC. If your app runs for 100 seconds and spends 1 second in GC, your throughput is 99%.
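As a rough sketch, this ratio can be estimated from inside a running JVM with the standard java.lang.management beans. The class name GcThroughput and its helper methods are made up for illustration. For concurrent collectors the reported collection time may include concurrent work rather than only pauses, so treat the result as an approximation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcThroughput {
    /** Share of wall-clock time not spent in GC, as a percentage. */
    static double throughputPercent(long totalMillis, long gcMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return 100.0 * (totalMillis - gcMillis) / totalMillis;
    }

    /** Sums the accumulated collection time reported by every collector bean. */
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = bean.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                sum += t;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // The example from the text: 100 s of runtime, 1 s of GC.
        System.out.println(throughputPercent(100_000, 1_000)); // 99.0

        long uptime = Math.max(ManagementFactory.getRuntimeMXBean().getUptime(), 1);
        System.out.println("This JVM so far: " + throughputPercent(uptime, totalGcMillis()) + "%");
    }
}
```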

Latency refers to the responsiveness of the application, specifically the duration of the pauses induced by the GC. A collector might have high throughput (cleaning infrequently) but suffer from high latency (cleaning everything at once, causing a 2-second freeze).

In the context of modern microservices, Tail Latency (p99) often matters significantly more than average throughput. In a distributed system where a single user request might fan out to ten distinct microservices, a high p99 latency in one service can cause cascading timeouts across the entire mesh.
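The fan-out effect can be made concrete with a little arithmetic. The sketch below assumes the ten downstream calls are independent and that each one exceeds its own p99 latency 1% of the time. Both are simplifying assumptions, and the class and method names are illustrative only.

```java
final class FanOutTailLatency {
    /** Probability that at least one of n independent calls lands in the slow tail. */
    static double probabilityAnySlow(int calls, double slowFraction) {
        return 1.0 - Math.pow(1.0 - slowFraction, calls);
    }

    public static void main(String[] args) {
        // Ten downstream calls, each slower than its p99 one time in a hundred.
        double p = probabilityAnySlow(10, 0.01);
        System.out.println("Share of requests hitting at least one slow call: " + p);
    }
}
```

Under these assumptions roughly 9.6% of user requests see at least one slow call, so a "1 in 100" GC pause in each service becomes a nearly "1 in 10" problem for the caller.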

G1 GC: The Balanced Default

Since JDK 9, the Garbage First (G1) collector has been the default, and for good reason. It is designed to be the "Jack of all trades," offering a balance between throughput and latency.

How it Works

Unlike older collectors that physically separated the heap into contiguous generations, G1 partitions the heap into a set of equal-sized "regions." While it still maintains the logical concept of Eden, Survivor, and Old generations, these are physically scattered sets of regions.
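To give a feel for region sizes, here is a simplified approximation of how G1 picks one when -XX:G1HeapRegionSize is not set: it aims for roughly 2048 regions, uses a power-of-two size, and keeps the ergonomic choice between 1MB and 32MB. The exact rounding and inputs differ between JDK versions, so this is a sketch rather than HotSpot's actual code, and the class and method names are invented for the example.

```java
final class G1RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long TARGET_REGION_COUNT = 2048;      // G1 aims for roughly this many regions
    static final long MIN_REGION = 1 * MB;
    static final long MAX_ERGONOMIC_REGION = 32 * MB;

    /** Simplified: heap / 2048, rounded down to a power of two, clamped to [1MB, 32MB]. */
    static long approximateRegionSize(long maxHeapBytes) {
        long raw = Math.max(maxHeapBytes / TARGET_REGION_COUNT, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw);      // largest power of two <= raw
        return Math.min(powerOfTwo, MAX_ERGONOMIC_REGION);
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 8, 64};
        for (long gb : heapsInGb) {
            long region = approximateRegionSize(gb * 1024 * MB);
            System.out.println(gb + "GB heap -> about " + region / MB + "MB regions");
        }
    }
}
```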

The "Garbage First" name stems from its compaction strategy: it tracks the liveness of objects in each region and prioritizes cleaning the regions that contain the most garbage. Crucially, G1 allows you to set a Predictive Pause Target:

-XX:+UseG1GC -XX:MaxGCPauseMillis=200

G1 treats this 200ms value (which is also its default) as a soft goal rather than a guarantee, and tries to meet it by adjusting how many regions it collects in each pause.
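The idea of "most garbage first, within a pause budget" can be illustrated with a toy model. This is not G1's real algorithm, which relies on runtime statistics and cost predictions inside HotSpot. The Region record, its fields, and the numbers are all hypothetical, and the code needs Java 17 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GarbageFirstSketch {
    /** A region with its reclaimable bytes and a predicted cost to evacuate it. */
    record Region(int id, long garbageBytes, double predictedMillis) {}

    /** Greedily picks the regions with the most garbage until the pause budget is used up. */
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong(Region::garbageBytes).reversed());

        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.predictedMillis() > pauseBudgetMillis) {
                break; // adding this region would exceed the pause target
            }
            chosen.add(r);
            spent += r.predictedMillis();
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region(1, 900_000, 80),
                new Region(2, 100_000, 90),
                new Region(3, 700_000, 70),
                new Region(4, 400_000, 60));
        // With a 200 ms budget, regions 1 (80 ms) and 3 (70 ms) fit; region 4 would not.
        for (Region r : chooseCollectionSet(regions, 200)) {
            System.out.println("collect region " + r.id());
        }
    }
}
```

Lowering the budget shrinks the set of regions collected per pause, which is the lever the pause target gives you: shorter pauses, but less garbage reclaimed each time.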

Strengths and Weaknesses

G1 is mature, stable, and handles mixed workloads exceptionally well. It prevents full heap fragmentation by compacting regions as it goes. However, it is not a silver bullet. The pause times are not strictly constant; they generally rise as the heap size increases. If you push the heap beyond 32GB or have extremely high allocation rates, G1 may fail to keep up and fall back to a full STW collection of the entire heap. That fallback was single-threaded until JDK 10 made it parallel, but it can still pause the application for seconds.

Best For

G1 is the ideal choice for general-purpose applications with heaps ranging from 4GB to 16GB, where occasional pauses of 200ms to 500ms are acceptable and do not breach SLAs.

ZGC: The Scalable Low-Latency Beast

The Z Garbage Collector (ZGC) represents a paradigm shift. Originally an experimental feature, it became production-ready in JDK 15 and received a massive upgrade in JDK 21 with "Generational ZGC."

Architecture Highlights

ZGC is designed for one thing: consistent, sub-millisecond pause times, regardless of heap size. Whether your heap is 10GB or 10TB, ZGC pauses should remain under 1ms.

It achieves this via two complex mechanisms:

  1. Colored Pointers: ZGC uses metadata bits within the 64-bit object reference pointers themselves to track object states (marked, relocated, etc.).
  2. Load Barriers: Instead of stopping the world to fix references when moving objects, ZGC injects a small logic check (barrier) every time your code reads an object reference. If the object has moved, the barrier fixes the reference immediately ("self-healing").
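The two mechanisms can be illustrated together with a toy simulation. It is only a mental model: references are plain long values, the bit layout is invented for the example, and the "heap" is an array plus a forwarding map. Real ZGC implements this in JIT-compiled machine code with a different layout.

```java
import java.util.HashMap;
import java.util.Map;

final class ColoredPointerSketch {
    // Hypothetical layout for this toy: the low 44 bits hold the address,
    // bit 44 is a "remapped" color meaning the reference is known to be current.
    static final long ADDRESS_MASK = (1L << 44) - 1;
    static final long REMAPPED_BIT = 1L << 44;

    // Old address -> new address for objects the collector has relocated.
    final Map<Long, Long> forwarding = new HashMap<>();
    // Stand-in for reference fields on the heap.
    final long[] slots;

    ColoredPointerSketch(int slotCount) {
        this.slots = new long[slotCount];
    }

    /** Load barrier: runs on every reference read and returns a usable address. */
    long loadReference(int slot) {
        long ref = slots[slot];
        if ((ref & REMAPPED_BIT) != 0) {
            return ref & ADDRESS_MASK;               // fast path: the color is good
        }
        long oldAddress = ref & ADDRESS_MASK;        // slow path: reference may be stale
        long current = forwarding.getOrDefault(oldAddress, oldAddress);
        slots[slot] = current | REMAPPED_BIT;        // self-heal the slot for later reads
        return current;
    }

    public static void main(String[] args) {
        ColoredPointerSketch heap = new ColoredPointerSketch(1);
        heap.slots[0] = 0x1000L;                     // stale, uncolored reference
        heap.forwarding.put(0x1000L, 0x2000L);       // the object was moved concurrently
        System.out.println(Long.toHexString(heap.loadReference(0))); // 2000
        System.out.println((heap.slots[0] & REMAPPED_BIT) != 0);     // true
    }
}
```

The first read pays for the lookup and repairs the slot; every later read of the same slot takes the fast path. That is why the cost shows up as a small, steady tax on reads instead of a pause.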

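On JDK 21 and 22, Generational ZGC is enabled with -XX:+UseZGC -XX:+ZGenerational; it became the default ZGC mode in JDK 23, and the non-generational mode was removed in JDK 24. Separating young and old objects drastically improves CPU efficiency compared to the original non-generational ZGC.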

Performance Profile

The trade-off here is CPU usage. Because of the load barriers and concurrent background threads, ZGC has a higher throughput overhead than G1. Your application might run slightly slower (lower operations per second) in exchange for pauses short enough that it never appears to freeze.

Shenandoah: The Concurrent Compactor

Shenandoah, originally developed by Red Hat, shares the same primary goal as ZGC: ultra-low latency via concurrent compaction. However, its implementation differs significantly.

Under the Hood

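Like G1, Shenandoah is region-based. Like ZGC, it performs the heavy lifting of evacuation (moving objects) while your Java threads are still running. To manage this safely, Shenandoah historically used Brooks Pointers: every object carried an extra "forwarding pointer" word next to its header. That pointer refers to the object itself until the object is moved, after which the old copy points to the new location, so any thread holding a stale reference is redirected. Since JDK 13, Shenandoah stores the forwarding pointer in the existing object header instead of a dedicated word, which removes the per-object memory overhead.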
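The forwarding-pointer idea can be sketched with ordinary Java objects. This toy model only mirrors the concept: the Obj class and its forwardee field are invented for illustration, and a real collector installs the pointer atomically and does all of this below the level of Java code.

```java
final class BrooksPointerSketch {
    /** Toy object with an extra forwarding slot, as in the original Brooks-pointer design. */
    static final class Obj {
        Obj forwardee = this; // points at itself until the object is evacuated
        int payload;

        Obj(int payload) {
            this.payload = payload;
        }
    }

    /** Every access first follows the forwarding pointer to the current copy. */
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    /** Copies the object and redirects the old copy to the new one. */
    static Obj evacuate(Obj old) {
        Obj copy = new Obj(old.payload);
        old.forwardee = copy; // a real collector would install this atomically (CAS)
        return copy;
    }

    public static void main(String[] args) {
        Obj original = new Obj(42);
        Obj staleRef = original;                          // a reference held by application code
        Obj moved = evacuate(original);
        System.out.println(resolve(staleRef) == moved);   // true
        System.out.println(resolve(staleRef).payload);    // 42
    }
}
```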

To enable Shenandoah (on a JDK build that includes it; Oracle's own builds do not):

-XX:+UseShenandoahGC
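Since a collector flag is easy to mistype or to lose in a container entrypoint, it can help to confirm at startup which collector the JVM actually selected. A minimal sketch using the standard management API follows. The class name is illustrative, and the exact bean names vary by collector and JDK version (with Shenandoah they typically contain the word "Shenandoah").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** Names of the collector beans registered by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with the flags under test, e.g. java -XX:+UseShenandoahGC ActiveCollectors.java
        collectorNames().forEach(System.out::println);
    }
}
```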

ZGC vs. Shenandoah

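While both strive for sub-millisecond pauses, they have different historical strengths. Shenandoah was often favored for small-to-medium heaps: it supports compressed object pointers on heaps below roughly 32GB, whereas ZGC's colored pointers rule out compressed references, so every reference costs a full 64 bits. However, with recent updates, both collectors are converging in capability. Their barrier designs still produce different CPU load profiles: early Shenandoah releases paired read barriers with heavy write barriers, while since JDK 13 it relies primarily on a load-reference barrier; ZGC relies on load barriers, and Generational ZGC adds store barriers.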

Decision Matrix: Which GC for Microservices?

Choosing a GC isn't just about heap size; it's about your application's role in the architecture and the constraints of your container environment (Kubernetes).

The Container Constraint

Be aware of native memory usage. Both ZGC and Shenandoah require off-heap memory structures to manage their concurrent state. If you are running a pod with limit: 2Gi, the overhead of a concurrent collector might trigger an OOMKill where G1 would have survived.
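A quick way to see how much room is left is to print what the JVM itself believes about its environment. The sketch below uses only standard APIs. The 2Gi limit is hard-coded to match the example above, and nonHeapBudgetBytes is a hypothetical helper. The non-heap figure reported by the memory bean does not cover every native allocation made by the collector, so Native Memory Tracking (-XX:NativeMemoryTracking=summary with jcmd VM.native_memory) gives a fuller picture.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class ContainerView {
    static final long MIB = 1024L * 1024L;

    /** Memory left for everything outside the Java heap under a given container limit. */
    static long nonHeapBudgetBytes(long containerLimitBytes, long maxHeapBytes) {
        return containerLimitBytes - maxHeapBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxHeap = rt.maxMemory();
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();

        System.out.println("CPUs visible to the JVM: " + rt.availableProcessors());
        System.out.println("Max heap (MiB): " + maxHeap / MIB);
        System.out.println("Non-heap in use (MiB): " + nonHeap.getUsed() / MIB);

        long limit = 2048 * MIB; // assumed pod limit of 2Gi, as in the text
        System.out.println("Left outside the heap (MiB): "
                + nonHeapBudgetBytes(limit, maxHeap) / MIB);
    }
}
```

If the last number is small or negative, the heap setting leaves too little for metaspace, thread stacks, and the collector's own structures, whichever collector you choose.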

Scenario Selection

  1. High-Throughput Batch Processing: If you are processing massive CSVs or ETL jobs where response time doesn't matter, stick with G1 or even the Parallel GC. You want raw throughput, not low latency.
  2. User-Facing REST APIs: For services backing a UI or synchronous microservice chains requiring strict SLAs (e.g., <50ms response), ZGC (JDK 21+) or Shenandoah are the game changers. The CPU cost is worth eliminating tail latency.
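  3. Small Heaps (<2GB): For small microservices, the overhead of concurrent collectors (barriers and extra threads) often outweighs the benefits. G1 is usually the safest and most efficient choice here. Set -XX:+UseG1GC explicitly, though: when the JVM sees fewer than two CPUs or less than about 1792MB of memory, HotSpot's ergonomics select the Serial collector instead of G1.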
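The three scenarios above can be written down as a small lookup, which is sometimes handy for documenting a team's defaults. This is only an encoding of the rules of thumb in this article, not an official recommendation. The Workload enum, the ordering of the checks, and the choice of Parallel GC for batch work are assumptions of the sketch.

```java
final class CollectorChooser {
    enum Workload { BATCH, LATENCY_SENSITIVE_API, GENERAL_PURPOSE }

    static final long SMALL_HEAP_BYTES = 2L * 1024 * 1024 * 1024; // the <2GB rule of thumb

    /** Maps a workload and heap size to a starting set of JVM flags. */
    static String recommendFlags(Workload workload, long heapBytes) {
        if (heapBytes < SMALL_HEAP_BYTES) {
            return "-XX:+UseG1GC"; // small heaps: concurrent collectors rarely pay off
        }
        switch (workload) {
            case BATCH:
                return "-XX:+UseParallelGC"; // or G1; raw throughput matters most
            case LATENCY_SENSITIVE_API:
                // JDK 21 syntax; -XX:+UseShenandoahGC is the alternative
                return "-XX:+UseZGC -XX:+ZGenerational";
            default:
                return "-XX:+UseG1GC";
        }
    }

    public static void main(String[] args) {
        long eightGb = 8L * 1024 * 1024 * 1024;
        for (Workload w : Workload.values()) {
            System.out.println(w + " -> " + recommendFlags(w, eightGb));
        }
    }
}
```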

Tuning Advice

Don't tune until you measure. Enable GC logging (-Xlog:gc*) or JDK Flight Recorder (JFR) to see actual pause times. If G1 consistently fails to meet MaxGCPauseMillis and those pauses show up in your p99 latency, then evaluate ZGC or Shenandoah.
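Pause times can also be observed in-process with the JFR streaming API (available since JDK 14). The sketch below subscribes to the jdk.GarbageCollection event and flags pauses above a target. The 200 ms threshold mirrors the MaxGCPauseMillis example earlier, the class and helper names are illustrative, and the explicit System.gc() call exists only to provoke an event in a demo.

```java
import java.time.Duration;
import jdk.jfr.consumer.RecordingStream;

final class GcPauseMonitor {
    /** True when a single pause exceeded the configured target. */
    static boolean exceedsTarget(Duration pause, long targetMillis) {
        return pause.compareTo(Duration.ofMillis(targetMillis)) > 0;
    }

    public static void main(String[] args) throws InterruptedException {
        long targetMillis = 200; // mirrors -XX:MaxGCPauseMillis=200
        try (RecordingStream stream = new RecordingStream()) {
            stream.enable("jdk.GarbageCollection");
            stream.onEvent("jdk.GarbageCollection", event -> {
                Object collector = event.getValue("name");
                Duration longest = event.getDuration("longestPause");
                String flag = exceedsTarget(longest, targetMillis) ? "  <-- over target" : "";
                System.out.println(collector + " longest pause: "
                        + longest.toMillis() + " ms" + flag);
            });
            stream.startAsync();
            System.gc();                                     // demo only: provoke a collection
            stream.awaitTermination(Duration.ofSeconds(3));  // give events time to flush
        }
    }
}
```

In production you would more likely start a recording with -XX:StartFlightRecording and inspect it afterwards, but the streaming form is convenient for exporting pause metrics to an existing monitoring system.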

Conclusion

The era of "GC tuning dark arts" is fading. With JDK 21, the decision tree is clearer than ever. G1 remains the robust default that balances performance and footprint for the majority of applications. ZGC and Shenandoah act as specialized tools for latency-sensitive workloads, virtually eliminating long Stop-the-World pauses at the cost of higher CPU and memory consumption.

Remember the hardware reality: Low latency is expensive. Concurrent collectors need CPU cycles for barriers and background GC threads. Ensure your Kubernetes CPU requests and limits leave enough headroom for this overhead.

Final Verdict: Start with the default (G1). Monitor your p99 latency. If—and only if—GC pauses are the bottleneck causing SLA breaches, switch to Generational ZGC or Shenandoah.