Pillar 8 — Networking & APIs

What happens between services — often assumed, rarely known deeply

HTTP & REST Depth Light

HTTP

HTTP/1.1 Keep-Alive #300

Persistent connections that reuse TCP sockets across multiple request-response cycles, avoiding the overhead of repeated handshakes.

  • Connection: keep-alive header enables TCP reuse across sequential requests
  • Reduces latency by eliminating repeated 3-way handshakes per request
  • Head-of-line blocking: one slow response blocks all queued requests on the same connection
  • Timeout and max-requests settings control when the connection is dropped
  • Pipelining was attempted in HTTP/1.1 but largely abandoned due to ordering constraints
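
On the wire, keep-alive is just headers. A sketch (the hostname and parameter values are illustrative; in HTTP/1.1 persistence is the default, so the `Connection: keep-alive` request header mostly matters for HTTP/1.0 clients):

```http
GET /index.html HTTP/1.1
Host: example.com
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
Connection: keep-alive
Keep-Alive: timeout=5, max=100
```

Here `timeout=5` hints that the server drops the idle connection after 5 seconds, and `max=100` caps how many requests it will serve before closing.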

HTTP/2 Multiplexing #301

Binary framing layer that allows multiple concurrent streams over a single TCP connection, solving HTTP/1.1's head-of-line blocking at the application layer.

  • Single TCP connection carries many interleaved streams — no blocking between requests
  • Binary framing replaces text parsing; HPACK header compression cuts per-request overhead
  • Server push allows preemptive resource delivery before the client requests it
  • Stream prioritization lets clients hint which resources matter most
  • Still suffers from TCP-level head-of-line blocking if a packet is lost
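
With the JDK's built-in client (Java 11+), opting into HTTP/2 is a builder flag — a minimal sketch (class and method names below are my own; the JDK API calls are real). The client multiplexes concurrent requests to the same origin over one connection and falls back to HTTP/1.1 when h2 cannot be negotiated:

```java
import java.net.http.HttpClient;

class Http2ClientSketch {
    // Prefer HTTP/2; the client transparently multiplexes concurrent
    // requests to one origin over a single TCP connection and falls
    // back to HTTP/1.1 if ALPN negotiation fails.
    static HttpClient newHttp2Client() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();
    }
}
```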

HTTP/3 QUIC #302

HTTP over QUIC replaces TCP with a UDP-based transport that eliminates TCP head-of-line blocking and integrates TLS 1.3 into the handshake.

  • Built on QUIC (UDP) — independent streams mean one lost packet does not block others
  • 0-RTT connection establishment for known servers drastically cuts initial latency
  • TLS 1.3 is mandatory and integrated into the transport handshake
  • Connection migration survives IP changes (e.g., Wi-Fi to mobile) via connection IDs

Status Codes — Semantic Correctness #303

Choosing the right HTTP status code communicates intent precisely. Misused codes confuse clients and break caching, retry logic, and monitoring.

  • 2xx success: 200 OK, 201 Created, 204 No Content — match the action performed
  • 3xx redirection: 301 (permanent) vs 302/307 (temporary) — affects caching and SEO
  • 4xx client errors: 400 (bad input), 401 (unauthenticated), 403 (forbidden), 404 (not found), 409 (conflict), 422 (unprocessable)
  • 5xx server errors: 500 (generic), 502 (bad gateway), 503 (unavailable), 504 (gateway timeout)
  • Never return 200 with an error body — clients and proxies rely on status codes for retry/caching logic

Idempotency & Safe Methods #304

Understanding which HTTP methods are safe (no side effects) and idempotent (repeatable with same result) is critical for retry logic and API design.

  • Safe methods (GET, HEAD, OPTIONS) must not modify server state
  • Idempotent methods (GET, PUT, DELETE) produce the same result if called repeatedly
  • POST is neither safe nor idempotent — use idempotency keys for payment/order APIs
  • PATCH is not inherently idempotent — depends on the operation (e.g., "set" vs "increment")
  • Proxies and CDNs rely on these semantics for caching and automatic retries
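
One way to sketch idempotency keys (class and key names here are hypothetical, not a specific library): remember the first result per key, so a retried POST replays the stored outcome instead of repeating the side effect.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class IdempotencyStore {
    private final Map<String, Object> results = new ConcurrentHashMap<>();

    // computeIfAbsent runs the operation at most once per key; a retry
    // with the same Idempotency-Key gets the cached original result.
    Object execute(String idempotencyKey, Supplier<Object> operation) {
        return results.computeIfAbsent(idempotencyKey, k -> operation.get());
    }
}
```

A production version would also persist the stored result (so retries survive restarts) and expire keys after a retention window.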

Content Negotiation #305

Mechanism by which client and server agree on the format, language, and encoding of a response using request headers.

  • Accept header specifies desired media types (e.g., application/json, application/xml)
  • Accept-Language, Accept-Encoding enable locale and compression negotiation
  • Server returns Content-Type and may reply 406 Not Acceptable if it cannot satisfy
  • Quality values (q=) let clients express preference ordering
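
A simplified sketch of q-value selection (not a full RFC 9110 parser — it ignores wildcards like `*/*` and parameter edge cases): pick the supported media type with the highest quality.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class AcceptHeaderNegotiator {
    // Returns the supported media type the client prefers most,
    // or null — in which case the caller should answer 406.
    static String bestMatch(String acceptHeader, List<String> supported) {
        return Arrays.stream(acceptHeader.split(","))
                .map(part -> {
                    String[] pieces = part.trim().split(";");
                    double q = 1.0; // quality defaults to 1.0 when omitted
                    for (String p : pieces) {
                        String s = p.trim();
                        if (s.startsWith("q=")) q = Double.parseDouble(s.substring(2));
                    }
                    return Map.entry(pieces[0].trim(), q);
                })
                .filter(e -> supported.contains(e.getKey()))
                .max(Comparator.comparingDouble(e -> e.getValue()))
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}
```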

ETag & Caching Headers #306

HTTP caching headers control when and how responses are stored and revalidated, reducing bandwidth and server load.

  • ETag is a fingerprint of the response — used with If-None-Match for conditional GETs
  • Cache-Control directives: max-age, no-cache, no-store, public, private, must-revalidate
  • Last-Modified / If-Modified-Since provides time-based conditional requests
  • 304 Not Modified saves bandwidth when the cached version is still valid
  • Stale-while-revalidate allows serving cached content while refreshing in the background
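
The conditional-GET flow can be sketched in a few lines (class and method names are illustrative; hashing the body is one common way to produce a strong ETag):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

class EtagSketch {
    // A strong ETag: a quoted fingerprint of the representation.
    static String etagFor(String body) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) hex.append(String.format("%02x", b));
            return "\"" + hex + "\"";
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // 304 Not Modified when If-None-Match carries the current version.
    static int statusFor(String body, String ifNoneMatch) {
        return etagFor(body).equals(ifNoneMatch) ? 304 : 200;
    }
}
```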

REST Design

Resource Naming Conventions #307

RESTful URLs should represent resources as nouns, use plural forms, and maintain a consistent hierarchy that clients can predict.

  • Use plural nouns: /users, /orders — not /getUser or /createOrder
  • Nest sub-resources: /users/{id}/orders for clear ownership relationships
  • Avoid verbs in paths — let HTTP methods express the action
  • Use kebab-case for multi-word segments: /order-items, not /orderItems
  • Keep URLs shallow — deeply nested paths signal a design smell

Pagination (Cursor vs Offset) #308

Large result sets need pagination. Offset-based is simple but fragile; cursor-based is stable under concurrent writes.

  • Offset pagination (?page=3&size=20): simple but skips/duplicates rows if data changes
  • Cursor pagination (?after=abc123): stable ordering, no skipped items on inserts
  • Keyset pagination uses an indexed column as the cursor for efficient DB queries
  • Include total count, next/prev links, and has_more flag in the response envelope
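
Keyset pagination, sketched over an in-memory id-ordered list (names are illustrative). In SQL the equivalent is `WHERE id > :after ORDER BY id LIMIT :size`, which an index on `id` serves without scanning the skipped rows:

```java
import java.util.List;
import java.util.stream.Collectors;

class KeysetPager {
    // Return up to `size` ids strictly after the cursor; the last id of
    // the returned page becomes the client's next cursor.
    static List<Integer> page(List<Integer> sortedIds, int afterId, int size) {
        return sortedIds.stream()
                .filter(id -> id > afterId)
                .limit(size)
                .collect(Collectors.toList());
    }
}
```

Because the cursor is a value, not a position, rows inserted before the cursor do not shift the page — the fragility that offset pagination suffers from.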

Versioning Strategies #309

APIs evolve, and versioning determines how breaking changes are introduced without disrupting existing consumers.

  • URI path versioning (/v1/users): most common, explicit, easy to route
  • Header versioning (Accept: application/vnd.api.v2+json): cleaner URLs but harder to discover
  • Query param versioning (?version=2): simple but clutters query strings
  • Additive changes (new fields) usually do not require a version bump
  • Sunset header communicates deprecation timelines to consumers

HATEOAS Concept #310

Hypermedia as the Engine of Application State means responses include links that tell clients what actions are available next, making the API self-discoverable.

  • Each response contains _links or _actions pointing to related resources and transitions
  • Clients do not hard-code URLs — they follow links from responses
  • Rarely implemented in full, but the concept is asked about in interviews
  • HAL, JSON:API, and Siren are popular hypermedia formats

OpenAPI / Swagger #311

OpenAPI Specification (formerly Swagger) provides a machine-readable contract for REST APIs, enabling documentation, code generation, and testing.

  • YAML/JSON schema describes endpoints, request/response models, auth, and errors
  • Swagger UI renders interactive docs directly from the spec
  • Code generators (openapi-generator) produce client SDKs and server stubs
  • Contract-first design: write the spec, then implement — keeps teams aligned
  • springdoc-openapi auto-generates specs from Spring Boot annotations

gRPC & Protobuf Both

Interview tip: gRPC appears in both system-design and coding contexts. Be ready to compare it with REST, explain streaming modes, and discuss when binary serialization matters.

gRPC

Protobuf Binary Encoding #312

Protocol Buffers encode structured data into a compact binary format using field numbers and wire types, drastically reducing payload size compared to JSON.

  • Schema defined in .proto files with typed fields and unique field numbers
  • Binary encoding: field tag (number + wire type) followed by value — no field names on the wire
  • Backward/forward compatible: unknown fields are preserved, missing fields get defaults
  • Varints encode small integers in fewer bytes than fixed-width formats
  • Typically several times smaller on the wire and markedly faster to parse than JSON — exact ratios depend on the schema and payload
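
The wire format is simple enough to hand-roll for one field — a sketch (the helper is mine, but the encoding matches the protobuf docs' canonical example: field 1, value 150 encodes as `08 96 01`):

```java
import java.io.ByteArrayOutputStream;

class ProtoVarint {
    // Tag byte is (field_number << 3) | wire_type; varints carry 7 payload
    // bits per byte with the high bit set on every byte except the last.
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((fieldNumber << 3) | 0); // wire type 0 = varint
        while ((value & ~0x7FL) != 0) {    // more than 7 bits remain
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }
}
```

Note that only the field *number* travels on the wire — the field name lives in the .proto schema, which is where the size savings over JSON come from.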

4 Communication Modes #313

gRPC supports four interaction patterns: unary, server streaming, client streaming, and bidirectional streaming, each suited to different real-time data needs.

  • Unary: single request, single response — like a standard REST call
  • Server streaming: client sends one request, server sends a stream of responses (e.g., live feed)
  • Client streaming: client sends a stream of messages, server replies once (e.g., file upload)
  • Bidirectional streaming: both sides send independent streams concurrently (e.g., chat)
  • All modes are multiplexed over a single HTTP/2 connection
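
All four shapes are visible directly in a service definition — a sketch with hypothetical service and message names:

```protobuf
syntax = "proto3";

service FeedService {
  rpc GetItem(ItemRequest) returns (ItemReply);              // unary
  rpc WatchFeed(FeedRequest) returns (stream FeedEvent);     // server streaming
  rpc UploadChunks(stream Chunk) returns (UploadStatus);     // client streaming
  rpc Chat(stream ChatMessage) returns (stream ChatMessage); // bidirectional
}
```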

Interceptors #314

gRPC interceptors are middleware that wrap calls on the client or server side, enabling cross-cutting concerns like logging, auth, and metrics.

  • Server interceptors process incoming calls before the handler — similar to servlet filters
  • Client interceptors modify outgoing calls — inject auth tokens, trace headers
  • Can be chained: logging → auth → rate-limiting → handler
  • Access to metadata (headers), deadlines, and call context
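
The chaining idea can be shown in plain Java (this is a conceptual sketch, not the actual grpc-java `ServerInterceptor` API): each interceptor wraps the next handler, so the chain runs outermost-in and unwinds in reverse.

```java
import java.util.List;

class InterceptorChain {
    interface Handler { String handle(String request); }
    interface Interceptor { String intercept(String request, Handler next); }

    // Wrap from the innermost outwards so list order equals invocation order:
    // interceptors.get(0) runs first on the way in, last on the way out.
    static Handler chain(List<Interceptor> interceptors, Handler terminal) {
        Handler h = terminal;
        for (int i = interceptors.size() - 1; i >= 0; i--) {
            Interceptor ic = interceptors.get(i);
            Handler next = h;
            h = req -> ic.intercept(req, next);
        }
        return h;
    }
}
```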

Deadlines & Cancellation #315

gRPC deadlines propagate across service hops, ensuring that downstream calls are cancelled when the overall time budget expires.

  • Deadline is an absolute timestamp propagated via metadata across the call chain
  • If a deadline expires, both client and server receive DEADLINE_EXCEEDED status
  • Cancellation propagates: if the client cancels, all downstream RPCs are cancelled too
  • Always set deadlines — default is infinite, which risks resource leaks in production
  • In grpc-java, stub.withDeadlineAfter(timeout, unit) sets a per-call deadline; io.grpc.Context propagates the deadline and cancellation to downstream work

gRPC vs REST Tradeoffs #316

A frequent interview question: knowing when to choose gRPC over REST and vice versa demonstrates architectural maturity.

  • gRPC: binary, fast, streaming, strongly-typed schema — ideal for internal microservice calls
  • REST: human-readable, browser-friendly, tooling-rich — ideal for public-facing APIs
  • gRPC requires HTTP/2; REST works over HTTP/1.1 and is universally supported
  • Debugging gRPC is harder — binary payloads require grpcurl or similar tools
  • Use gRPC for latency-critical inter-service communication; REST for external consumers

Service Reflection #317

gRPC server reflection exposes the service schema at runtime, enabling dynamic clients and debugging tools to discover methods without the .proto file.

  • Reflection service streams the full FileDescriptorProto to callers
  • grpcurl and Postman use reflection to list services and invoke methods dynamically
  • Should be enabled in dev/staging and disabled in production for security
  • Useful for contract testing and service catalogs in microservice architectures

WebSockets & Real-Time Light

Interview tip: Be able to compare WebSocket, SSE, and long polling side by side. Know when you would pick each and what infrastructure requirements each imposes.

Real-Time Protocols

WebSocket Upgrade Handshake #318

WebSocket connections start as an HTTP/1.1 request with an Upgrade header, switching the protocol to a persistent, full-duplex channel.

  • Client sends GET with Upgrade: websocket and Connection: Upgrade headers
  • Server responds with 101 Switching Protocols if it supports the upgrade
  • Sec-WebSocket-Key / Sec-WebSocket-Accept prevent cross-protocol attacks
  • After the handshake, both sides communicate via frames (text, binary, ping/pong, close)
  • Load balancers must be configured for sticky sessions or WebSocket-aware routing
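
The accept token is fully specified by RFC 6455: append a fixed GUID to the client's Sec-WebSocket-Key, SHA-1 the result, base64-encode the digest. A sketch (the class name is mine; the GUID and the test vector are from the RFC):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

class WebSocketAccept {
    // Fixed GUID from RFC 6455 — proves the server actually speaks WebSocket
    // rather than blindly echoing headers.
    static final String GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    static String acceptFor(String secWebSocketKey) {
        try {
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest((secWebSocketKey + GUID).getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(sha1);
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```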

SSE vs WebSocket Tradeoffs #319

Server-Sent Events (SSE) and WebSockets both enable server push, but differ in directionality, complexity, and browser support.

  • SSE: server-to-client only, simple text/event-stream over HTTP, auto-reconnect built in
  • WebSocket: full-duplex, binary + text, requires explicit reconnection logic
  • SSE works through HTTP proxies and CDNs without special config; WebSocket often needs special support
  • Use SSE for dashboards, notifications, live feeds; WebSocket for chat, gaming, collaborative editing
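
Part of SSE's simplicity is its wire format: each event is `field: value` lines ended by a blank line. A formatting sketch (class name is mine; the field names are from the text/event-stream spec):

```java
class SseFrame {
    // id feeds the browser's Last-Event-ID header on automatic reconnect;
    // a blank line terminates the event.
    static String format(String id, String event, String data) {
        StringBuilder sb = new StringBuilder();
        if (id != null) sb.append("id: ").append(id).append('\n');
        if (event != null) sb.append("event: ").append(event).append('\n');
        // Multi-line payloads become repeated data: lines.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }
}
```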

Long Polling vs SSE #320

Long polling holds an HTTP request open until data arrives, then immediately re-establishes. SSE is a more efficient, standardized alternative.

  • Long polling: client opens request, server holds it until data is ready, then responds; client immediately reconnects
  • Creates overhead from repeated TCP handshakes, headers, and connection teardown
  • SSE keeps one connection open and pushes events as they arrive — far less overhead
  • Long polling is a fallback for environments that do not support SSE or WebSocket

Spring WebSocket + STOMP #321

Spring provides WebSocket support with an optional STOMP sub-protocol for pub/sub messaging patterns with destination-based routing.

  • @EnableWebSocketMessageBroker configures STOMP over WebSocket in Spring Boot
  • STOMP frames (CONNECT, SUBSCRIBE, SEND, MESSAGE) add structured messaging on top of raw WebSocket
  • SimpleBroker for in-memory pub/sub; StompBrokerRelay for external brokers (RabbitMQ, ActiveMQ)
  • SockJS fallback handles browsers/environments without native WebSocket support
  • Destination prefixes (/topic, /queue, /app) route messages to subscribers or handlers

Heartbeat & Reconnection Patterns #322

Persistent connections silently die due to network issues. Heartbeats detect dead connections; reconnection strategies restore them gracefully.

  • Ping/pong frames at regular intervals detect broken connections before data loss
  • STOMP heartbeat negotiation: client and server agree on send/receive intervals
  • Exponential backoff with jitter prevents thundering herd on mass reconnects
  • Client-side state reconciliation after reconnect — request missed events via sequence IDs

TCP/IP Fundamentals Light

Interview tip: TCP questions often surface in system design when discussing connection pooling, latency, and why microservices see occasional slowdowns. Know the handshake, common socket states, and DNS caching.

TCP/IP

3-Way Handshake #323

TCP establishes a reliable connection via a SYN, SYN-ACK, ACK exchange before any data is transferred.

  • Client sends SYN with an initial sequence number (ISN)
  • Server responds with SYN-ACK, acknowledging client's ISN and providing its own
  • Client sends ACK, completing the handshake — data transfer begins
  • Adds one round-trip of latency before any payload — motivates connection pooling
  • SYN flood attacks exploit this by sending SYNs without completing the handshake (mitigated by SYN cookies)

TIME_WAIT & FIN_WAIT #324

TCP socket states during connection teardown. Understanding them explains port exhaustion and connection recycling issues in high-throughput services.

  • FIN_WAIT_1/2: the initiator of close waits for the peer's FIN and ACK
  • TIME_WAIT: persists for 2x MSL (typically 60s) after close to handle late packets
  • Too many TIME_WAIT sockets exhaust ephemeral ports — common in microservices calling many backends
  • Connection pooling and kernel tuning (e.g., net.ipv4.tcp_tw_reuse for outbound connections) mitigate port exhaustion; SO_REUSEADDR lets a server rebind a listening port still in TIME_WAIT
  • CLOSE_WAIT indicates the application has not called close() — often a resource leak

TCP Backlog & Accept Queue #325

The kernel maintains queues for incoming connections. Understanding these explains why services drop connections under load.

  • SYN queue (half-open): holds connections that received SYN but handshake is incomplete
  • Accept queue (fully established): holds completed connections waiting for accept()
  • Backlog parameter in listen() sets the accept queue size
  • When the accept queue overflows, the kernel drops or RSTs new connections (configurable via sysctl)
  • Tuning: increase somaxconn and application backlog for high-concurrency servers
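
On the application side, the backlog is just the second argument to the listening socket — a sketch (class name is mine). Note the kernel caps this value at net.core.somaxconn on Linux, so raising one without the other has no effect:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

class BacklogSketch {
    // Port 0 asks the OS for any free port; loopback keeps the example local.
    static boolean bindAndClose(int backlog) {
        try (ServerSocket server =
                new ServerSocket(0, backlog, InetAddress.getLoopbackAddress())) {
            return server.isBound();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```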

Nagle's Algorithm #326

Nagle's algorithm reduces small-packet overhead by buffering tiny writes until an ACK arrives or the buffer fills, trading latency for bandwidth efficiency.

  • Buffers small segments and coalesces them into larger TCP segments before sending
  • Interacts poorly with delayed ACKs, causing up to 200ms latency spikes
  • Disabled via TCP_NODELAY socket option for latency-sensitive applications (e.g., gaming, trading)
  • Most HTTP and gRPC libraries set TCP_NODELAY by default

DNS Resolution & TTL #327

DNS translates hostnames to IP addresses. TTL controls caching duration, and misconfigurations can cause stale routing or deployment failures.

  • Resolution chain: local cache → recursive resolver → root → TLD → authoritative nameserver
  • TTL (Time to Live) tells resolvers how long to cache the record before re-querying
  • Low TTL enables fast failover but increases DNS query load; high TTL reduces load but delays updates
  • The JVM caches successful lookups for ~30s by default — and indefinitely when a security manager is installed — so set the networkaddress.cache.ttl security property in dynamic environments
  • DNS round-robin provides basic load distribution but no health checking
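
The JVM's DNS cache is configured through java.security properties, not system properties — a sketch (class and method names are mine; the property names are the JDK's, and the value must be set before the first lookup is cached):

```java
import java.security.Security;

class DnsCacheConfig {
    // Cap the positive DNS cache so clients re-resolve after a failover;
    // keep a short negative TTL so transient failures are not cached long.
    static void capDnsTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
        Security.setProperty("networkaddress.cache.negative.ttl", "10");
    }
}
```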

CDN Routing #328

Content Delivery Networks cache content at edge nodes close to users, using DNS-based or Anycast routing to minimize latency.

  • DNS-based routing: CDN's authoritative DNS returns the nearest edge server IP based on resolver location
  • Anycast routing: multiple edge servers share the same IP; BGP routes to the closest one
  • Cache-Control and Vary headers determine what the CDN caches and for how long
  • Cache invalidation (purge) propagates to all edge nodes — eventual consistency
  • Origin shield reduces origin load by adding a mid-tier cache between edge and origin