We have all been there. You write a simple application—perhaps a basic microservice or a "Hello World" API—and wrap it in a container. It runs perfectly on your local machine. But when you push it to the registry, you pause in horror at the result: an 800MB image for an application that compiles down to a 10MB binary.
This bloat isn't just an aesthetic annoyance; it is technical debt. Enter Docker Multi-Stage Builds. Introduced in Docker v17.05, this feature allows developers to separate the build environment from the runtime environment within a single Dockerfile. By strictly defining what is required to create your application versus what is required to run it, you can discard the heavy scaffolding used during compilation.
The value proposition is immediate and tangible. Leaner containers mean faster CI/CD pipelines (less time pushing/pulling images), significantly reduced cloud storage costs, and, crucially, a smaller attack surface. In this article, we will dissect the syntax of multi-stage builds, explore advanced caching strategies, and demonstrate how this single optimization allows you to ship more secure software.
The Evolution: From Builder Pattern to Multi-Stage
To understand why multi-stage builds are revolutionary, we must look at the "Old Way" of doing things. Before Docker v17.05, optimization-conscious developers relied on the Builder Pattern. This cumbersome workflow required maintaining two separate Dockerfiles: one for development (containing compilers, source code, and testing tools) and one for production.
You would build the first image, create a container from it, copy the compiled artifact out of that container to your local filesystem with a shell script, and then build the second image from that artifact. The process was brittle, hard to orchestrate in CI pipelines, and it cluttered source control with duplicate Dockerfiles.
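As a rough sketch, the glue script for that workflow looked something like this (the file names Dockerfile.build and Dockerfile.prod, the image tags, and the artifact path are all hypothetical):

# 1. Build the heavy image that contains the compiler and the source code
docker build -t my-app-builder -f Dockerfile.build .
# 2. Create a stopped container from it and copy the artifact to the host
docker create --name artifact-extract my-app-builder
docker cp artifact-extract:/app/my-app ./my-app
docker rm artifact-extract
# 3. Build the slim production image, whose Dockerfile COPYs ./my-app from the host
docker build -t my-app -f Dockerfile.prod .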
Alternatively, many developers simply didn't bother. They shipped images containing GCC, Maven, Gradle, or node_modules full of dev-dependencies to production. This is bad practice. Shipping build tools to production wastes bandwidth and disk space, but more importantly, it provides potential attackers with a toolkit to escalate privileges or compile malicious code if they breach your container.
Docker solved this by allowing multiple FROM instructions in a single Dockerfile. Conceptually, think of this as a relay race. The first stage (the builder) runs the heavy lifting, grabs the baton (the compiled artifact), and passes it to the second stage (the runner). The first stage—and all its heavy layers—is then discarded, leaving only the lean final image.
Anatomy of a Multi-Stage Dockerfile
The magic of multi-stage builds lies in two specific syntax capabilities: aliasing build stages and copying artifacts between them.
1. The Syntax Breakdown
Instead of a standard FROM image, we use FROM image AS alias. This labels the stage so we can reference it later.
2. The Handoff
In the final stage, we use COPY --from=alias to selectively pull files from the previous stage's filesystem into the new one.
Practical Walkthrough: Building a Go Application
Let's look at a concrete example. We want to build a Go binary. The Go compiler is heavy, but the resulting binary is self-contained. Here is how we separate them:
# STAGE 1: The Builder
# We use a full-featured Golang image containing the compiler and tools.
FROM golang:1.21-alpine AS builder
WORKDIR /app
# Copy dependency files and download modules
COPY go.mod go.sum ./
RUN go mod download
# Copy the source code and build the application
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o my-app .
# STAGE 2: The Runtime
# We switch to 'Alpine', a tiny Linux distribution (approx. 5MB).
# Alternatively, we could use 'scratch' (empty image) for even smaller results.
FROM alpine:latest
WORKDIR /root/
# The Magic: Copy ONLY the binary from the 'builder' stage
COPY --from=builder /app/my-app .
# Command to run the executable
CMD ["./my-app"]In this example, the final image contains only the Alpine OS files and the my-app binary. The Go compiler, the source code, and the module cache are all left behind in the discarded builder stage.
Optimizing Layer Caching Strategy
While multi-stage builds reduce size, how you structure your instructions determines build speed. Docker caches layers based on the instructions used to create them. If an instruction changes (or the files it copies change), Docker invalidates that layer and every layer following it.
Ordering Matters
A common mistake is copying all source code before installing dependencies. Consider this inefficient pattern:
# BAD PRACTICE
COPY . .
RUN npm install

In this scenario, every time you change a single line of code in your src folder, Docker sees the COPY . . layer as changed. It consequently invalidates the cache for RUN npm install, forcing a full re-installation of your dependencies. This adds minutes to your CI/CD pipeline unnecessarily.
The Optimized Approach
Always copy your dependency manifests first:
# BEST PRACTICE
COPY package.json package-lock.json ./
RUN npm ci
# Only copy source code AFTER dependencies are installed
COPY . .
RUN npm run build

With this structure, changing your application code allows Docker to retrieve the npm ci layer from the cache and move straight to the build step. In a multi-stage context, this is powerful because the "Builder" stage often retains its cache on the build server, making subsequent builds lightning fast even if the final output is a fresh image.
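Here is a sketch of how that ordering fits into a full multi-stage Node.js build; the nginx runtime image and the /app/dist output path are assumptions about the project, not requirements:

# STAGE 1: The Builder
FROM node:20-alpine AS builder
WORKDIR /app
# Dependency manifests first: this layer stays cached until they change
COPY package.json package-lock.json ./
RUN npm ci
# Source code last: editing it only invalidates the layers from here down
COPY . .
RUN npm run build

# STAGE 2: The Runtime
# Serve the built static assets from a minimal web server image
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html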
Security Implications: Minimizing the Attack Surface
Reducing image size is an operational benefit, but the security gains are arguably more critical.
Less is More
Every binary, library, and shell available in your production container constitutes your "attack surface." If an attacker exploits a vulnerability in your application (e.g., via Remote Code Execution), they often look for system tools to expand their foothold. If your container includes gcc, make, wget, or even bash, you have given the attacker weapons to work with.
By using multi-stage builds to copy only the application binary to a lean base image, you remove these tools entirely.
Distroless Images
For the ultimate runtime stage, consider using Google's Distroless images (gcr.io/distroless/static). These images contain only your application and its runtime dependencies. They do not contain package managers, shells, or any other programs you would expect to find in a standard Linux distribution. If an attacker manages to break in, they cannot simply run curl to download a crypto-miner because curl effectively doesn't exist.
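Applied to the Go example above, the change is confined to the runtime stage; this sketch assumes the same builder stage and a statically linked binary (which our CGO_ENABLED=0 build produces):

# STAGE 2: The Runtime (distroless variant)
FROM gcr.io/distroless/static
# Copy only the binary; there is no shell or package manager in this image
COPY --from=builder /app/my-app /my-app
# Use the exec form of CMD, since there is no shell to interpret a command string
CMD ["/my-app"]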
Faster, Cleaner Scanning
Modern DevSecOps relies on container scanning tools like Trivy, Snyk, or Clair. Large, monolithic images are noisy; they often trigger alerts for vulnerabilities in build-time tools that aren't actually running in production. By stripping these out via multi-stage builds, your security reports become cleaner, faster to generate, and focused entirely on the relevant runtime risks, helping you meet compliance standards with less friction.
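For example, a scan of the slimmed-down image with Trivy is a single command (my-app:slim is the example tag from earlier):

# Scan the final runtime image; packages that only existed in the
# discarded builder stage never appear in the report
trivy image --severity HIGH,CRITICAL my-app:slim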
Conclusion: Shipping Better Containers
Mastering Dockerfile best practices is not just about writing code; it is about respecting the deployment lifecycle. Multi-stage builds provide a trifecta of benefits:
- Size: Reducing gigabytes to megabytes, saving storage and bandwidth.
- Speed: Leveraging intelligent layer caching to accelerate CI/CD feedback loops.
- Security: Removing the tools that attackers rely on, hardening your production environment.
I recommend auditing your current Dockerfiles today. Look for RUN apt-get install git gcc lines that persist into the final image. Refactor them into a separate build stage. Finally, run docker images to compare the before and after sizes. The results will speak for themselves.
Building secure, privacy-first tools means staying ahead of security threats. At ToolShelf, all operations happen locally in your browser—your data never leaves your device, providing security through isolation.
Stay secure & happy coding,
— ToolShelf Team