Ensuring that container images are as lightweight as possible is a task often postponed until it becomes a significant issue. When this issue arises, it typically involves leaking organizational secrets, reaching registry size limits, or experiencing severely slow startup times for services. I'm here to tell you that we can do better, and it doesn't take a wizard to make this happen.
The Lighter, the Better
- Faster startup times when containers restart or are recycled
- Faster deployments, since there is less to upload to a registry
- Enhanced security, since application images are often deployed with files that are only needed at build time
Lead by Example
Here is a Dockerfile for a Node.js application that illustrates several practices I would consider less than ideal:
FROM node:18-slim
WORKDIR /app
ARG NPM_TOKEN_ARG
ENV NPM_TOKEN ${NPM_TOKEN_ARG}
COPY .npmrc package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
CMD ["node", "dist/index.js"]
1. Docker Build Output Shows Secret
Here is an example output when using build-args:
$ docker build -f Dockerfile.not-ideal -t node-app:not-ideal . --build-arg NPM_TOKEN_ARG="my-secret"
[+] Building 5.7s (10/11) docker:desktop-linux
=> [internal] load build definition from Dockerfile.not-ideal 0.0s
The secret token ends up in the output of the CI run, where anybody with access to the logs can see it.
2. Your Environment Secret ain't that Secret
Because the Dockerfile above copies the build argument into an environment variable, the token is also present in the live environment while the app is running. Anyone with shell access to the container can inspect it by running:
root@8a92d733a82e:/app# printenv | grep -i token
NPM_TOKEN=abc123def456ghi789jkl
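One way to catch this before shipping is to scan an image's configured environment for names that look like credentials. The sketch below is an illustration under assumptions: the function name flag_secret_like_vars and the keyword list are hypothetical choices, not part of the original setup, and the sample input lines are made up. It reads NAME=value lines from standard input and prints only the matching variable names, so the values never reach the log.

```shell
#!/bin/sh
# Hypothetical helper: reads NAME=value lines on stdin and prints the names
# that look like credentials. The keyword list is an assumption; adjust it.
flag_secret_like_vars() {
  # Keep only the part before the first "=", then match common keywords.
  # "|| true" keeps the exit status at 0 when nothing matches.
  cut -d= -f1 | grep -iE 'token|secret|password|passwd|api_?key' || true
}

# Demo with sample input; only the name NPM_TOKEN is printed.
printf '%s\n' \
  'PATH=/usr/local/bin' \
  'NODE_VERSION=18.19.0' \
  'NPM_TOKEN=abc123def456ghi789jkl' | flag_secret_like_vars
```

Against a real image, the input could come from `docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' node-app:not-ideal`, piped into the same function.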
3. Shipping node_modules as Part of the Final Production Layer
If you've built any JavaScript app, you probably know that node_modules is quite large and full of dependencies that are unnecessary at runtime. In other languages, the equivalent is any build-time dependency that ships inside your container image. A good strategy is to use a bundler or compiler that folds the dependencies you actually use into a single file that gets shipped.
4. Source Code Redundancy
Because nothing is cleaned up in the Dockerfile above, the image ships both the source code and its final distribution folder, along with files such as .git and .npmrc:
root@8a92d733a82e:/app# ls -a
. .git .npmrc dist node_modules package.json tsconfig.json
.. .gitignore Dockerfile docker-build.sh package-lock.json src
Adopting Best Practices
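FROM node:18-slim AS base
WORKDIR /app
FROM base AS builder
COPY package.json package-lock.json ./
RUN npm ci
# Keep the layout the build saw before: tsconfig.json at the root, sources under src/
COPY tsconfig.json ./
COPY src/ ./src/
# outputs a dist/
RUN npm run build
FROM base AS production
# Copy only the build output from the builder stage
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/index.js"]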
1. Striking a Balance: Lean Yet Robust
The final image here is kept as small as possible, especially if your application is compiled into a single executable. For example, in the JavaScript space, tools like @vercel/ncc and tsup do a great job of bundling the source code and including only the dependencies it actually uses, so you don't have to ship node_modules at runtime. Both also work with TypeScript out of the box.
2. Safeguarding Secrets
We handle secrets by mounting them into the build only for the step that needs them, so they are never written to an image layer and are not exposed when the container runs in production.
RUN --mount=type=secret,id=npmrc,target=/app/.npmrc npm ci
How to Pass Secrets to Build
You can use a small utility such as envsubst (part of GNU gettext) to render your environment variables into a file in the /tmp directory, then pass that file to the docker build command as a mounted secret.
# docker-build.sh
# envsubst replaces ${VAR} references in the template with exported environment variables
envsubst < ".npmrc" > "/tmp/.npmrc" &&
docker build --secret id=npmrc,src=/tmp/.npmrc -t node-app .
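Since the rendered file holds the real token, it is worth creating it privately and removing it once the build finishes. The following is a minimal sketch of that idea, not the script from the original setup: render_npmrc is a hypothetical helper, the registry line is an assumed template, and it writes the file with printf instead of envsubst so it needs nothing beyond a shell and mktemp.

```shell
#!/bin/sh
# Hypothetical helper: writes an .npmrc containing the token to a private
# temporary file and prints that file's path.
# Assumption: the token authenticates against registry.npmjs.org.
render_npmrc() {
  token=$1
  if [ -z "$token" ]; then
    echo "render_npmrc: token is empty" >&2
    return 1
  fi
  # mktemp creates the file with owner-only permissions.
  file=$(mktemp "${TMPDIR:-/tmp}/npmrc.XXXXXX") || return 1
  printf '//registry.npmjs.org/:_authToken=%s\n' "$token" > "$file"
  printf '%s\n' "$file"
}

# Possible use in a build script (assumes NPM_TOKEN is exported by the CI):
#   NPMRC_FILE=$(render_npmrc "$NPM_TOKEN") || exit 1
#   trap 'rm -f "$NPMRC_FILE"' EXIT
#   docker build --secret id=npmrc,src="$NPMRC_FILE" -t node-app .
```

The trap in the commented usage removes the rendered file when the script exits, whether or not the build succeeds.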
Ignore the Noise
To further optimize your project, utilize a .dockerignore file to ignore files unnecessary for building your Docker image.
A suggested minimum .dockerignore:
# .dockerignore
.git
node_modules/
dist/
*.md
This ensures that when Docker assembles the build context from the host machine, the files listed in .dockerignore are not included.
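To keep the ignore file from drifting over time, a CI step can verify that the suggested entries are still present. This is a small sketch under assumptions: missing_dockerignore_entries is a hypothetical helper, and its list of required entries simply mirrors the suggested minimum shown above.

```shell
#!/bin/sh
# Hypothetical helper: prints every required entry that is absent from the
# given .dockerignore file. Prints nothing when all entries are present.
missing_dockerignore_entries() {
  ignore_file=$1
  for entry in .git node_modules/ dist/ '*.md'; do
    # -x matches the whole line, -F treats the entry as a literal string.
    grep -qxF -e "$entry" "$ignore_file" 2>/dev/null || printf '%s\n' "$entry"
  done
}
```

Calling `missing_dockerignore_entries .dockerignore` from the project root and failing the job on any output is one way to wire it in.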
Smaller Image Size
Let's check out the size difference by running:
$ docker images --filter "reference=node-app"
REPOSITORY   TAG         IMAGE ID       CREATED          SIZE
node-app     better      e753e983bd4e   5 seconds ago    265MB
node-app     not-ideal   2ab02c2d1484   15 minutes ago   434MB
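To put a number on that difference, the two sizes from the listing can be compared directly. The snippet below is only arithmetic on the values shown above; the variable names are illustrative.

```shell
#!/bin/sh
# Sizes in MB, taken from the `docker images` listing above.
not_ideal_mb=434
better_mb=265

saved_mb=$((not_ideal_mb - better_mb))
# Integer percentage, rounded down (shell arithmetic has no fractions).
saved_pct=$((saved_mb * 100 / not_ideal_mb))

echo "Saved ${saved_mb}MB (${saved_pct}% smaller)"
```

That works out to 169MB saved, or roughly 38 percent of the original image.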
Conclusion
Docker multi-stage builds are a powerful way to split the build process into distinct stages, like base, builder, and production. Implementing these practices allows you to trim down your image size by megabytes, if not gigabytes, and to enhance security, which is a significant advantage for any project. Check out the GitHub repository I put together for reference.