Multiple stage building

In this topic, you will learn about the drawbacks of single-stage building and how multi-stage building addresses them. You will also look at the benefits of multi-stage building through practical examples and examine ways to optimize multi-stage builds.

Drawbacks of single-stage build

Single-stage Docker building is the process of building a Docker image from a Dockerfile that has a single stage, that is, a single FROM instruction. Such a process has several drawbacks that you should consider:

  • Image size: Single-stage builds lead to larger images because all build tools and build-time dependencies stay in the final image. These unnecessary components increase storage requirements, slow down uploads and downloads, and raise resource consumption during deployment.
  • Build time: Single-stage builds can take longer, especially for larger projects, because every step runs in one sequential chain.
  • Lack of isolation: Single-stage builds lack isolation between build-time and runtime dependencies. In other words, all the dependencies required during the build process are included in the final Docker image.
  • Limited flexibility: A single stage is based on one base image, so there is little room to choose different tools or versions for different parts of the build.
  • Debugging and troubleshooting: Issues that arise during the build are harder to debug, because the build steps and the runtime setup cannot be inspected separately.
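
To make these drawbacks concrete, here is a minimal sketch of a single-stage Dockerfile for a Node.js application. It assumes the project defines build and start scripts in package.json; both script names are assumptions for illustration.

```dockerfile
# Single stage: the same image both builds and runs the application
FROM node:14
WORKDIR /app
COPY package.json .
# Installs build-time and runtime dependencies alike
RUN npm install
COPY . .
RUN npm run build
# Assumes a "start" script exists in package.json
CMD ["npm", "start"]
```

Everything that npm install downloads, along with the source code and the full Node.js toolchain, stays in the image that gets deployed.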

To overcome these drawbacks, consider using multi-stage Docker builds. Multi-stage builds enable the separation of build-time dependencies from runtime dependencies. The separation helps create smaller and more optimized images. Additionally, multi-stage builds provide the flexibility to customize individual stages, enhance reusability, and reduce build times.

Multi-stage building

Multi-stage building is a technique in Docker that allows you to optimize the build process and create more efficient Docker images. By breaking down the build into multiple stages, you can reduce the final image size and improve performance.

Take a look at an example Dockerfile for a simple web application written in Node.js:

# Stage 1: Building the application
FROM node:14 AS builder
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
RUN npm run build

# Stage 2: Creating the final image
FROM nginx:1.21
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

In this example, the Dockerfile consists of two stages. Let's break down each stage and explain how they work independently:

Stage 1: Building the application

  • Start with the node:14 base image and assign it the name builder using the AS keyword (instruction keywords are case-insensitive, but uppercase is the convention).
  • Set the working directory inside the container to /app.
  • Copy the package.json file to the working directory.
  • Run npm install to install the dependencies specified in package.json.
  • Copy the entire application code to the working directory.
  • Run npm run build to build the application.

At the end of Stage 1, the built application files are located in the /app/dist directory.

Stage 2: Creating the final image

  • Start with the nginx:1.21 base image.
  • Copy the contents of the /app/dist directory from the previous builder stage into the /usr/share/nginx/html directory of the final image.
  • Use the EXPOSE instruction to document that the container listens on port 80. EXPOSE does not publish the port by itself; to reach it from the host, you still publish it when starting the container, for example with the -p option of docker run.
  • Finally, set the command that starts the NGINX server using the CMD instruction.

The COPY --from=builder syntax allows the second stage to access the files generated in the builder stage, such as the built application files. This separation ensures that only the necessary files for running the application are included in the final image and helps build a smaller and more secure production image. It eliminates the inclusion of intermediate build dependencies and tools, which are not needed at runtime.
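
Naming a stage is optional. As a small sketch of the same build without a stage name, COPY --from can also refer to a stage by its index, counted from 0 in the order the FROM instructions appear:

```dockerfile
# Stage 0: unnamed build stage
FROM node:14
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
RUN npm run build

# Stage 1: final image, copying from stage 0 by index
FROM nginx:1.21
COPY --from=0 /app/dist /usr/share/nginx/html
```

Named stages are usually easier to maintain, because an index changes whenever a stage is added or reordered.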

Optimize multi-stage builds

To optimize the image size, choose a lightweight, minimal base image such as one based on Alpine Linux, remove unnecessary files and dependencies, and compress assets. This can significantly reduce the image size while keeping the required runtime components. Additionally, if you leverage the layer cache during the build process, you can further improve build times and efficiency.
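
As a sketch of these ideas applied to the earlier example, the variant below switches both stages to Alpine-based tags and copies the dependency manifests before the rest of the source, so the dependency layer is rebuilt only when those manifests change. It assumes the project has a package-lock.json, which npm ci requires.

```dockerfile
# Stage 1: build on an Alpine-based Node.js image
FROM node:14-alpine AS builder
WORKDIR /app
# Copy only the manifests first so this layer stays cached
# until package.json or package-lock.json changes
COPY package*.json ./
# Assumes package-lock.json exists; npm ci installs exactly what it lists
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: serve the static files from an Alpine-based NGINX image
FROM nginx:1.21-alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```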

Security considerations are also crucial when working with container images. Regularly updating base images and dependencies to include the latest security patches reduces the risk that the runtime image is vulnerable to known security issues. Following best practices for container security, such as running containers with minimal privileges, using non-root users, and disabling unnecessary services or exposed ports, helps limit the attack surface and enhance overall security.
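
The non-root recommendation can be sketched as follows. This example assumes a different kind of application than the static site above: a Node.js server whose build produces a self-contained dist/server.js (a hypothetical file name) and whose project has a package-lock.json. It relies on the node user that the official Node.js images provide.

```dockerfile
FROM node:14-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:14-alpine
WORKDIR /app
# Hand ownership of the copied files to the unprivileged user
COPY --from=builder --chown=node:node /app/dist ./dist
# Drop root privileges for everything that runs from here on
USER node
# dist/server.js is a hypothetical entry point
CMD ["node", "dist/server.js"]
```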

Distroless images offer another approach to optimizing both image size and security. A distroless image contains only the application and the minimal runtime dependencies required to run it. It excludes package managers, shells, and the other programs you would normally find in a standard Linux distribution. This minimalist approach enhances security by reducing the potential attack surface and limiting the presence of unused components. However, when using a distroless image, you must package all the required dependencies yourself and make sure they are included in the image. Combined with multi-stage builds, this approach helps produce smaller and more secure container images.
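
A rough sketch of combining a distroless runtime with a multi-stage build is shown below. The image name gcr.io/distroless/nodejs20-debian12, the Node.js version, and the dist/server.js entry point are assumptions for illustration; check the distroless project's list of current images before relying on a specific tag.

```dockerfile
# Build with a full Node.js image that has npm and a shell
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Run on a distroless image: no shell, no package manager
# (image name and tag are assumptions, verify before use)
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
# The distroless Node.js image already uses node as its entrypoint,
# so CMD only names the script (hypothetical path)
CMD ["dist/server.js"]
```

Because the final image has no shell, RUN instructions and shell-form commands are not available in that stage; all preparation has to happen in the builder stage.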

Benefits of multi-stage building

Let's recap the problems that the single-stage building process has and consider how multi-stage building overcomes them.

Smaller image size: In the single-stage Docker building process, the final Docker image often includes unnecessary build-time dependencies and artifacts. This results in larger images, which lead to longer deployment times and increased storage requirements. Multi-stage building solves this problem by allowing developers to separate the build environment from the final runtime environment. By discarding unnecessary build artifacts and intermediate layers, multi-stage building produces smaller, more lightweight Docker images, which in turn optimize resource usage and improve overall performance.

Faster build times: In a single-stage build, all steps form one sequential chain, so a change in the code or dependencies invalidates the cache for every step that follows, and those steps run again. Re-running them after small changes is time-consuming, especially for large or complex projects. Multi-stage builds help by splitting the work into stages that are cached separately: a stage whose inputs have not changed is reused, and with BuildKit, stages that do not depend on each other can be built in parallel. The smaller final image also speeds up deployment and reduces storage requirements, making the distribution and management of Docker images more efficient.

Improved security: In a single-stage build process, the entire build environment, including development tools and dependencies, is typically present in the final Docker image. This can potentially expose sensitive information and create security vulnerabilities. With multi-stage building, developers can use a separate build stage where they install and use the necessary tools and dependencies. In the final stage, only the runtime artifacts and dependencies required for the application are included. This minimizes the attack surface and enhances the security of the Docker image.

Code isolation and reusability: Multi-stage builds enable code isolation within the Docker image build process. You can divide the build stages based on different components or modules of your application and isolate their dependencies and build processes. This isolation promotes code reusability, as common stages or templates can be shared across different projects or components, reducing duplication of effort and ensuring consistency.
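
As an illustration of dividing stages by component, the sketch below builds a frontend and a backend in separate stages and combines their outputs in the final image. The directory layout (./frontend and ./backend, each with its own package.json), the stage names, and the server.js entry point are hypothetical.

```dockerfile
# Frontend stage: produces static files in /src/dist
FROM node:14-alpine AS frontend
WORKDIR /src
COPY frontend/package*.json ./
RUN npm install
COPY frontend/ .
RUN npm run build

# Backend stage: installs only production dependencies
FROM node:14-alpine AS backend
WORKDIR /src
COPY backend/package*.json ./
RUN npm install --production
COPY backend/ .

# Final stage: combines the outputs of both component stages
FROM node:14-alpine
WORKDIR /app
COPY --from=backend /src ./
COPY --from=frontend /src/dist ./public
# server.js is a hypothetical backend entry point
CMD ["node", "server.js"]
```

The two component stages do not depend on each other, so a change in the frontend leaves the cached backend stage untouched, and the other way around.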

Easier debugging and troubleshooting: With multi-stage builds, you can isolate and inspect each stage independently. It becomes easier to troubleshoot issues during the build process as you examine the intermediate layers and outputs of each stage individually. By isolating the stages, you can identify and fix problems more effectively, leading to faster debugging and troubleshooting cycles. This enhanced visibility and granularity help developers understand the build process better and make it easier to identify and resolve any issues that may arise.
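
One way to inspect a single stage is the --target option of docker build, which stops the build at the named stage. The sketch below reuses the earlier example and shows the commands as comments; the image tag myapp:builder is a hypothetical name.

```dockerfile
# Build only this stage and tag the result:
#   docker build --target builder -t myapp:builder .
# Then open a shell in it to look at the build output:
#   docker run --rm -it myapp:builder sh
FROM node:14 AS builder
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
RUN npm run build

# Built only when no --target is given (or when this stage is the target)
FROM nginx:1.21
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```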

Conclusion

Multi-stage Docker builds overcome the drawbacks of single-stage builds by reducing image size, improving build times, enhancing security, and making debugging easier. They offer a more optimized and efficient way to create Docker images.