
Best practices for writing Dockerfiles


Docker provides a rich platform for running and deploying applications. To avoid common problems along the way, you need to follow certain practices when writing Dockerfiles. Here, you'll discover some of the most important ones. You'll explore how to optimize a frequently used instruction, RUN. Additionally, you'll learn how to reduce the final image size and use tools that help you improve your Dockerfile.

Minimize RUN instruction layers

Several instructions add a layer to the image each time the Docker engine executes them. Each layer can add to the size of the final image and can affect build performance. One of those instructions is RUN. To avoid this unwanted overhead, you can chain several commands in a single RUN instruction. Let's check this in practice using the following Dockerfile:

FROM ubuntu:22.04

LABEL author=HyperUser

RUN apt-get update -y \
  && apt-get upgrade -y \
  && apt-get install iputils-ping -y \
  && apt-get install net-tools -y

ENTRYPOINT ["/bin/bash"]

Try to avoid the apt-get upgrade command in a Dockerfile. It can pull in package upgrades that change the environment you rely on and make builds less reproducible.

Let's build an ubuntu:v1 image from this Dockerfile. Then let's modify the Dockerfile so that each command has its own RUN instruction and build a second image, ubuntu:v2.

FROM ubuntu:22.04

LABEL author=HyperUser

RUN apt-get update -y 
RUN apt-get upgrade -y
RUN apt-get install iputils-ping -y 
RUN apt-get install net-tools -y

ENTRYPOINT ["/bin/bash"]

Now, let's compare the sizes of the two images using the docker images command.

$ docker images --format='{{.Repository}}\t{{.Tag}}\t{{.Size}}' | head -2
ubuntu       v2        108MB
ubuntu       v1        107MB

In the output above, you can clearly see that the second image, ubuntu:v2, with separate RUN instructions is bigger.

Now, let's check the number of intermediate images created for each image using docker images -a. Docker generated four intermediate images for ubuntu:v2 against two for ubuntu:v1.

$ docker images -a
REPOSITORY   TAG       IMAGE ID       CREATED              SIZE
ubuntu       v2        fcadad99c63f   54 seconds ago       108MB
<none>       <none>    5c36555d7e52   55 seconds ago       108MB
<none>       <none>    4f5e8cb0f6b3   About a minute ago   107MB
<none>       <none>    88f340f849f6   About a minute ago   105MB
<none>       <none>    9ec7df14ae44   About a minute ago   105MB
ubuntu       v1        da09083f331a   25 minutes ago       107MB
<none>       <none>    8412855ec387   25 minutes ago       107MB
<none>       <none>    c88e31913943   26 minutes ago       70.2MB
ubuntu       22.04     d6547859cd2f   6 weeks ago          70.2MB

So, the image with separate RUN instructions isn't just heavier. Docker also created more intermediate images while building it, consuming more resources.

Docker creates intermediate images to speed up builds. Each one serves as the base for the next step and is kept in the build cache. If you modify a step, Docker reuses the intermediate image from the previous step instead of building the image from scratch.
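Before building anything, you can get a rough idea of how many layers RUN will contribute by counting the RUN instructions in a Dockerfile. The following is a minimal sketch that needs no Docker installation. The file names Dockerfile.v1 and Dockerfile.v2, the scratch directory, and the count_run helper are assumptions made for illustration, and the Dockerfiles are shortened versions of the two variants above.

```shell
#!/bin/sh
# Sketch: count the RUN instructions in two Dockerfile variants.
workdir=$(mktemp -d)   # hypothetical scratch directory

# Variant 1: all commands chained in a single RUN instruction.
cat > "$workdir/Dockerfile.v1" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update -y \
  && apt-get install iputils-ping -y \
  && apt-get install net-tools -y
EOF

# Variant 2: one RUN instruction per command.
cat > "$workdir/Dockerfile.v2" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update -y
RUN apt-get install iputils-ping -y
RUN apt-get install net-tools -y
EOF

# count_run FILE: print the number of lines that start a RUN instruction.
# Continuation lines (after a trailing backslash) are not counted.
count_run() {
  grep -ciE '^[[:space:]]*RUN[[:space:]]' "$1"
}

v1_runs=$(count_run "$workdir/Dockerfile.v1")
v2_runs=$(count_run "$workdir/Dockerfile.v2")
echo "Dockerfile.v1: $v1_runs RUN instruction(s)"
echo "Dockerfile.v2: $v2_runs RUN instruction(s)"
```

For an image that is already built, docker history <image name> lists its layers together with the instruction that created each one.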

Choosing the right base image

The basis for any image is its base image. So far, you have seen examples where Ubuntu is used for this purpose. Ubuntu is known for its user-friendly approach, but its image includes many packages that you may not need. If you don't need this particular image, you can switch to the image of another Linux distribution: Alpine Linux. It's built around musl libc and BusyBox and requires less space because it includes a minimal set of packages and tools. Let's create images with minimal configurations using both distributions and compare their sizes. First, let's use Ubuntu:

FROM ubuntu:22.04

LABEL author=HyperUser

ENTRYPOINT ["echo", "Hello, Students."]

After this, let's change the base image to Alpine:

FROM alpine:3.17

LABEL author=HyperUser

ENTRYPOINT ["echo", "Hello, Students."]

Now, let's check what we've got. This is what you'll see when you list your images:

$ docker images --format='{{.Repository}}\t{{.Tag}}\t{{.Size}}' | head -2
alpine  v1      7.05MB
ubuntu  v1      70.2MB

Besides Alpine, there are also so-called distroless and slim images. Distroless images contain only an application and its runtime dependencies; they don't provide shells or package managers. Slim images, on the other hand, are OS-based but stripped down to be as lightweight as possible. You will often find slim or distroless variants among the images for programming language runtimes.

Removing unnecessary packages

One of the challenges of using Docker is keeping the image size as small as possible. A good practice is to remove the apt lists after installing packages. The cleanup must happen in the same RUN instruction as the installation, because files deleted in a later layer still occupy space in the earlier one. Let's work with the Dockerfile used to install network tools. The ubuntu:v1 image built from it was 107MB. This time, let's apply the same Dockerfile but remove the apt lists at the end.

FROM ubuntu:22.04

LABEL author=HyperUser

RUN apt-get update -y \
  && apt-get upgrade -y \
  && apt-get install iputils-ping -y \
  && apt-get install net-tools -y \
  && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["/bin/bash"]

This time, the image size is much smaller:

$ docker images 
REPOSITORY   TAG       IMAGE ID       CREATED              SIZE
ubuntu       v1        1141a699133b   About a minute ago   72.2MB

Another important practice is using the --no-install-recommends flag when installing packages. Many Linux distributions, such as Debian or Ubuntu, distinguish between required and recommended packages. When you install a package, the package manager installs not only the requested package and its required dependencies but also recommended packages that aren't mandatory but can be useful. Adding the --no-install-recommends flag tells apt-get to install only the required packages and skip the recommended ones.

FROM ubuntu:22.04

LABEL author=HyperUser

RUN apt-get update -y \
  && apt-get upgrade -y \
  && apt-get install iputils-ping -y --no-install-recommends \
  && apt-get install net-tools -y --no-install-recommends \
  && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["/bin/bash"]

With this flag, apt-get neither downloads nor installs the recommended packages. This shortens the build and reduces the image size.
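The practices from this section can be combined with the earlier advice to avoid apt-get upgrade. The sketch below writes such a Dockerfile and runs two simple text checks on it. It is one possible arrangement, not the only valid one: the scratch directory, the variable names, and the ubuntu:v3 tag are assumptions made for illustration, and the script only prints the build command instead of running Docker.

```shell
#!/bin/sh
# Sketch: write a Dockerfile that combines the practices above and sanity-check it.
workdir=$(mktemp -d)                 # hypothetical scratch directory
dockerfile="$workdir/Dockerfile"

cat > "$dockerfile" <<'EOF'
FROM ubuntu:22.04

LABEL author=HyperUser

# Refresh the lists, install only the required packages, and clean up,
# all in one RUN instruction so the lists never end up in a layer.
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
       iputils-ping \
       net-tools \
  && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["/bin/bash"]
EOF

# Number of RUN instructions in the file.
run_count=$(grep -cE '^RUN[[:space:]]' "$dockerfile")

# Detect a cleanup that sits in its own RUN instruction (and thus its own layer).
if grep -qE '^RUN[[:space:]]+rm -rf /var/lib/apt/lists' "$dockerfile"; then
  separate_cleanup=yes
else
  separate_cleanup=no
fi

echo "RUN instructions: $run_count"
echo "cleanup in a separate layer: $separate_cleanup"
echo "build with: docker build -t ubuntu:v3 $workdir"   # ubuntu:v3 is a hypothetical tag
```

Listing both packages in a single apt-get install call also keeps the instruction shorter than repeating the command for each package.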

Using the exec form

As you might already know, Docker supports two forms of the CMD and ENTRYPOINT instructions: the shell form and the exec form. As a reminder, they look like this:

# shell form
CMD echo Hello, Students.
ENTRYPOINT echo Hello, Students.

# exec form
CMD ["echo", "Hello, World."]
ENTRYPOINT ["echo", "Hello, World."]

# exec form, combination of CMD and ENTRYPOINT
ENTRYPOINT ["echo"]
CMD ["Hello"]

Both forms have their advantages and disadvantages. For instance, you can choose the shell form if you need shell processing, such as environment variable substitution. If you have an application that must accept different parameters each time you run the container, ENTRYPOINT in the exec form can be a good choice.

However, the exec form is the recommended one. It doesn't start a shell process: Docker invokes the command directly with its arguments, which avoids the overhead of an extra process. As a result, the command runs as the container's main process (PID 1) and receives signals such as the SIGTERM sent by docker stop, whereas in the shell form the shell is PID 1 and may not forward them. The exec form is also preferred from a security perspective because only the specified command is executed.
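The difference in shell processing can be mimicked in a plain shell, without Docker. In the shell form, Docker wraps the command in /bin/sh -c, so variables are expanded. In the exec form, the arguments are passed to the program as they are. The sketch below imitates both cases; the GREETING variable and the two output variables are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: imitate how the shell form and the exec form treat a variable.
GREETING="Hello, Students."
export GREETING

# Shell form: CMD echo $GREETING
# Docker runs it roughly as: /bin/sh -c 'echo $GREETING', so the shell expands it.
shell_form_output=$(/bin/sh -c 'echo $GREETING')

# Exec form: CMD ["echo", "$GREETING"]
# The program gets the literal argument; no shell is involved.
# "env" is used here to start echo directly with that literal argument.
exec_form_output=$(env echo '$GREETING')

printf 'shell form: %s\n' "$shell_form_output"
printf 'exec form:  %s\n' "$exec_form_output"
```

If you need variable expansion together with the exec form, one common option is to call the shell explicitly, for example ENTRYPOINT ["/bin/sh", "-c", "echo $GREETING"].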

Using the linter

Writing a Dockerfile that adheres to best practices is a valuable skill. The image's size, build time, and security level depend on how carefully you write your Dockerfile. Fortunately, there is a static analyzer called Hadolint that helps you find issues in your Dockerfile and improve it by showing hints. You can either download it from the official GitHub repo or use its online version. To check how it works, let's take the first Dockerfile sample from this topic and try to find its issues.

FROM ubuntu:22.04

LABEL author=HyperUser

RUN apt-get update -y \
  && apt-get upgrade -y \
  && apt-get install iputils-ping -y \
  && apt-get install net-tools -y

ENTRYPOINT ["/bin/bash"]

Here is the result the Hadolint tool shows:

Hadolint output showing some recommendations

In this result, each message line contains a link with explanations. You can open them and learn more.
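If you have the hadolint binary installed locally, running it comes down to passing it the path of a Dockerfile. The sketch below wraps that call in a small helper that skips linting when the binary is missing. The lint_dockerfile function, the scratch directory, and the shortened sample Dockerfile are assumptions made for illustration, and no particular Hadolint output is implied.

```shell
#!/bin/sh
# Sketch: lint a Dockerfile with Hadolint when the binary is installed.

# lint_dockerfile FILE: run hadolint on FILE.
# Returns 2 if FILE is missing and 0 if hadolint is not installed;
# otherwise it returns hadolint's own exit status.
lint_dockerfile() {
  file=$1
  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 2
  fi
  if ! command -v hadolint >/dev/null 2>&1; then
    echo "hadolint not found; skipping lint of $file"
    return 0
  fi
  # hadolint typically exits with a non-zero status when it reports findings.
  hadolint "$file"
}

workdir=$(mktemp -d)   # hypothetical scratch directory
cat > "$workdir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update -y \
  && apt-get install iputils-ping -y
ENTRYPOINT ["/bin/bash"]
EOF

lint_dockerfile "$workdir/Dockerfile" || echo "hadolint reported findings"
```

Hadolint is also published as a container image, so docker run --rm -i hadolint/hadolint < Dockerfile is an alternative when you prefer not to install the binary.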

Auditing the image for security vulnerabilities

Auditing Docker images for security vulnerabilities is another essential part of working with images. For this purpose, you can apply many techniques, like checking the base image layers or analyzing the image for unnecessary packages or dependencies. On top of this, Docker provides the docker scout command, which analyzes images and detects known vulnerabilities automatically. Starting with Docker Desktop 4.17, docker scout is installed together with Docker Desktop. To install it manually, you can run the following commands:

curl -fsSL https://raw.githubusercontent.com/docker/scout-cli/main/install.sh -o install-scout.sh
sh install-scout.sh

Finally, log in to Docker Hub through Docker Desktop or from the CLI with the docker login -u <username> command. Once you are set up, run docker scout quickview or docker scout cves to scan local or remote images for vulnerabilities. To scan a local image, run docker scout quickview <image name>.

docker scout quickview provides only an overview of vulnerabilities whereas docker scout cves provides a full list of vulnerabilities.

Let's run the docker scout cves debian command to scan the Debian image. It gives a detailed report of all vulnerabilities in the image:

Output of running docker scout cves debian

As you can see, the command provides an overview of vulnerabilities and a breakdown of all the affected packages in the Debian image. You can find more information about this tool on the official Docker documentation website.

Conclusion

In this topic, you learned some best practices for writing a Dockerfile. We discussed reducing the number of layers in the final image, reducing its size by removing unnecessary packages, and the different types of base images. We also looked at two tools you can use to analyze images and Dockerfiles and find issues. Now, try to apply this knowledge in practice and improve the quality of your Docker images.
