Docker provides a rich platform for running and deploying many applications. To avoid common problems along the way, you need to follow a few practices when writing Dockerfiles. In this topic, you'll discover the most important ones. You'll explore how to optimize a frequently used instruction, RUN. Additionally, you'll learn how to reduce the final image size and use tools that help you improve your Dockerfile.
Minimize RUN instruction layers
Several instructions add a layer to the image each time the Docker engine executes them, and each layer adds size to the final image and can affect overall performance. RUN is one of those instructions. To avoid this unwanted growth, you can combine several commands into a single RUN instruction. Let's check this in practice using the following Dockerfile:
FROM ubuntu:22.04
LABEL author=HyperUser
RUN apt-get update -y \
&& apt-get upgrade -y \
&& apt-get install iputils-ping -y \
&& apt-get install net-tools -y
ENTRYPOINT ["/bin/bash"]
Try to avoid the apt-get upgrade command in your own Dockerfiles. It can pull in upgrades that change the environment you rely on.
Let's build an ubuntu:v1 image from this Dockerfile. Then modify the Dockerfile so that each command has its own RUN instruction, and build a second image, ubuntu:v2.
FROM ubuntu:22.04
LABEL author=HyperUser
RUN apt-get update -y
RUN apt-get upgrade -y
RUN apt-get install iputils-ping -y
RUN apt-get install net-tools -y
ENTRYPOINT ["/bin/bash"]
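The build commands themselves aren't shown here. A minimal sketch, assuming the two versions are kept side by side as Dockerfile.v1 (combined RUN) and Dockerfile.v2 (separate RUN instructions); both file names and the build_both helper are placeholders chosen for this illustration:

```shell
# Build both variants from the current directory.
# Dockerfile.v1 and Dockerfile.v2 are assumed file names for the
# combined-RUN and separate-RUN versions of the Dockerfile.
build_both() {
    docker build -f Dockerfile.v1 -t ubuntu:v1 . &&
    docker build -f Dockerfile.v2 -t ubuntu:v2 .
}
```

Calling build_both in the directory that holds both files produces the ubuntu:v1 and ubuntu:v2 tags compared in the rest of this section.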
Now, let's compare the sizes of the two images using the docker images command.
$ docker images --format='{{.Repository}}\t{{.Tag}}\t{{.Size}}' | head -2
ubuntu v2 108MB
ubuntu v1 107MB
In the output above, you can clearly see that the ubuntu:v2 image, built with separate RUN instructions, is bigger.
Now, let's check how many intermediate images Docker created for each of them. The docker images -a output below shows four intermediate images for ubuntu:v2 against two for ubuntu:v1.
$ docker images -a
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu v2 fcadad99c63f 54 seconds ago 108MB
<none> <none> 5c36555d7e52 55 seconds ago 108MB
<none> <none> 4f5e8cb0f6b3 About a minute ago 107MB
<none> <none> 88f340f849f6 About a minute ago 105MB
<none> <none> 9ec7df14ae44 About a minute ago 105MB
ubuntu v1 da09083f331a 25 minutes ago 107MB
<none> <none> 8412855ec387 25 minutes ago 107MB
<none> <none> c88e31913943 26 minutes ago 70.2MB
ubuntu 22.04 d6547859cd2f 6 weeks ago 70.2MB
So, the image with separate RUN instructions isn't just heavier: Docker also created more intermediate images while building it, consuming more resources.
Docker creates intermediate images to speed up the build process. Each of them serves as the base for the next step. If you modify a step, Docker reuses the cached intermediate image from the step before it instead of building the image from scratch.
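Another way to look at the layers is docker history, which lists the layers of a single image together with the instruction that produced each one. This can be handy because, on recent Docker versions that build with BuildKit, docker images -a may not list intermediate images at all. The show_layers helper below is a name made up for this sketch:

```shell
# Print one line per layer: the size it adds and the instruction that
# created it. Pass a tag built earlier, e.g. ubuntu:v1 or ubuntu:v2.
show_layers() {
    docker history --format '{{.Size}}\t{{.CreatedBy}}' "$1"
}
```

Running show_layers ubuntu:v1 and then show_layers ubuntu:v2 should make the difference visible: one RUN layer in the first image, four in the second.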
Choosing the right base image
The basis for any image is its base image. So far, the examples have used Ubuntu for this purpose. Ubuntu is known for its user-friendly approach, but its image includes many packages that you may not need. If you don't need this particular image, you can switch to another Linux distribution image: the Alpine Linux image. It's built around musl libc and BusyBox and requires less space because it includes a minimal set of packages and tools. Let's create images with minimal configurations using both distributions and compare their sizes. First, let's use Ubuntu:
FROM ubuntu:22.04
LABEL author=HyperUser
ENTRYPOINT ["echo", "Hello, Students."]
After this, let's change the base image to Alpine:
FROM alpine:3.17
LABEL author=HyperUser
ENTRYPOINT ["echo", "Hello, Students."]
Now, let's check what we've got. This is what you'll see when you list your images:
$ docker images --format='{{.Repository}}\t{{.Tag}}\t{{.Size}}' | head -2
alpine v1 7.05MB
ubuntu v1 70.2MB
Besides Alpine, there are also the so-called distroless and slim images. Distroless images contain only an application and its runtime dependencies. They don't provide shells or package managers. Slim images, on the other hand, are OS-based but designed to be as lightweight as possible thanks to various build optimizations. Common examples of distroless and slim images are the images published for different programming languages.
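As an illustration of a slim image, here is a sketch that writes a Dockerfile on top of python:3.12-slim, one of the slim tags of the official Python image. The file name Dockerfile.slim and the choice of Python are assumptions made for this example only:

```shell
# Write a minimal Dockerfile on top of a slim language image.
# Dockerfile.slim is an assumed file name; python:3.12-slim is used
# here only as an example of a slim tag.
cat > Dockerfile.slim <<'EOF'
FROM python:3.12-slim
LABEL author=HyperUser
ENTRYPOINT ["python", "-c", "print('Hello, Students.')"]
EOF
```

After docker build -f Dockerfile.slim -t python-slim:v1 . (the tag is a placeholder), you can compare its size with an image built from the full python:3.12 tag using docker images.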
Removing unnecessary packages
One of the challenges when using Docker is keeping the image as small as possible. A good practice is to remove the apt package lists after installing packages. Let's go back to the Dockerfile that installs the network tools. The ubuntu:v1 image built from it was 107MB. This time, let's use the same Dockerfile but remove the apt lists at the end.
FROM ubuntu:22.04
LABEL author=HyperUser
RUN apt-get update -y \
&& apt-get upgrade -y \
&& apt-get install iputils-ping -y \
&& apt-get install net-tools -y \
&& rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/bin/bash"]
This time, the image size is much smaller:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu v1 1141a699133b About a minute ago 72.2MB
Another important practice is using the --no-install-recommends flag when installing packages. Many Linux distributions, such as Debian or Ubuntu, distinguish between required and recommended packages. When you install a package, the package manager installs not only the packages it strictly depends on but also recommended packages, which aren't mandatory but can be useful. Adding the --no-install-recommends flag tells apt-get to install only the required packages and skip the recommended ones.
FROM ubuntu:22.04
LABEL author=HyperUser
RUN apt-get update -y \
&& apt-get upgrade -y \
&& apt-get install iputils-ping -y --no-install-recommends \
&& apt-get install net-tools -y --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/bin/bash"]
With this flag, apt-get neither downloads nor installs the recommended packages. This saves build time and keeps the image smaller.
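The practices from this section can be combined further. The sketch below writes a variant that installs both packages with a single apt-get install call and leaves out apt-get upgrade, in line with the earlier advice to avoid it. The file name Dockerfile.lean is an assumption, and the resulting image size is not measured here:

```shell
# Write a Dockerfile that uses one RUN layer, one apt-get install call,
# no recommended packages, and removes the apt lists in the same layer.
# Dockerfile.lean is an assumed file name.
cat > Dockerfile.lean <<'EOF'
FROM ubuntu:22.04
LABEL author=HyperUser
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
        iputils-ping \
        net-tools \
 && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/bin/bash"]
EOF
```

The cleanup has to happen in the same RUN instruction as the installation: files deleted in a later layer still take up space in the earlier one.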
Using the exec form
As you might already know, CMD and ENTRYPOINT instructions can be written in two forms: shell and exec. As a reminder, they look like this:
# shell form
CMD echo Hello, Students.
ENTRYPOINT echo Hello, Students.
# exec form
CMD ["echo", "Hello, World."]
ENTRYPOINT ["echo", "Hello, World."]
# exec form, combination of CMD and ENTRYPOINT
ENTRYPOINT ["echo"]
CMD ["Hello"]
Both forms have their advantages and disadvantages. For instance, you can choose the shell form if you need shell processing. If you have an application that must accept different parameters each time you run the container, ENTRYPOINT with exec form can be a good choice.
However, exec is the recommended form. It doesn't start a shell process: Docker invokes the specified command directly with its arguments, without any wrapper process. This gives it a small performance advantage over the shell form. Because the command runs as the container's main process, it also receives signals directly, such as the SIGTERM sent by docker stop. Using this form is preferred from a security perspective as well, since only the specified command is executed.
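To illustrate the case of an application that accepts different parameters on each run, here is a sketch that combines ENTRYPOINT and CMD in exec form around the ping tool installed earlier. The file name Dockerfile.ping and the ping:v1 tag used below are placeholders:

```shell
# Write a Dockerfile whose ENTRYPOINT is fixed and whose CMD supplies a
# default argument that `docker run` can replace.
# Dockerfile.ping is an assumed file name.
cat > Dockerfile.ping <<'EOF'
FROM ubuntu:22.04
RUN apt-get update \
 && apt-get install -y --no-install-recommends iputils-ping \
 && rm -rf /var/lib/apt/lists/*
# exec form: ping is started directly, without a /bin/sh wrapper
ENTRYPOINT ["ping", "-c", "3"]
# default argument, replaced by anything passed after the image name
CMD ["localhost"]
EOF
```

After docker build -f Dockerfile.ping -t ping:v1 ., running docker run --rm ping:v1 pings localhost, while docker run --rm ping:v1 example.com replaces the CMD value and pings example.com instead.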
Using the linter
Creating a Dockerfile by adhering to best practices is a valuable skill. The image's size, build time, and security level depend on how carefully you write your Dockerfile. Fortunately, there is a static analyzer called Hadolint that helps you find issues in your Dockerfile and suggests improvements. You can either download it from the official GitHub repo or use its online version. To check how it works, let's take an earlier Dockerfile sample and try to find its issues.
FROM ubuntu:22.04
LABEL author=HyperUser
RUN apt-get update -y \
&& apt-get upgrade -y \
&& apt-get install iputils-ping -y \
&& apt-get install net-tools -y
ENTRYPOINT ["/bin/bash"]
Here is the result the Hadolint tool shows:
In this result, each message line contains a link with explanations. You can open them and learn more.
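To run the same kind of check from the command line, you can call a locally installed hadolint binary or, if none is installed, pipe the Dockerfile into the hadolint/hadolint container image. The lint_dockerfile helper below is a name invented for this sketch:

```shell
# Lint a Dockerfile with Hadolint. Uses a local hadolint binary when one
# is available, otherwise falls back to the hadolint/hadolint image,
# which reads the Dockerfile from standard input.
lint_dockerfile() {
    if command -v hadolint >/dev/null 2>&1; then
        hadolint "$1"
    else
        docker run --rm -i hadolint/hadolint < "$1"
    fi
}
```

For example, lint_dockerfile Dockerfile checks the Dockerfile in the current directory and prints one line per finding.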
Auditing the image for security vulnerabilities
Auditing Docker images for security vulnerabilities is another essential part of working with images. For this purpose, you can apply many techniques, like checking the base image layers or analyzing the image for unnecessary packages or dependencies. On top of this, Docker provides the docker scout command, which analyzes images and automatically detects issues. Starting with version 4.17, docker scout is installed automatically together with Docker Desktop. To install it manually, you can run the following commands:
curl -fsSL https://raw.githubusercontent.com/docker/scout-cli/main/install.sh -o install-scout.sh
sh install-scout.sh
Next, log in to Docker Hub through Docker Desktop or from the CLI with the docker login -u <username> command. Once you're set up, run docker scout quickview or docker scout cves to scan local or remote images for vulnerabilities. For example, to scan a local image, run docker scout quickview <image name>.
docker scout quickview provides only an overview of vulnerabilities whereas docker scout cves provides a full list of vulnerabilities.
Let's run the docker scout cves debian command to scan the Debian image. It gives a detailed report of all vulnerabilities in the image:
As you can see, the command provides an overview of vulnerabilities and a breakdown of all the affected packages in the Debian image. You can find more information about this tool on the official Docker documentation website.
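The two commands can be combined into a short routine: first the overview, then the detailed list restricted to the most serious findings. The sketch below assumes the --only-severity option of docker scout cves is available in your version of the CLI; scan_image is a helper name made up for this example:

```shell
# Show the vulnerability overview for an image, then list only its
# critical and high severity CVEs.
scan_image() {
    docker scout quickview "$1" &&
    docker scout cves --only-severity critical,high "$1"
}
```

For instance, scan_image debian scans the Debian image, and scan_image ubuntu:v1 scans the image built earlier in this topic.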
Conclusion
In this topic, you learned some best practices for writing a Dockerfile. We discussed reducing the number of layers in the final image, reducing its size by choosing a lighter base image and removing unnecessary packages, and the different image types. We also looked at two tools, Hadolint and docker scout, that you can use to analyze Dockerfiles and images and find issues. Now, let's apply this knowledge in practice and improve the quality of your Docker images.