One of the key objectives addressed by Kubernetes clusters is the efficient utilization of resources. All resources, such as memory and CPU, are limited and may incur costs. Therefore, Kubernetes introduces essential concepts such as "requests" and "limits", which you will explore in the current topic.
Resource management for pods and containers
Every pod is allocated a specific amount of CPU and memory resources that are utilized by the containers running inside the pod. Since the nodes in the cluster possess limited resources such as CPU and memory, only a limited number of pods can be deployed in the cluster. The Kubernetes scheduler determines the ideal node to place a pod based on available resources. To effectively schedule pods, the scheduler requires information about the resource demands for each pod.
In Kubernetes, configuring resources (such as CPU, memory, and storage) for containers in a pod involves specifying two fields: requests and limits. Resource configuration is defined at the container level; you can see it in the container specification section of the pod definition file, as shown in the example below.
The request field specifies the minimum amount of resources that a container needs to run. Kube-scheduler uses this information to select which node in the cluster to place the pod on so that it can satisfy all the resource requests of that pod.
The limit field specifies the maximum amount of resources that a container is allowed to use. With this information, the kubelet ensures that a running container doesn't exceed the specified resource limit. It is crucial to note that the limit can never be lower than the request.
Once a pod is scheduled for deployment on a node in a cluster, the kubelet reserves at least the requested amount of that node's resources specifically for that container to use.
CPU resources are measured in CPU units. In Kubernetes, 1 CPU unit is equivalent to 1 physical CPU core, or 1 virtual core in a virtualized environment. Fractional requests are also allowed. For example, a request of 0.5 is equivalent to 500m, read as 500 millicores.
Memory resources are defined in bytes. Normally, memory is specified in mebibytes using the Mi suffix (for example, 64Mi), but you can use anything from plain bytes up to petabytes.
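To make these units concrete, here is a small illustrative sketch (not Kubernetes code) that converts the quantity strings used in pod specs into plain numbers. It is deliberately simplified and handles only the forms mentioned in this topic, not the full Kubernetes quantity grammar:

```python
# Simplified converters for the resource quantity strings used in pod
# specs. This is an illustration only, not the Kubernetes parser.

def parse_cpu(quantity: str) -> float:
    """Return CPU units as a float: '500m' -> 0.5, '2' -> 2.0."""
    if quantity.endswith("m"):          # millicores: 1000m == 1 CPU unit
        return int(quantity[:-1]) / 1000
    return float(quantity)

# Binary (power-of-two) suffixes; Kubernetes also accepts decimal ones.
MEMORY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}

def parse_memory(quantity: str) -> int:
    """Return memory in bytes: '64Mi' -> 67108864."""
    for suffix, factor in MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)                # plain bytes

print(parse_cpu("500m"))     # 0.5
print(parse_memory("64Mi"))  # 67108864
```

So a request of cpu: "500m" is half a CPU unit, and memory: "64Mi" is 64 × 1024 × 1024 bytes.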
Resource configuration example
Take a look at the following configuration file as an example:
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: container1
    image: app1
    resources:
      requests:
        memory: "32Mi"
        cpu: "200m"
      limits:
        memory: "64Mi"
        cpu: "250m"
  - name: container2
    image: app2
    resources:
      requests:
        memory: "96Mi"
        cpu: "300m"
      limits:
        memory: "192Mi"
        cpu: "750m"
You can see that there are two containers in the pod: container1 and container2. Each container in the pod can set its own resource configuration (i.e., its own resource requests and limits).
The CPU resource request of a pod is the sum of CPU resource requests of all the containers inside that pod. Similarly, the memory resource request of a pod is the sum of memory resource requests of all the containers inside that pod. The resource limits for the pod are calculated in the same way.
So in the example, the pod has a total request of 200m + 300m = 500m CPU and 32Mi + 96Mi = 128Mi of memory, and a total limit of 250m + 750m = 1000m (1 CPU unit) and 64Mi + 192Mi = 256Mi of memory.
Pod scheduling
The Kubernetes scheduler goes through the list of available nodes in the Kubernetes cluster to find a node to run the pod. For each candidate node, it checks whether the node has sufficient free resources to meet the pod's resource requests; nodes that do not are filtered out. If none of the nodes have the necessary resources, the pod enters the Pending state. It is crucial to note that Kubernetes uses only resource requests (not resource limits) to decide which node to deploy the pod to.
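The resource check described above can be modeled as a toy filter. This is a deliberate simplification with made-up node data (the real scheduler also scores the feasible nodes before choosing one); the point is that only requests matter, and that a pod with no fitting node stays Pending:

```python
# Toy model of the scheduler's resource check. Node capacities are
# hypothetical; CPU is in millicores, memory in MiB.
nodes = {
    "node-a": {"cpu_m": 300,  "mem_mi": 512},
    "node-b": {"cpu_m": 1000, "mem_mi": 256},
}
# Total *requests* of the example pod (limits are ignored here,
# just as the real scheduler ignores them).
pod_requests = {"cpu_m": 500, "mem_mi": 128}

def schedule(pod: dict, nodes: dict) -> str:
    """Return the first node whose free resources cover the requests."""
    for name, free in nodes.items():
        if free["cpu_m"] >= pod["cpu_m"] and free["mem_mi"] >= pod["mem_mi"]:
            return name
    return "Pending"  # no node can satisfy the requests

print(schedule(pod_requests, nodes))  # node-b (node-a lacks CPU)
```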
As specifying resource requests for pods is optional, how does Kubernetes schedule a pod when no resource request is specified for any of its containers? In this scenario, there are no requests to satisfy, so the pod can be placed on any available node with free capacity. Additionally, Kubernetes assigns this pod to the BestEffort Quality of Service class.
"What is BestEffort Quality of Service class?", you may ask. So let's discuss QoS classes in detail.
Quality of Service classes
Sometimes, an application running in a container tries to exceed its CPU limit. In that case, Kubernetes throttles the container's CPU usage, which can degrade the performance of the app, but it will not terminate or remove the container. On the other hand, when a container uses more memory than its specified memory limit, Kubernetes terminates the container with an "Out of memory" error, or OOM for short. Depending on the pod's restart policy, the container may then be restarted in place, or the pod may be terminated and replaced by a new one.
Kubernetes uses Quality of Service (QoS) classes to determine which pods to prioritize or evict when there is resource contention or node pressure. The QoS class of a pod is determined by the resource requests and limits defined in the pod definition.
There are three QoS classes in Kubernetes: Guaranteed, Burstable, and BestEffort.
Pods in the Guaranteed class have both resource requests and limits specified for each of the containers in that pod. Also, for every container in that pod, resource limits must be equal to the resource requests for each of the resources (CPU and memory). These pods have the highest priority which means they are the last ones to be evicted when there is resource contention. They are commonly used for critical production workloads.
Pods in the Burstable class have at least one container with a resource request or limit specified, but they do not meet the criteria for the Guaranteed class. Limits are not required, which gives these pods some flexibility as long as resources are available, and limits may be set higher than requests. Since pods in this class may include containers with no limits specified, they may try to use any amount of node resources. Thus, when there is node contention (and all BestEffort pods have already been evicted), these pods are next in line for termination and removal. Pods in this class have medium priority and are commonly used for applications that can tolerate occasional resource constraints.
Pods in the BestEffort class have containers that do not have any resource requests or limits specified. Pods in this class can consume any amount of node resources. However, they have the lowest priority. This means they are the first to be evicted when there is node contention. They are commonly used for non-critical workloads or background tasks.
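The three rules above can be sketched as a small classifier. This is a simplification of the exact rules in the Kubernetes documentation (each container here is a dict with optional "requests" and "limits" maps), intended only to show how the classes relate:

```python
# Simplified QoS classification, following the rules described above.

def qos_class(containers: list[dict]) -> str:
    # Guaranteed: every container sets requests and limits for both
    # CPU and memory, and requests equal limits.
    if all(
        c.get("requests") and c.get("limits")
        and c["requests"] == c["limits"]
        and set(c["requests"]) == {"cpu", "memory"}
        for c in containers
    ):
        return "Guaranteed"
    # Burstable: at least one container sets a request or limit.
    if any(c.get("requests") or c.get("limits") for c in containers):
        return "Burstable"
    # BestEffort: no requests or limits anywhere in the pod.
    return "BestEffort"

print(qos_class([{"requests": {"cpu": "250m", "memory": "64Mi"},
                  "limits":   {"cpu": "250m", "memory": "64Mi"}}]))  # Guaranteed
print(qos_class([{"requests": {"cpu": "200m"}}, {}]))                # Burstable
print(qos_class([{}, {}]))                                           # BestEffort
```

On a real cluster, you can see the class Kubernetes assigned with kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'.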
It is important to properly set the QoS class of a pod so that:
Critical workloads receive the necessary resources;
Non-critical workloads do not consume all of the resources or impact the overall cluster performance.
Conclusion
To summarize the topic in short:
Configuring resources for containers in pods is crucial for the proper management of resources in a Kubernetes cluster.
Setting up resources for containers in a pod means defining resource requests (needs) and limits (boundaries).
Kubernetes reserves node resources based on the requests and enforces limits to prevent containers from exceeding their resource limits.
Quality of Service (QoS) classes are classifications that Kubernetes assigns to pods depending on their resource configuration.
Kubernetes uses QoS classes to determine which pods to prioritize or evict when there is resource contention.