Scheduling Pods with Node Selectors, Node Affinity and Tolerations
Kubernetes offers multiple strategies for controlling pod scheduling onto specific nodes. This document outlines the three main approaches: node selectors, node affinity, and tolerations. Understanding these concepts is crucial for optimizing pod placement and ensuring your workloads run on the most appropriate nodes in your cluster.
Option 1 - Node Selector
Node selector provides the simplest way to constrain pods to nodes with specific labels. It works by matching one or more key-value pairs; every label listed under nodeSelector must be present on the node for the pod to be scheduled there.
To add a label with key name and value node1 to a node, say node1, run:
kubectl label nodes node1 name=node1
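To verify that the label has been applied, you can list the labels on the node:
kubectl get nodes node1 --show-labels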
A common convention is to prefix the key with the organization or project name, followed by a descriptive name, for example digital.ai/name=node1. The kubernetes.io/ and k8s.io/ prefixes are reserved for Kubernetes system components; for example, the kubernetes.io/hostname label reflects the node's hostname. It is not recommended to modify default labels such as kubernetes.io/hostname, since they are managed by Kubernetes and any manual changes could cause unexpected behavior.
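For example, applying a label with an organization prefix would look like this:
kubectl label nodes node1 digital.ai/name=node1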
Example:
nodeSelector:
  name: "node1"
When to Use:
Use node selector when you have a simple requirement to schedule pods onto a specific set of nodes based on a single label. The node labels must already be set at the cluster level before they can be referenced in the Kubernetes manifests.
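For context, nodeSelector is set at the pod spec level. A minimal Pod manifest using the label above could look like the following sketch (the pod name and image are placeholders, not part of any Deploy configuration):
apiVersion: v1
kind: Pod
metadata:
  name: example-pod        # placeholder name for illustration
spec:
  containers:
  - name: app              # placeholder container
    image: nginx           # placeholder image
  nodeSelector:
    name: "node1"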
Option 2 - Node Affinity
Node affinity provides a more advanced and flexible way of controlling pod placement. It supports:
- requiredDuringSchedulingIgnoredDuringExecution: Pods must be placed on nodes matching the specified criteria.
- preferredDuringSchedulingIgnoredDuringExecution: Pods prefer nodes matching the criteria, but scheduling does not fail if no such node is available.
Example:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - "node1"
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: region
          operator: In
          values:
          - "us-east-1"
When to Use:
Use node affinity for complex scheduling requirements, such as matching multiple labels or expressing preferences for nodes.
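As a sketch of a multi-label requirement, the snippet below schedules pods only onto nodes that match both expressions. It assumes the region and disktype labels have already been applied to the nodes; disktype is purely illustrative.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        # both expressions must match for a node to be eligible
        - key: region
          operator: In
          values:
          - "us-east-1"
        - key: disktype      # hypothetical label, for illustration only
          operator: In
          values:
          - "ssd"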
Option 3 - Tolerations
Tolerations allow pods to run on nodes with specific matching taints; the pods "tolerate" the taints applied to those nodes. In other words, a pod with the toleration below is allowed to be scheduled on nodes that have the taint key1=value1:NoSchedule.
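Such a taint would typically be applied to a node with:
kubectl taint nodes node1 key1=value1:NoSchedule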
Example:
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
When to Use:
Use taints on nodes to allow only specific pods to be scheduled on them, for example nodes with specialized hardware or nodes undergoing a maintenance window.
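For example, a node reserved for dedicated hardware could be tainted as follows (the dedicated=gpu key and value are purely illustrative):
# illustrative taint; use a key/value pair that matches your environment
kubectl taint nodes node1 dedicated=gpu:NoSchedule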
Example Configuration in CR.YAML
Here's how you can configure node selectors, node affinity and tolerations for Deploy master:
spec:
  master:
    nodeSelector:
      name: "node1"
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - "node1"
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
            - key: region
              operator: In
              values:
              - "us-east-1"
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
Similarly, you can specify the node selectors, affinity and tolerations for other components such as:
- worker
- centralConfiguration
- postgresql
- rabbitmq
- nginx-ingress-controller
- haproxy-ingress
Typically, only one of these methods is used, although all of them can be used together for more fine-grained control over pod scheduling.
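For example, assuming the worker block follows the same structure as master, a node selector for the workers could be specified with a sketch like this:
spec:
  worker:
    # assumes the worker block supports the same scheduling fields as master
    nodeSelector:
      name: "node1"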