1. Specify the node with nodeSelector when creating a workload
- Add a label to the node

```bash
kubectl label node node2 project=A
```
- Specify the nodeSelector to create the workload

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-nodeselector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-nodeselector
  template:
    metadata:
      labels:
        app: nginx-nodeselector
    spec:
      nodeSelector:
        project: A
      containers:
      - name: nginx
        image: nginx
EOF
```
- View the workload

As expected, the Pod is running on the specified node, node2.
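You can check this with, for example:

```bash
# Show which node each Pod was scheduled to
kubectl get pod -o wide
```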
- Clean up the environment
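A sketch of the cleanup, assuming the only resources created in this step are the Deployment and the node label:

```bash
# Delete the test Deployment and remove the label added earlier
kubectl delete deployment nginx-nodeselector
kubectl label node node2 project-
```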
In fact, there is another node selection field, `nodeName`, which specifies the node name directly. However, this setting is too rigid: it bypasses the Kubernetes scheduler entirely and is rarely used in production.
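For illustration, a minimal Pod spec using `nodeName` might look like the following (the Pod name and the target node are placeholders for this example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-nodename   # illustrative name
spec:
  nodeName: node2         # pins the Pod to node2, bypassing the scheduler
  containers:
  - name: nginx
    image: nginx
```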
2. Bind namespaces to nodes via access control
Specifying nodeSelector when creating a workload lets you choose which nodes its Pods run on. However, it cannot by itself bind all Pods under a namespace to a given set of nodes. This can be done with kube-apiserver admission control, a feature that entered alpha in Kubernetes 1.5.
2.1 Modifying kube-apiserver parameters
Edit the kube-apiserver file:
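On a kubeadm-based cluster, kube-apiserver runs as a static Pod, so the file to edit would typically be its manifest (the path below is the kubeadm default and may differ in your setup):

```bash
vim /etc/kubernetes/manifests/kube-apiserver.yaml
```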
Add `PodNodeSelector` to `--enable-admission-plugins`:
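In the manifest this is the `--enable-admission-plugins` flag on the kube-apiserver command; the exact plugin list in your cluster may differ, the point is to append `PodNodeSelector`:

```yaml
spec:
  containers:
  - command:
    - kube-apiserver
    # ... other flags unchanged ...
    - --enable-admission-plugins=NodeRestriction,PodNodeSelector
```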
Here `NodeRestriction` is enabled by default. In a highly available cluster, you need to modify every kube-apiserver instance and then wait a while for each of them to finish restarting.
2.2 Adding an annotation to the namespace
Edit the namespace and add annotations:
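For example, edit the default namespace (`kubectl edit ns default`) and add the annotation under `metadata`; the `project=A` value here matches the label applied in the next step:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: default
  annotations:
    # Pods created in this namespace get this node selector merged in
    scheduler.alpha.kubernetes.io/node-selector: project=A
```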
The value of `scheduler.alpha.kubernetes.io/node-selector` is a label selector, i.e. one or more key-value pairs; to target a single node you can use a node-identifying label such as `kubernetes.io/hostname`.
2.3 Adding a specified label to a node
Label node node3 with `project=A`.
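The command, matching the cleanup step later in this section:

```bash
kubectl label node node3 project=A
```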
At this point, workloads in the default namespace are bound to node node3.
2.4 Creating workloads
- Create a workload for testing

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-scheduler
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-scheduler
  template:
    metadata:
      labels:
        app: nginx-scheduler
    spec:
      containers:
      - name: nginx
        image: nginx
EOF
```
- View the workload distribution

```bash
kubectl get pod -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
nginx-scheduler-6478998698-brkzn   1/1     Running   0          84s   10.233.92.52   node3   <none>           <none>
nginx-scheduler-6478998698-m422x   1/1     Running   0          84s   10.233.92.51   node3   <none>           <none>
nginx-scheduler-6478998698-mnf4d   1/1     Running   0          84s   10.233.92.50   node3   <none>           <none>
```
As you can see, although the cluster has 4 available nodes, all Pods in the default namespace are running on node node3.
2.5 Cleaning up the environment
- Clean up the label

```bash
kubectl label node node3 project-
```
- Delete the workload

```bash
kubectl delete deployments nginx-scheduler
```
- Remove the annotation

```bash
kubectl edit ns default
```
Note that if the namespace has the `scheduler.alpha.kubernetes.io/node-selector` annotation set and no node carries the matching label, Pods will stay in the Pending state and will not be scheduled until a node with the matching label becomes available.
3. Grouping nodes using topology domains
As shown in the figure below, with the kube-apiserver admission plugin we can build a model with one namespace per project, where each namespace contains a specified set of nodes. This meets the requirements of business isolation and cost accounting. However, as the cluster grows, a project may also need to divide the cluster into several availability zones to safeguard business availability.
Topology domains mainly address how Pods are distributed across the cluster, and can also be used to steer Pods toward specific nodes. The topology spread constraints feature of the Kubernetes scheduler entered Alpha in 1.16 and Beta in 1.18. Let's run some experiments:
- Divide nodes into different topology domains

Here, node2 is assigned to zone a, and node3 and node4 are assigned to zone b.

```bash
kubectl label node node2 zone=a
kubectl label node node3 node4 zone=b
```
- Create the workload

```bash
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-topology
spec:
  replicas: 20
  selector:
    matchLabels:
      app: nginx-topology
  template:
    metadata:
      labels:
        app: nginx-topology
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: nginx-topology
      containers:
      - name: nginx
        image: nginx
EOF
```
Here, `topologyKey` specifies the label key used to divide nodes into topology domains, `maxSkew: 1` means the difference in the number of Pods between zone=a and zone=b cannot exceed 1, and `whenUnsatisfiable: DoNotSchedule` means Pods are not scheduled when the constraint cannot be satisfied.

- View the Pod distribution
```bash
kubectl get pod -o wide
NAME                              READY   STATUS    RESTARTS   AGE    IP              NODE    NOMINATED NODE   READINESS GATES
nginx-topology-7d8698544d-2srcj   1/1     Running   0          3m3s   10.233.92.63    node3   <none>           <none>
nginx-topology-7d8698544d-2wxkp   1/1     Running   0          3m3s   10.233.96.53    node2   <none>           <none>
nginx-topology-7d8698544d-4db5b   1/1     Running   0          3m3s   10.233.105.43   node4   <none>           <none>
nginx-topology-7d8698544d-9tqvn   1/1     Running   0          3m3s   10.233.96.58    node2   <none>           <none>
nginx-topology-7d8698544d-9zll5   1/1     Running   0          3m3s   10.233.105.45   node4   <none>           <none>
nginx-topology-7d8698544d-d6nbm   1/1     Running   0          3m3s   10.233.105.44   node4   <none>           <none>
nginx-topology-7d8698544d-f4nw9   1/1     Running   0          3m3s   10.233.96.54    node2   <none>           <none>
nginx-topology-7d8698544d-ggfgv   1/1     Running   0          3m3s   10.233.92.66    node3   <none>           <none>
nginx-topology-7d8698544d-gj4pg   1/1     Running   0          3m3s   10.233.92.61    node3   <none>           <none>
nginx-topology-7d8698544d-jc2xt   1/1     Running   0          3m3s   10.233.92.62    node3   <none>           <none>
nginx-topology-7d8698544d-jmmcx   1/1     Running   0          3m3s   10.233.96.56    node2   <none>           <none>
nginx-topology-7d8698544d-l45qj   1/1     Running   0          3m3s   10.233.92.65    node3   <none>           <none>
nginx-topology-7d8698544d-lwp8m   1/1     Running   0          3m3s   10.233.92.64    node3   <none>           <none>
nginx-topology-7d8698544d-m65rx   1/1     Running   0          3m3s   10.233.96.57    node2   <none>           <none>
nginx-topology-7d8698544d-pzrzs   1/1     Running   0          3m3s   10.233.96.55    node2   <none>           <none>
nginx-topology-7d8698544d-tslxx   1/1     Running   0          3m3s   10.233.92.60    node3   <none>           <none>
nginx-topology-7d8698544d-v4cqx   1/1     Running   0          3m3s   10.233.96.50    node2   <none>           <none>
nginx-topology-7d8698544d-w4r86   1/1     Running   0          3m3s   10.233.96.52    node2   <none>           <none>
nginx-topology-7d8698544d-wwn95   1/1     Running   0          3m3s   10.233.96.51    node2   <none>           <none>
nginx-topology-7d8698544d-xffpx   1/1     Running   0          3m3s   10.233.96.59    node2   <none>           <none>
```
Node2 runs 10 Pods, node3 runs 7, and node4 runs 3. Since node2 belongs to zone=a (10 Pods) and node3 and node4 belong to zone=b (7 + 3 = 10 Pods), the Pods are evenly distributed across zone=a and zone=b.
- Clean up the environment
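A sketch of the cleanup, assuming the only resources created in this section are the Deployment and the zone labels:

```bash
# Delete the test Deployment and remove the zone labels
kubectl delete deployment nginx-topology
kubectl label node node2 node3 node4 zone-
```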
4. Summary
As clusters grow, issues such as isolation between businesses and a business's exclusive use of nodes surface. Usually each business has its own namespace, so we can bind namespaces to nodes.
This article presents two methods. One is to set nodeSelector directly when creating workloads; a handy trick is to use the namespace name as the label value. The other is to use the admission control plugin provided by kube-apiserver: an annotation on the namespace filters nodes by label whenever workloads are created in that namespace, completing the binding between namespaces and nodes.
Going further, if the number of nodes is very large and we need to divide them into availability zones to spread the load, we can do so with the help of topology domains. With topology spread constraints, workloads can be evenly distributed across the specified availability zones and racks according to the configured policies.