Cordoning Nodes and Draining Pods
-
Follow this procedure when maintenance is scheduled to be carried out on an OpenShift node.
Steps
-
Connect to the os-control01 host associated with this ENV. Become root:
sudo su -
-
Mark the node as unschedulable:
nodes=$(oc get nodes -o name | sed -E "s/node\///")
echo $nodes
for node in ${nodes[@]}; do oc adm cordon $node; done
node/<node> cordoned
-
Check that the node status is NotReady,SchedulingDisabled:
oc get node <node1>
NAME      STATUS                        ROLES    AGE   VERSION
<node1>   NotReady,SchedulingDisabled   worker   1d    v1.18.3
Note: It might not switch to NotReady immediately; there may be many pods still running.
-
Evacuate the pods from the worker node. The following command drains node <node1>, deletes any local (emptyDir) data, ignores daemonsets, and gives pods a grace period of 15 seconds to terminate gracefully:
oc adm drain <node1> --delete-emptydir-data=true --ignore-daemonsets=true --grace-period=15
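To confirm that the drain has finished, one option is to list the pods still scheduled on the node. The sketch below is only an illustration: pods_on_node is a hypothetical helper name, not part of oc, and it assumes the account in use may list pods in all namespaces. Because the drain ignores daemonsets, pods managed by a DaemonSet are expected to remain in the listing.

```shell
# Sketch: list the pods still scheduled on a node after a drain.
# pods_on_node is a hypothetical helper name, not an oc subcommand.
pods_on_node() {
    node="$1"
    # spec.nodeName is a field selector supported for pods.
    oc get pods --all-namespaces -o wide --field-selector "spec.nodeName=${node}"
}
```

Calling pods_on_node <node1> after the drain should show only DaemonSet-managed pods if the evacuation succeeded.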
-
Perform the scheduled maintenance on the node. Do whatever is required in the scheduled maintenance window.
-
Once the node is ready to be added back into the cluster, uncordon it. This marks the node as schedulable once more:
nodes=$(oc get nodes -o name | sed -E "s/node\///")
echo $nodes
for node in ${nodes[@]}; do oc adm uncordon $node; done
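As a quick check that a node has actually returned to a schedulable state, its spec.unschedulable field can be inspected. The following is a minimal sketch under that assumption; is_cordoned is a hypothetical helper name introduced here for illustration.

```shell
# Sketch: report whether a node is currently cordoned.
# is_cordoned is a hypothetical helper name, not an oc subcommand.
is_cordoned() {
    node="$1"
    # .spec.unschedulable is "true" on a cordoned node and unset otherwise,
    # in which case the jsonpath query prints nothing.
    state=$(oc get node "$node" -o jsonpath='{.spec.unschedulable}')
    [ "$state" = "true" ]
}
```

For example, `if is_cordoned <node1>; then echo "<node1> is still cordoned"; fi` prints a message only while the node remains unschedulable.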
Resources
-
[1] [Nodes - working with nodes](https://docs.openshift.com/container-platform/4.8/nodes/nodes/nodes-nodes-working.html)