Please see the official documentation for more information; this SOP can be used as a rough guide.
In case an upgrade fails, it is wise to first take an etcd backup. To do so, follow the etcd backup SOP.
Ensure that all installed Operators are at the latest versions for their channel.
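The Operator check above can also be done from the command line. The following is a minimal sketch, assuming a logged-in `oc` session with cluster-admin rights and Operators managed through the Operator Lifecycle Manager. The function name `list_operator_versions` is made up for illustration and is not part of this SOP.

```shell
# Hypothetical helper (not part of the SOP): list every Operator
# subscription with its channel, the CSV that is installed, and the
# CSV the channel currently points to.
# Assumes `oc` is already logged in with sufficient privileges.
list_operator_versions() {
    oc get subscriptions.operators.coreos.com --all-namespaces \
        -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,CHANNEL:.spec.channel,INSTALLED:.status.installedCSV,CURRENT:.status.currentCSV'
}

# Usage:
#   list_operator_versions
```

If the INSTALLED and CURRENT columns differ for a subscription, that Operator most likely still has a pending update.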
Ensure that the latest oc client rpm is available on the batcave01 server. Retrieve the RPM from  choose the Openshift Clients Binary rpm. Rename rpm to
Ensure that the sudo rbac-playbook manual/ocp4-sysadmin-openshift.yml -t "upgrade-rpm" playbook is run to install this updated oc client rpm.
At the time of writing, the version installed on the cluster is 4.8.11 and the upgrade channel is set to stable-4.8. It is easiest to update the cluster via the web console. Go to:
To upgrade between patch versions (x.y.z), click the update button when one is available.
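For reference, a patch upgrade can also be inspected and started from the command line instead of the web console. This is a rough sketch, assuming a logged-in `oc` session with cluster-admin rights. The function names are illustrative only.

```shell
# Hypothetical helpers (not part of the SOP).

# Print the desired cluster version and the configured upgrade channel,
# then let `oc adm upgrade` report the current status and any
# available updates.
show_upgrade_status() {
    oc get clusterversion version \
        -o jsonpath='{.status.desired.version}{"\n"}{.spec.channel}{"\n"}'
    oc adm upgrade
}

# Start an upgrade to the newest version offered in the current channel.
start_latest_upgrade() {
    oc adm upgrade --to-latest=true
}

# Usage:
#   show_upgrade_status
#   start_latest_upgrade
```

Running `show_upgrade_status` first gives a chance to confirm the channel and target version before anything is changed.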
When moving between minor versions, you must first switch the upgrade channel, to fast-4.9 as an example. You should also be on the very latest patch version before upgrading.
When the upgrade has finished, switch back to the upgrade channel for stable.
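The channel switch described above can be sketched on the command line as follows, assuming a logged-in `oc` session with cluster-admin rights. Both function names are hypothetical. Patching the `channel` field of the ClusterVersion resource is used here as one possible approach; the web console remains the route this SOP describes.

```shell
# Hypothetical helpers (not part of the SOP).

# Build a channel name such as "fast-4.9" from a channel kind
# (fast, stable, ...) and an x.y or x.y.z version string.
channel_for() {
    kind="$1"
    version="$2"
    # Keep only the major.minor part of the version.
    minor="$(printf '%s' "$version" | cut -d. -f1,2)"
    printf '%s-%s\n' "$kind" "$minor"
}

# Set the upgrade channel on the ClusterVersion resource.
set_upgrade_channel() {
    oc patch clusterversion version --type merge \
        -p "{\"spec\":{\"channel\":\"$1\"}}"
}

# Usage, moving from 4.8 to 4.9 and back to stable afterwards:
#   set_upgrade_channel "$(channel_for fast 4.9)"
#   ... run the upgrade and wait for it to finish ...
#   set_upgrade_channel "$(channel_for stable 4.9)"
```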
In the worst-case scenario we may have to restore etcd from the backups taken at the start, or reinstall a node entirely.
There are many possible ways an upgrade can fail midway through.
Check the monitoring alerts currently firing; these can often hint at the problem.
Often individual nodes are failing to take the new MachineConfig changes and will show up when examining the
Might require a manual reboot of that particular node
Might require killing pods on that particular node
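A few read-only checks can help narrow down where an upgrade is stuck. The sketch below assumes a logged-in `oc` session with cluster-admin rights, and the function names are hypothetical. The text above does not say which resource to examine for nodes failing to take MachineConfig changes; listing the MachineConfigPools is an assumption about one plausible place to look.

```shell
# Hypothetical helpers (not part of the SOP).

# Read-only overview of upgrade progress and node state.
upgrade_health() {
    oc get clusterversion        # overall upgrade status and progress message
    oc get clusteroperators      # operators that are degraded or still progressing
    oc get machineconfigpools    # assumption: pools that are not updated or are degraded
    oc get nodes -o wide         # nodes that are NotReady or unschedulable
}

# Reboot a single node through a debug pod. This is disruptive, so
# use it only once the node has been identified as the problem.
reboot_node() {
    oc debug "node/$1" -- chroot /host systemctl reboot
}

# Usage:
#   upgrade_health
#   reboot_node <node-name>
```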