Fedora Infrastructure Nagios

Contact Information


Owner: sysadmin-main, sysadmin-noc

Contact: #fedora-admin, #fedora-noc

Servers: noc01, noc02, noc01.stg, batcave01


This SOP describes the Nagios configurations.


The Fedora Project runs two Nagios instances: nagios (noc01) at https://admin.fedoraproject.org/nagios and nagios-external (noc02) at https://nagios-external.fedoraproject.org/nagios. You must be in the 'sysadmin' group to access them.

Apart from the two production instances, we currently run a staging instance for testing purposes, available through SSH at noc01.stg.

nagios (noc01)

The Nagios configuration on noc01 should only monitor general host statistics: Ansible status, uptime, Apache status (up/down), SSH, etc. The configurations are found in the nagios Ansible roles.

nagios-external (noc02)

noc02 is located outside of our main datacenter, and its Nagios configuration should monitor our user-facing websites and applications (fedoraproject.org, FAS, PackageDB, Bodhi/Updates). The configurations are found in the nagios Ansible roles.

To access the production and staging instances through SSH, make sure you are in the 'sysadmin' and 'sysadmin-noc' FAS groups first.


We currently use NRPE to execute Nagios plugins remotely on any host in our network.

A good guide to NRPE and its usage, with diagrams of its structure, can be found at: https://assets.nagios.com/downloads/nagioscore/docs/nrpe/NRPE.pdf
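As a rough illustration of how a remote check is triggered, the sketch below wraps the check_nrpe plugin, which takes the target host with -H and the name of a command defined in that host's NRPE configuration with -c. The plugin path, the host name, and the check_load command name are assumptions for the example, not values taken from our configuration.

```python
import subprocess

# Assumed install location of the check_nrpe plugin; adjust for your host.
CHECK_NRPE = "/usr/lib64/nagios/plugins/check_nrpe"


def build_nrpe_command(host, command, check_nrpe=CHECK_NRPE):
    """Build the argument list for running one NRPE command on a host."""
    return [check_nrpe, "-H", host, "-c", command]


def run_nrpe_check(host, command):
    """Run the check and return (exit_code, first line of plugin output).

    Nagios plugins exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).
    """
    result = subprocess.run(
        build_nrpe_command(host, command),
        capture_output=True,
        text=True,
    )
    lines = result.stdout.strip().splitlines()
    return result.returncode, lines[0] if lines else ""
```

Calling run_nrpe_check("host01.example.org", "check_load") from a monitoring host would then return the plugin's exit code and its status line, provided the remote NRPE daemon allows connections from that host.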

Understanding the Messages


Nagios notifications are generally easy to read, and follow this consistent format:

** HOST DOWN/UP alert - hostname **

Reading the message will provide extra information on what is wrong.
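For anyone filtering or routing these notifications with a script, the format above can be matched with a small regular expression. This is a minimal sketch that assumes the subject line follows the format exactly as written above; the host name in the usage note is hypothetical.

```python
import re

# Matches the notification format shown above: ** HOST DOWN/UP alert - hostname **
ALERT_RE = re.compile(r"^\*\* HOST (?P<state>DOWN|UP) alert - (?P<host>\S+) \*\*$")


def parse_host_alert(subject):
    """Return (hostname, state) for a host alert, or None if it does not match."""
    match = ALERT_RE.match(subject.strip())
    if match is None:
        return None
    return match.group("host"), match.group("state")
```

For example, parse_host_alert("** HOST DOWN alert - host01.example.org **") yields the host name together with the state DOWN, while any other text yields None.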

Disk Space Warning/Critical

Disk space warnings normally include the following information:

DISK WARNING/CRITICAL/OK - free space: mountpoint freespace(MB) (freespace(%) inode=freeinodes(%)):

A message stating "(1% inode=99%)" means that only 1% of the disk space is free while 99% of the inodes are still free. In that case the disk space is critical, not the inode usage, and more disk space is required.
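The fields of a disk space message can be pulled apart as in the sketch below, which also reports whether disk space or inodes is the tighter resource. It assumes the message follows the format shown above with the free space given as a whole number of MB; the sample message in the usage note is made up for illustration.

```python
import re

# Follows the format above:
# DISK <state> - free space: <mountpoint> <free> MB (<free>% inode=<free inodes>%):
DISK_RE = re.compile(
    r"DISK (?P<state>WARNING|CRITICAL|OK) - free space: "
    r"(?P<mount>\S+) (?P<free_mb>\d+) ?MB "
    r"\((?P<free_pct>\d+(?:\.\d+)?)% inode=(?P<inode_pct>\d+(?:\.\d+)?)%\)"
)


def parse_disk_message(message):
    """Return the fields of a disk space message as a dict, or None if it does not match."""
    match = DISK_RE.search(message)
    if match is None:
        return None
    return {
        "state": match.group("state"),
        "mountpoint": match.group("mount"),
        "free_mb": int(match.group("free_mb")),
        "free_percent": float(match.group("free_pct")),
        "inode_free_percent": float(match.group("inode_pct")),
    }


def limiting_resource(parsed):
    """Name the resource with the smaller percentage still free."""
    if parsed["free_percent"] <= parsed["inode_free_percent"]:
        return "disk space"
    return "inodes"
```

Given the made-up message "DISK CRITICAL - free space: / 120 MB (1% inode=99%):", parse_disk_message reports the mountpoint / with 120 MB free, and limiting_resource names disk space as the problem.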

Oncall Handling

Anyone who is currently oncall should be able to acknowledge alerts and hosts in Nagios. Therefore, their username should be added to these lines in roles/nagios_server/templates/nagios/configs/cgi.cfg.j2:

* authorized_for_system_commands
* authorized_for_all_service_commands
* authorized_for_all_host_commands

It is fine for past oncalls to keep these permissions, so no additional change is needed at the end of their oncall week.
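Before an oncall week starts, it can help to check that the username is present on all three lines. The following is a minimal sketch, assuming each directive appears in the template as a plain key=user1,user2 line, which is the usual cgi.cfg syntax; a line built from Jinja2 expressions would need different handling. The sample excerpt and usernames are hypothetical.

```python
# The three directives an oncall person must be listed in.
ONCALL_DIRECTIVES = (
    "authorized_for_system_commands",
    "authorized_for_all_service_commands",
    "authorized_for_all_host_commands",
)


def missing_directives(cgi_cfg_text, username):
    """Return the oncall directives that do not list the given username."""
    listed = {}
    for line in cgi_cfg_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in ONCALL_DIRECTIVES:
            users = {user.strip() for user in value.split(",") if user.strip()}
            listed.setdefault(key, set()).update(users)
    return [d for d in ONCALL_DIRECTIVES if username not in listed.get(d, set())]


# Hypothetical excerpt; the usernames are placeholders.
SAMPLE = """\
authorized_for_system_commands=alice,bob
authorized_for_all_service_commands=alice
authorized_for_all_host_commands=alice,bob
"""
```

With this sample, missing_directives(SAMPLE, "bob") returns only authorized_for_all_service_commands, which is the one line that still needs the username added.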

Further Reading