SOP Developing new Zabbix checks on Staging
Contact Information
-
Owner: Fedora Infrastructure Team
-
Contact: #fedora-admin, sysadmin-main, sysadmin-noc
-
Purpose: Provide basic orientation and introduction to developers/maintainers of monitoring checks
Access Level
To perform the configuration described in this SOP, you will need either sysadmin-main or sysadmin-noc access. Depending on the checks being deployed, root access to the target hosts may also be needed.
Overview
Zabbix works using templates to hold the actual items & triggers, and then the template is applied to the target hosts. Therefore the process looks something like:
-
Create a new Zabbix template (if needed)
-
Add the template to a test host
-
Create items in the template and confirm the data from the host looks correct
-
(Optional but likely) Create triggers for when the item is in a bad state
Process
Creating the template
If you’re adding a new check to an existing service, a template may already exist. To check, in the Zabbix UI, go to Data Collection > Templates in the sidebar. Check to see if a relevant template already exists. If not:
-
Click "Create Template" at the top right
-
In the New Template form, fill out:
-
Name: (something relevant to the service you’re monitoring)
-
Template Groups: Fedora
-
Description
-
You can create macros here (this is the Zabbix term for variables) if you want to create overridable defaults, etc., but this is beyond the scope of this introductory doc.
Add the template to a test host
If creating a new template, go to Data Collection > Hosts, and search for a relevant host:
-
Click its name in the results
-
Enter the name of the new template in the Templates search box (below any existing templates)
-
Click Update at the bottom of the form
If editing an existing template, just double check the template is present on the test host already.
Create new items in the template
This is a wide-ranging topic, as Zabbix supports many types of checks natively, as well as allowing you to run your own scripts, or even auto-discover things to monitor. Here we’ll look at the first two - a baked-in check, and a custom one.
First, review [1], which lists the types of items the agent supports out of
the box - things like listing processes, checking TCP connections,
stat’ing filesystems, and so on. If what you need is there, this is trivial.
We’ll use an example of looking for httpd processes.
In either case, we’ll start from Data Collection > Templates, and then click
Items next to your template. Click the Create Item link at the top right to
start.
Adding a default-supported item
In the form, fill out:
-
Name: something relevant to what you’re checking, e.g. "HTTPD processes"
-
Type: "Zabbix agent (active)" - we use active agents, and Zabbix distinguishes this from the passive "Zabbix agent" type
-
Key: "proc.num[httpd]" - this comes from [1]; you can use the Select button to get usage hints in the UI
-
Type / Units: optional, set if it makes sense
-
Interval: 1m is the default and fine for easy things. Set it to longer if it makes sense (e.g. we check certificates every 12h)
-
Description: Add something explaining what’s being monitored
The rest can be left as defaults unless you have good reason to change them. Save the item, and then go (perhaps in a new tab) to Monitoring > Latest Data, search for your host and item name, and check the results look correct.
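Before relying on Latest Data, it can help to confirm on the host itself what the agent returns for a key. The following is a minimal sketch, assuming the classic agent binary zabbix_agentd is installed (on a host running Agent 2 the binary would be zabbix_agent2) and that its -t test mode prints a line shaped like "key [type|value]". The helper name agent_test_value is hypothetical and not part of Zabbix.

```shell
#!/bin/sh
# Hypothetical helper: pull the value out of one line of agent test output.
# Assumed input shape:  proc.num[httpd]          [u|5]
agent_test_value() {
    sed -n 's/^.*\[[a-z]|\(.*\)]$/\1/p'
}

# Usage on the target host (assumes the zabbix_agentd binary is present):
#   zabbix_agentd -t 'proc.num[httpd]' | agent_test_value
```

If the value printed here looks right but nothing appears in the UI, the problem is more likely in the item or template configuration than in the check itself.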
Adding a custom item
This is mostly the same as above, but requires access to the host. You’ll
need to know what command line you want to run (perhaps a simple ls or
something, or a script you’ve written), and then:
-
Log in as root to your target host
-
Edit /etc/zabbix/zabbix_agent.d/myitem.conf (there are other examples in that dir)
-
Enter UserParameter=my.test,/usr/local/bin/zabbix-test.sh
-
Change my.test to something sensible as a key for the item, e.g. postfix.queue
-
Change /usr/local/bin/zabbix-test.sh to whatever you need to run. You can use pipes and other shell tricks, so /usr/bin/ps ax|/usr/bin/grep -c httpd would do roughly the same as the check above.
-
-
Save and restart the agent with systemctl restart zabbix_agent.service
Then return to the section above, create the item the same way - just use your
new key my.test in the Key field. Test the values come through properly - you
may hit issues with SELinux or need to do custom preprocessing, which is beyond
the scope of this doc (but you can ask Infra about it).
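To make the custom item more concrete, here is a minimal sketch of what a script behind a UserParameter could look like. Everything specific in it is an assumption for illustration: the count_entries function name, the /var/spool/myservice directory and the myservice.queue key are hypothetical and not taken from the Fedora configuration. The script prints a single number, which is what a numeric item expects.

```shell
#!/bin/sh
# Hypothetical check script, e.g. saved as /usr/local/bin/zabbix-test.sh.
# A matching agent config line (the key name is an assumption) would be:
#   UserParameter=myservice.queue,/usr/local/bin/zabbix-test.sh

# Print the number of regular files directly inside the given directory.
count_entries() {
    dir="$1"
    # Report 0 instead of an error if the directory is missing, so the
    # item always receives a number.
    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi
    find "$dir" -mindepth 1 -maxdepth 1 -type f | wc -l | tr -d ' '
}

# The directory is a placeholder; pass a real path as the first argument.
count_entries "${1:-/var/spool/myservice}"
```

Running the script by hand on the host before wiring it into the agent makes it easier to tell script bugs apart from agent or SELinux issues.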
Adding triggers to an item
Not every check needs a trigger; some things are just worth having data on. However, we do want to alert when things are wrong.
This is fairly similar to creating items, and again has very wide scope for using many types of statistical checks for trends, comparisons, host history, and so forth. We’ll do a simple "thing > X" type trigger here, but check out [2] for the docs on writing triggers.
Go back to Data Collection > Templates and click Triggers next to your
template. Click Create Trigger in the top right to bring up the new trigger
form:
-
Name: something relevant to what’s broken (e.g. "No httpd processes running")
-
Leave Event name and Operational data at their defaults - these are useful but out of scope for this tutorial
-
Severity: Pick something reasonable
-
note that only "Average" or above is reported to Matrix
-
the lower levels only show on the UI dashboard
-
-
Expression: this is the complex part, and you can do a lot here. For this check we’ll:
-
Click "Add" to open the helper
-
Click Select next to Item to get a list of items in the template
-
Select our "HTTPD processes" item
-
Use "last()" as the function (which is the default)
-
Set Result "<" "1"
-
which results in:
last(/My Template/proc.num[httpd])<1
-
-
Check "Allow manual close" as it’s generally useful to allow Infra to close things if we need to
-
Save the trigger
Now go test your trigger by (e.g.) stopping Apache, and checking the trigger fires and shows up in the Dashboard.
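The trigger expression built above looks only at the most recent value collected for the item. As a rough mental model (not how Zabbix is implemented), the same logic can be sketched in shell. The function name trigger_fires is hypothetical and the sample values are made up.

```shell
#!/bin/sh
# Rough model of: last(/My Template/proc.num[httpd])<1
# Arguments are collected values, oldest first. Succeeds (exit status 0)
# when the trigger would be in a problem state.
trigger_fires() {
    last=""
    for value in "$@"; do
        last="$value"    # keep only the most recent value, like last()
    done
    [ -n "$last" ] && [ "$last" -lt 1 ]
}

# Example: Apache was running (3, then 2 processes), then stopped (0).
if trigger_fires 3 2 0; then
    echo "PROBLEM: No httpd processes running"
fi
```

Because only the latest value matters, a single reading of 0 is enough to raise the problem, and the next non-zero reading resolves it. The functions described in [2] cover cases where you want to look at more than one value.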
Resources
-
[1] Zabbix default item reference: https://www.zabbix.com/documentation/7.4/en/manual/config/items/itemtypes/zabbix_agent
-
[2] Zabbix trigger reference: https://www.zabbix.com/documentation/7.4/en/manual/config/triggers
-
[3] Fedora Stg Zabbix server (log in with SAML): https://zabbix.stg.fedoraproject.org/zabbix.php