Working with Fedora Infrastructure

This document explains how to efficiently work with the Fedora Infrastructure team. Your close attention to this document will help both you and us do the work you are asking us to do.

Our Workflow

Is your issue/problem related to security of an application or service we run?

Emergency/Authentication issues

Is your issue/problem urgent (an important service is down, or you need a change as soon as possible)? Or are you unable to file a ticket (authentication problems, no account, ticketing system down)?

  • Log in to a Matrix account, join the #admin:fedoraproject.org channel, say '!oncall', and explain the issue or problem to the oncall person.

  • If no one is available there:

Ticket tracking

By default, the infrastructure team tracks its work in tickets at: https://pagure.io/fedora-infrastructure/issues/. If you need something from us, please open a new ticket with as much information as you think is needed to process this request.

Once created, your ticket will follow this flow:

Figure 2: Daily Process Ticket Flow

A few notes:

  • Make sure to note if there is a deadline or if this issue blocks you.

  • We review tickets during the two stand ups we hold Monday through Thursday (one more Europe timezone friendly and one more US timezone friendly).

  • There is no need to ping team members or notify us about the newly filed ticket.

  • Your ticket will be triaged by a team member and moved to a new state:

    • Gain and Pain levels will be added to the ticket; team members use these to prioritize their work. (You can find the definition of each level in the glossary.)

    • If it’s moved to Waiting on asignee it’s waiting for a team member to start working on it.

    • If it’s moved to Waiting on reporter it means that you need to answer questions posed in the ticket before it can be worked on.

    • If the ticket is closed with initiative, see New Initiative Workflow.

    • If the ticket is otherwise closed, it will be with an explanation from a team member.

  • If you have an update to your issue/task or want to know when it might be worked on:

    • Comment in the ticket, adding that information or asking for a time frame.

  • When someone is available, your ticket will be assigned to someone to work on.

    • Watch for progress reports or for the ticket being marked done.

  • If the work is not fully completed as required, please re-open the ticket and indicate this.

    • Go back to the previous step for additional work.
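The notes above can be read as a small state machine. The following Python sketch is only an illustration of that reading, not the team's tooling: the state names paraphrase the bullets, and `TicketState`, `TRANSITIONS`, and `can_move` are hypothetical names. The transition table is an assumption; in particular, the moves from "waiting on reporter" and the target of a re-opened ticket are inferred rather than stated.

```python
from enum import Enum


class TicketState(Enum):
    # Illustrative names paraphrasing the notes above; these are not
    # identifiers taken from the ticket tracker itself.
    NEW = "new"
    WAITING_ON_ASSIGNEE = "waiting on assignee"
    WAITING_ON_REPORTER = "waiting on reporter"
    ASSIGNED = "assigned"
    CLOSED_INITIATIVE = "closed with initiative"
    CLOSED = "closed"


# Assumed transitions, read off the notes above.
TRANSITIONS = {
    # Triage moves a new ticket into one of four states.
    TicketState.NEW: {
        TicketState.WAITING_ON_ASSIGNEE,
        TicketState.WAITING_ON_REPORTER,
        TicketState.CLOSED_INITIATIVE,
        TicketState.CLOSED,
    },
    # Assumption: once the reporter answers, the ticket waits for an
    # assignee; an unanswered ticket may eventually be closed.
    TicketState.WAITING_ON_REPORTER: {
        TicketState.WAITING_ON_ASSIGNEE,
        TicketState.CLOSED,
    },
    TicketState.WAITING_ON_ASSIGNEE: {TicketState.ASSIGNED},
    TicketState.ASSIGNED: {TicketState.CLOSED},
    # Re-opening incomplete work sends the ticket back to wait for someone
    # to pick it up (assumed reading of "go back to the previous step").
    TicketState.CLOSED: {TicketState.WAITING_ON_ASSIGNEE},
    # Initiatives continue in the New Initiative Workflow, outside this sketch.
    TicketState.CLOSED_INITIATIVE: set(),
}


def can_move(current, target):
    """Return True if the sketch allows a ticket to go from current to target."""
    return target in TRANSITIONS[current]
```

For example, `can_move(TicketState.NEW, TicketState.WAITING_ON_REPORTER)` is true in this sketch, while a ticket closed with initiative has no further moves here because it is handled by the New Initiative Workflow.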

The "Oncall" Role in Our Team

One team member is always designated “oncall”. The assigned person changes every week. You can find out who is currently assigned by using !oncall in any of our Matrix channels, such as #admin:fedoraproject.org.

When available, this person:

  1. Accepts urgent work items for the team, such as an important or high SLE service being down or causing issues. A ticket should be filed by the reporter or the oncall person to track this work in any case.

  2. Shields other team members from distracting pings and less urgent tickets, deciding when an issue is important enough to interrupt another team member.

  3. Triages incoming tickets for urgent items that need work outside of the normal triage process.

Initiatives

Any task involving new applications, major deployments, major development work, or the like must follow the New Initiative Workflow, where it will be scoped and prioritized.

General Ticket Considerations

Please provide as much information as you can in your ticket to avoid back-and-forth requests for details. If you know your issue is going to cause a lot of discussion, start a mailing list or discussion thread for that.

Make sure your ticket:

  • Explains the problem or issue you are having, with URLs where possible to the services or applications involved.

  • Tells us how important or urgent this is to you.

  • Includes any error messages or output you see.

It is your responsibility as ticket reporter to follow your ticket, provide information that is asked for, and keep us aware of any urgency you may have. Do not simply file and forget your ticket.

Your ticket may take a while to process, depending on the team's current workload and how important we think the ticket is. If your ticket is blocking you, make sure you note that in the ticket, but keep in mind that we may already be working on tickets that are blocking more people.

Every now and then, we will go through our old tickets. When this happens, we may ask you to check whether the issue still exists (it could be that a complementary change fixed it, that it was just an intermittent issue, or simply that it got fixed without us knowing). In those situations, we kindly ask that you reply to our question/ping within two weeks; otherwise we reserve the right to close the ticket (knowing that you can always re-open it or open a new one if the issue persists or reappears).
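As a worked example of the two-week window, the Python sketch below computes the date after which an unanswered ticket may be closed. It is an illustration only: the function names are hypothetical, and treating "within two weeks" as exactly 14 calendar days from the ping is an assumption.

```python
from datetime import date, timedelta

# "Within two weeks", taken from the paragraph above; counting it as
# exactly 14 calendar days is an assumption of this sketch.
REPLY_WINDOW = timedelta(weeks=2)


def reply_deadline(pinged_on):
    """Last day on which a reply still falls inside the window."""
    return pinged_on + REPLY_WINDOW


def may_be_closed(pinged_on, today, reporter_replied):
    """True if the ticket could be closed for lack of a reply."""
    return (not reporter_replied) and today > reply_deadline(pinged_on)
```

With a ping on 1 January 2024, `reply_deadline(date(2024, 1, 1))` gives 15 January 2024, so under these assumptions the ticket could be closed from 16 January onward if nobody replied.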

Matrix

Matrix is a great way to communicate, but please do not ping team members directly. Instead, update your ticket with any new information you have. When the team members working on that issue have time, they may contact you on Matrix for more interactive debugging or testing.

Direct emails

E-mail is also a great communication method, but if you mail work items or information to one person directly, they cannot easily hand off the issue, and you must wait for them to have time to address it, even when others could perhaps have solved it already. So please avoid direct emails and instead update tickets with any information you want to add.

RFC 1149

Pigeons are too slow for most work items, and require facilities (e.g. dovecots) that most team members do not have. Even if the oncall member has a free dovecot and feed, and is trained in handling carrier pigeons, sending a pigeon to a single team member has the same problems as using Matrix or email for the same purpose, which means tickets are still the correct way to report problems.

In other words, please don’t send us any birds.