Fedora Status Service - SOP

Fedora-Status is the software that generates the page at http://status.fedoraproject.org/. This page should be kept up to date with the current status of the services run by Fedora Infrastructure.

This page is hosted at AWS.

Contact Information

Owner

Fedora Infrastructure Team

Contact

#fedora-admin, #fedora-noc

Servers

AWS S3/CloudFront

Purpose

Give users information about the current status of our public services.

Repository

https://github.com/fedora-infra/statusfpo

How it works

To keep this website as stable as can be, the page is hosted external to our main infrastructure, in AWS.

It is based on an S3 bucket with the files, fronted by a CloudFront distribution for TLS termination and CNAMEs.

The website is statically generated using Pelican on your local machine, and then pushed to S3.

Adding and changing outages

Making Changes

Before pushing changes live to S3, use Pelican's devserver to stage and view changes.

  1. Install the packages you need to run the devserver with:

    sudo dnf install pelican python-packaging
  2. Check out the repo at:

    git@github.com:fedora-infra/statusfpo.git
  3. Run the devserver with:

    make devserver
  4. View the generated site at http://0.0.0.0:8000. Note that the site regenerates automatically whenever the content or theme changes.

  5. Commit changes (or open a Pull Request) to https://github.com/fedora-infra/statusfpo

Create a new outage

  1. Add a markdown file to either content/planned/, content/ongoing/, or content/resolved/. The name of the file needs to be unique, so check the resolved outages for an idea of how to name your file.

  2. Add your outage notice to the markdown file, for example:

    Title: Buzzilla Slow
    Date: 2021-04-28 10:22+0000
    OutageFinish: 2021-04-28 13:30+0000
    Ticket: 123456
    
    A swarm of bees have taken up residence in one of
    the Buzzilla Server rooms. Consequently, some
    requests to Buzzilla may respond slower than
    usual. An apiarist has been called to capture
    and relocate the swarm.

    • Note that OutageFinish is optional, but should only be omitted if the projected or actual outage end time is unknown.

    • When providing dates, keep the timezone offset at +0000 (UTC).
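The notice format above can be generated rather than typed by hand. The following is a minimal sketch, not part of the statusfpo repository: the function name `new_outage` and its arguments are hypothetical. It assumes the three status directories sit under one content root, and it fills in `Date` with the current UTC time so the `+0000` offset is always correct.

```shell
# Hypothetical helper, not part of the statusfpo repository.
# Usage: new_outage CONTENT_ROOT STATE NAME TITLE TICKET
#   STATE is one of: planned, ongoing, resolved
new_outage() {
    root=$1 state=$2 name=$3 title=$4 ticket=$5
    file="$root/$state/$name.md"

    # File names must be unique, so look in every status directory.
    if [ -d "$root" ] && find "$root" -type f -name "$name.md" | grep -q .; then
        echo "an outage named $name.md already exists under $root" >&2
        return 1
    fi

    mkdir -p "$root/$state" || return 1
    {
        printf 'Title: %s\n' "$title"
        # date -u prints UTC, which matches the literal +0000 offset.
        printf 'Date: %s\n' "$(date -u '+%Y-%m-%d %H:%M+0000')"
        printf 'Ticket: %s\n' "$ticket"
        printf '\nDescribe the outage here.\n'
    } > "$file"
    printf '%s\n' "$file"
}
```

For example, `new_outage content ongoing buzzilla-slow "Buzzilla Slow" 123456` would write content/ongoing/buzzilla-slow.md (the file name here is only an illustration). The sketch leaves out OutageFinish; add that line by hand once the end time is known.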

Moving an outage

To move an outage, say from Planned to Ongoing, simply move the markdown file into a different status directory in content/ and regenerate.
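As a rough illustration of that move, here is a small sketch. The function name `move_outage` and its arguments are hypothetical, and it uses plain `mv`; inside a Git checkout, `git mv` performs the same move and stages it.

```shell
# Hypothetical helper: move NAME.md between status directories under content/.
# Usage: move_outage CONTENT_ROOT NAME FROM TO
move_outage() {
    root=$1 name=$2 from=$3 to=$4
    src="$root/$from/$name.md"

    # Fail early if the outage is not where we expect it.
    if [ ! -f "$src" ]; then
        echo "no such outage: $src" >&2
        return 1
    fi

    mkdir -p "$root/$to" && mv "$src" "$root/$to/$name.md"
}
```

A call such as `move_outage content buzzilla-slow planned ongoing` would then be followed by regenerating the site as usual.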

Publishing

Only members of sysadmin-main and people given the AWS credentials can update the status website.

Initial Configuration for Publishing

  1. First, install the AWS command line tool with:

    sudo dnf install aws-cli
  2. Grab ansible-private/files/aws-status-credentials and store it in ~/.aws/credentials.

  3. Run:

    aws configure set preview.cloudfront true
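Before publishing for the first time, it can help to confirm that the credentials file is in place. The sketch below is a hypothetical check, not part of the repository. It assumes the credentials file defines a profile named `statusfpo`, which is the name passed to `--profile` in the certificate renewal commands; adjust the name if your copy of the file uses a different one.

```shell
# Hypothetical check: does an AWS credentials file define a given profile?
# Usage: has_aws_profile CREDENTIALS_FILE PROFILE
has_aws_profile() {
    # In the credentials file, each profile is an INI section header
    # such as [statusfpo] on a line of its own.
    grep -qxF "[$2]" "$1"
}
```

For example: `has_aws_profile ~/.aws/credentials statusfpo || echo "statusfpo profile not found"`.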

Publishing changes live

Once you are satisfied with your changes and how they look on the devserver, and they have been committed to Git, push the built changes live with the command:

    make upload

Note that this command only uploads content changes (i.e. adding or moving outages).

Publishing theme changes

If your changes involve changes to the theme, run the following command to upload everything, both content and theme changes, to the live server:

    make upload-theme

Renewing SSL certificate

  1. Run certbot to generate a certificate and have it signed by Let's Encrypt. You can run this command anywhere certbot is installed, for example on your laptop or on certgetter01.iad2.fedoraproject.org:

    rm -rf ~/certbot
    certbot certonly --agree-tos -m admin@fedoraproject.org --no-eff-email --manual --manual-public-ip-logging-ok -d status.fedoraproject.org -d www.fedorastatus.org --preferred-challenges http-01 --config-dir ~/certbot/conf --work-dir ~/certbot/work --logs-dir ~/certbot/log
  2. You will be asked to make a specific file available under a specific URL. In a different terminal, upload the requested file to the AWS S3 bucket:

    echo SOME_VALUE >myfile
    aws --profile statusfpo s3 cp myfile s3://status.fedoraproject.org/.well-known/acme-challenge/SOME_FILE
  3. Verify that the uploaded file is available under the right URL. If the previous certificate has already expired, you may need to run curl with the -k option:

    curl -kL http://www.fedorastatus.org/.well-known/acme-challenge/SOME_FILE
  4. After making sure that curl outputs the expected value, go back to the certbot run and continue by pressing Enter. You will be asked to repeat steps 2 and 3 for the other domain. Note that the S3 bucket name stays the same.

  5. Deploy the generated certificate to AWS. This requires additional permissions on AWS.
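Steps 2 to 4 are easy to get wrong on the second domain, because the verification URL changes while the upload destination does not. The following sketch spells out both targets for one challenge. The function name `challenge_targets` is hypothetical; the bucket name and path are taken from the commands above.

```shell
# Hypothetical helper: print the S3 destination and the public URL
# for one ACME HTTP-01 challenge.
# Usage: challenge_targets DOMAIN TOKEN
challenge_targets() {
    domain=$1 token=$2
    # The bucket is fixed; only the verification URL follows the domain.
    bucket=status.fedoraproject.org
    printf 's3://%s/.well-known/acme-challenge/%s\n' "$bucket" "$token"
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$domain" "$token"
}
```

The first line of output is the destination for `aws s3 cp`, and the second is the URL to check with curl. Running it for both status.fedoraproject.org and www.fedorastatus.org shows the same S3 destination with two different URLs.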