Removing unwanted tarballs from the dist-git rpms

Description

This SOP provides instructions for sysadmin-main level users to remove unwanted tarballs and source RPMs from a branch in the dist-git repositories on Fedora’s infrastructure.

Access level required

  • sysadmin-main

Machine

  • pkgs01.rdu3.fedoraproject.org

Steps to access the machine

  1. SSH into batcave01.rdu3.fedoraproject.org:

       ssh batcave01.rdu3.fedoraproject.org
  2. Switch to the root user:

       sudo su
  3. SSH into pkgs01.rdu3.fedoraproject.org:

       ssh pkgs01.rdu3.fedoraproject.org

Preparing the non-bare repository before removal

Before removing the file from the dist-git history, add it to the lookaside cache and make sure it is listed in both the .gitignore and sources files.

Adding file into lookaside cache

  1. Clone the package repository:

       fedpkg clone PKG_NAME
  2. Enter the cloned directory:

       cd PKG_NAME
  3. Add the file into lookaside cache:

       fedpkg upload FILE_NAME

Adding file into .gitignore if it’s not already present

Add the file name to .gitignore:

   echo FILE_NAME >> .gitignore

Adding SHA512 hash into sources file if it’s not already present

  1. Generate the SHA512 hash:

       sha512sum FILE_NAME
  2. Append the hash to sources in the required format, using the file name for $FILE_NAME and the hash from the previous step for $SHA512_HASH:

       echo "SHA512 ($FILE_NAME) = $SHA512_HASH" >> sources
  3. Commit and push all of the changes above.
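
The .gitignore and sources updates above can be combined into one helper that skips entries that are already present. This is a minimal sketch, not part of fedpkg: the function name add_source_entry is hypothetical, and it assumes it is run from the top of the cloned repository with the file in the current directory.

```shell
#!/bin/bash
# add_source_entry is a hypothetical helper name, not a fedpkg command.
# It appends FILE_NAME to .gitignore and a "SHA512 (name) = hash" line
# to sources, but only when the exact line is not there yet.
add_source_entry() {
    local file_name=$1
    local digest entry

    if [ ! -f "$file_name" ]; then
        echo "no such file: $file_name" >&2
        return 1
    fi

    # sha512sum prints "<hash>  <file>"; keep only the hash.
    digest=$(sha512sum "$file_name" | cut -d' ' -f1)
    entry="SHA512 ($file_name) = $digest"

    touch .gitignore sources
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF -- "$file_name" .gitignore || echo "$file_name" >> .gitignore
    grep -qxF -- "$entry" sources || echo "$entry" >> sources
}
```

Calling add_source_entry FILE_NAME twice leaves a single line in each file, so it is safe to rerun before committing.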

Removing unwanted tarballs from dist-git

Option 1: Manual

To remove unwanted tarballs and source RPMs from a specific branch of a package in the dist-git repository:

  1. Navigate to the package repository:

       cd /srv/git/repositories/rpms/PKG_NAME.git
  2. Disable the git filter-branch warning message. The variable must be exported so that git can see it:

       export FILTER_BRANCH_SQUELCH_WARNING=1
  3. Rewrite history on the desired branch:

       git filter-branch --force --index-filter 'git rm -r --cached --ignore-unmatch *.src.rpm *.tar.gz' --original refs/archive -- BRANCH

Replace *.src.rpm *.tar.gz with the name of the file that you want to remove. To remove the file from all branches, use --all instead of BRANCH.

  4. Move the backup reference:

       mv refs/archive/refs/heads/BRANCH refs/archive/BRANCH
  5. Remove the redundant directories:

       rm -r refs/archive/refs/

Replace PKG_NAME with the name of the package. Replace BRANCH with the target branch name.
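
Because the history rewrite cannot be undone easily, it may be worth rehearsing it on a throwaway copy first. The sketch below is an assumption-laden illustration and not part of the SOP: rehearse_removal and TARGET_FILE are hypothetical names, the copy is made with git clone --mirror into a temporary directory, and the real repository is never modified.

```shell
#!/bin/bash
# rehearse_removal REPO BRANCH FILE_NAME  (hypothetical helper)
# Runs the same filter-branch command on a temporary mirror clone and
# reports how many commits on BRANCH touch FILE_NAME before and after.
# The function body is a subshell, so cd, export and trap stay local.
rehearse_removal() (
    repo=$1
    branch=$2
    # Exported so the index filter, which filter-branch evaluates in its
    # own shell, can read the file name without extra quoting.
    export TARGET_FILE=$3
    export FILTER_BRANCH_SQUELCH_WARNING=1

    scratch=$(mktemp -d) || exit 1
    trap 'rm -rf "$scratch"' EXIT

    git clone --quiet --mirror "$repo" "$scratch/rehearsal.git" || exit 1
    cd "$scratch/rehearsal.git" || exit 1

    before=$(git rev-list --count "$branch" -- "$TARGET_FILE")
    git filter-branch --force --index-filter \
        'git rm -r --cached --ignore-unmatch -- "$TARGET_FILE"' \
        --original refs/archive -- "$branch" >/dev/null 2>&1 || exit 1
    after=$(git rev-list --count "$branch" -- "$TARGET_FILE")

    echo "commits touching $TARGET_FILE on $branch: before=$before after=$after"
)
```

For example, rehearse_removal /srv/git/repositories/rpms/PKG_NAME.git BRANCH FILE_NAME should report after=0 if the filter matches the intended file; a nonzero value suggests the file name or pattern needs another look before touching the real repository.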

Verification

Verify that the unwanted files are no longer present on any branch.

  1. Run the verification loop, which checks the tip of every reference:

        git for-each-ref --format='%(refname:short)' | while read -r branch; do
            echo "Checking $branch:"
            git ls-tree -r "$branch" --name-only | grep -q "$FILE_NAME" && echo "FOUND in $branch" || echo "Not in $branch"
        done

Use the actual file name instead of $FILE_NAME. The backup references under refs/archive still point to the original history, so they may still report FOUND.
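
The loop above inspects only the tip of each reference, so a file that was added and later deleted would not be reported. A stricter check, sketched here under the assumption that it is run inside the package repository, asks git rev-list whether any commit reachable from a branch touches the path. The function name history_is_clean is hypothetical.

```shell
#!/bin/bash
# history_is_clean FILE_NAME  (hypothetical helper)
# Returns 0 when no commit on any branch under refs/heads touches
# FILE_NAME, and 1 otherwise. Backup references under refs/archive are
# deliberately skipped, since they keep the original history.
history_is_clean() {
    local file_name=$1
    local ref status=0

    for ref in $(git for-each-ref --format='%(refname)' refs/heads); do
        # With a pathspec, rev-list lists only commits that add, change,
        # or delete the path; -n 1 stops at the first match.
        if [ -n "$(git rev-list -n 1 "$ref" -- "$file_name")" ]; then
            echo "FOUND in history of $ref"
            status=1
        fi
    done
    return "$status"
}
```

A typical call would be history_is_clean FILE_NAME && echo "history is clean".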

Option 2: Automated

This option uses a script from the releng repository to remove unwanted tarballs from dist-git.

  1. Navigate to the root directory for package repositories:

       cd /srv/git/repositories/rpms
  2. Create a tmp directory and change into it:

       mkdir tmp
       cd tmp
  3. Get the script from the releng repository, using the raw file URL so that curl saves the script itself rather than the web page:

       curl -O https://forge.fedoraproject.org/releng/tooling/raw/branch/main/packages/maintenance/removing_unwanted_tarball.sh
  4. Make the script executable:

       chmod +x removing_unwanted_tarball.sh

Replace REPO_NAME with the name of the package. It should start with ../ if you are running the script from the tmp directory.
Replace FILE_NAME with the name of the file you want to remove.
Replace USER_OWNER with the user that should own the directories inside the package repository. Setting it explicitly keeps ownership consistent, because the script may be run by different users.
Replace GROUP_OWNER with the group that should own the directories inside the package repository.

  5. Run the script:

       ./removing_unwanted_tarball.sh