Modular repository

A standard non-modular RPM repository contains binary RPM files and a repodata directory, which is generated by the createrepo_c tool. In a non-modular repository, the package metadata is stored inside the binary RPM files themselves. When createrepo_c is run on a directory of RPM files, it extracts the metadata from the header of each RPM file and bundles it in the repodata directory.

A modular repository is a standard RPM repository extended with modular metadata and modular binary RPM files. It can contain non-modular binary RPMs, modular binary RPMs, and the modular metadata that corresponds to them. Modular metadata defines a new entity inside the repository, called a module stream. Put simply, a module stream is a group of binary RPM artifacts bound together by extra modular metadata stored in a YAML file.

Basic modular repository

Example 1. A modular repository with one module stream
[mcurlej@localhost final_repo]$ tree
.
├── flatpak-rpm-macros-35-4.module_fc35+devel+1234.src.rpm
├── flatpak-rpm-macros-35-4.module_fc35+devel+1234.x86_64.rpm
├── flatpak-runtime-config-35-1.module_fc35+devel+1234.src.rpm
├── flatpak-runtime-config-35-1.module_fc35+devel+1234.x86_64.rpm
├── flatpak-runtime:devel:20220211094818:1234:x86_64.modulemd.yaml (1)
└── repodata
    ├── 3f869b4cf0d94c0d1b246d05b2c538259430bc03df65277a6b94b96cd0f5aa2d-other.xml.gz
    ├── 58a1183d84dc7e319fb365b45c661c6653a530eddf0022a42c5caaf591adb2b8-filelists.sqlite.bz2
    ├── 686ca60dd902af9d4c49968833b3561fe35fd1b8f8f6f722f1a463dfdf559bdc-filelists.xml.gz
    ├── afe6147edfe3577fc76ca94d86960e3e008f3edfa7ac5dca497f56a3ac41b22b-primary.sqlite.bz2
    ├── b68df50bf8d984cdb05fc08d119e72280f7c998486a1ed74c20e632589c00d25-modules.yaml.gz (2)
    ├── d9b1a93ec2353fe6c87795dc59cc7142cf54b89cf762e60bf7f74f7266db75f1-primary.xml.gz
    ├── ef096d0ee7ba59ef7c73c13e38a18c3ead7fef5f21abd115b367e2d9d3c1c39b-other.sqlite.bz2
    └── repomd.xml

1 directory, 13 files
[mcurlej@localhost final_repo]$
1 modulemd YAML file which holds information about one module stream
2 modules.yaml.gz file which holds information about all the module streams in the repository

The first example shows a simple tree view of a modular repository. The repository contains the RPM packages flatpak-rpm-macros and flatpak-runtime-config. Both are components of the module stream defined in the flatpak-runtime:devel:20220211094818:1234:x86_64.modulemd.yaml file. The binary RPM files are also listed in the YAML metadata as the artifacts of the module stream.
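The modulemd filename in the example can be read as five colon-separated fields. The following is a minimal sketch that splits it, assuming the fields are name, stream, version, context and architecture. The helper parse_modulemd_filename and the ModuleStreamId type are hypothetical names used only for illustration, and the filename layout is the convention seen in this example; the YAML content remains the authoritative source.

```python
from typing import NamedTuple


class ModuleStreamId(NamedTuple):
    """Assumed meaning of the five colon-separated filename fields."""
    name: str
    stream: str
    version: str
    context: str
    arch: str


def parse_modulemd_filename(filename: str) -> ModuleStreamId:
    """Split '<name>:<stream>:<version>:<context>:<arch>.modulemd.yaml'."""
    suffix = ".modulemd.yaml"
    if not filename.endswith(suffix):
        raise ValueError(f"not a modulemd YAML filename: {filename!r}")
    parts = filename[: -len(suffix)].split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(parts)}")
    return ModuleStreamId(*parts)


ident = parse_modulemd_filename(
    "flatpak-runtime:devel:20220211094818:1234:x86_64.modulemd.yaml"
)
print(ident.name, ident.stream)  # flatpak-runtime devel
```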

A shorthand expression for modular metadata is modulemd.

Running createrepo_c on such a directory generates a repodata directory with one extra file, *-modules.yaml.gz. This file holds the combined YAML modulemd from all the module stream modulemd YAML files present in the directory. It tells DNF which module streams are present in the modular repository, how they can be installed, and which module streams cover which RPM packages. More on module streams can be found here.
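To make the role of *-modules.yaml.gz more concrete, here is a rough sketch that lists the module streams found in such a file using only the Python standard library. The sample content is made up for illustration and heavily abbreviated; it assumes each modulemd document starts with a `---` line and keeps name and stream directly under `data:` with two-space indentation. The function list_module_streams is a hypothetical helper, and the line-based parsing is deliberately naive; a real tool would use a proper YAML or modulemd parser.

```python
import gzip
import os
import tempfile

# Made-up, abbreviated sample of two concatenated modulemd documents.
SAMPLE = """\
---
document: modulemd
version: 2
data:
  name: flatpak-runtime
  stream: devel
  version: 20220211094818
  context: 1234
  arch: x86_64
...
---
document: modulemd
version: 2
data:
  name: flatpak-runtime
  stream: common
  version: 20220211094818
  context: 1234
  arch: x86_64
...
"""


def list_module_streams(path):
    """Return 'name:stream' for every modulemd document in a gzipped file."""
    docs = []
    current = None
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for raw in fh:
            line = raw.rstrip("\n")
            if line == "---":
                # A new YAML document begins.
                current = {}
                docs.append(current)
            elif current is None:
                continue
            elif line.startswith("document: "):
                current["document"] = line.split(": ", 1)[1]
            elif line.startswith("  ") and not line.startswith("   "):
                # Keys indented by exactly two spaces (assumed to sit under 'data:').
                key, sep, value = line.strip().partition(": ")
                if sep and key in ("name", "stream"):
                    current.setdefault(key, value.strip("'\""))
    return [
        f"{d['name']}:{d['stream']}"
        for d in docs
        if d.get("document") == "modulemd" and "name" in d and "stream" in d
    ]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modules.yaml.gz")
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        fh.write(SAMPLE)
    found = list_module_streams(path)

print(found)  # ['flatpak-runtime:devel', 'flatpak-runtime:common']
```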

Modular binary RPM files cannot be installed without the corresponding metadata. A module stream inside a repository must always consist of a modulemd YAML file and the modular binary RPM files.

Mixed modular repository

Example 2. A modular repository with multiple module streams and non-modular packages
[mcurlej@localhost final_repo]$ tree
.
├── flatpak-rpm-macros-35-4.module_fc35+common+1234.src.rpm
├── flatpak-rpm-macros-35-4.module_fc35+common+1234.x86_64.rpm
├── flatpak-rpm-macros-35-4.module_fc35+devel+1234.src.rpm
├── flatpak-rpm-macros-35-4.module_fc35+devel+1234.x86_64.rpm
├── flatpak-runtime:common:20220211094818:1234:x86_64.modulemd.yaml (1)
├── flatpak-runtime-config-35-1.module_fc35+common+1234.src.rpm
├── flatpak-runtime-config-35-1.module_fc35+common+1234.x86_64.rpm
├── flatpak-runtime-config-35-1.module_fc35+devel+1234.src.rpm
├── flatpak-runtime-config-35-1.module_fc35+devel+1234.x86_64.rpm
├── flatpak-runtime:devel:20220211094818:1234:x86_64.modulemd.yaml (2)
├── module-build-0.1.0-1.fc35.noarch.rpm (3)
├── module-build-0.1.0-1.fc35.src.rpm
└── repodata
⋮
1 directory, 20 files
[mcurlej@localhost final_repo]$
1 modulemd YAML file of module stream flatpak-runtime:common
2 modulemd YAML file of module stream flatpak-runtime:devel
3 non-modular package module-build

The second example shows a modular repository with multiple module streams and a non-modular package. Module streams are always present in the repository but are not enabled by default; the user needs to enable a module stream before installing it. If a module stream is not enabled, its binary RPM artifacts are not included in the dependency resolution and content set creation that DNF performs. In this example, the repository contains the flatpak-runtime module, which has the module streams flatpak-runtime:common and flatpak-runtime:devel. Each module stream consists of its modulemd YAML file and the corresponding binary RPM files, whose filenames are listed in the artifacts section of the modulemd YAML file.

The non-modular package module-build in our modular repository behaves as it would in a non-modular repository. More about how you can enable and install module streams can be found in the section Using modules.
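The visibility rule described above can be pictured with a small toy model. This is only an illustration of the rule as stated in this section, not how DNF is implemented: the Package type, the REPO list and visible_packages are hypothetical names, and the mapping of each RPM to its module stream is written out by hand rather than read from modulemd.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Package:
    filename: str
    # "module:stream" that owns this RPM, or None for a non-modular package.
    stream: Optional[str]


REPO = [
    Package("flatpak-rpm-macros-35-4.module_fc35+common+1234.x86_64.rpm",
            "flatpak-runtime:common"),
    Package("flatpak-rpm-macros-35-4.module_fc35+devel+1234.x86_64.rpm",
            "flatpak-runtime:devel"),
    Package("module-build-0.1.0-1.fc35.noarch.rpm", None),
]


def visible_packages(repo, enabled_streams):
    """Keep non-modular packages and artifacts of enabled module streams."""
    return [
        p.filename
        for p in repo
        if p.stream is None or p.stream in enabled_streams
    ]


# Nothing enabled: only the non-modular package is considered.
print(visible_packages(REPO, set()))

# With flatpak-runtime:devel enabled, its artifacts become candidates too.
print(visible_packages(REPO, {"flatpak-runtime:devel"}))
```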

RPMs inside a modular repository are still standard source and binary RPM files, and they still use the NEVRA (name-epoch-version-release-architecture) naming convention. As the example shows, the RPM files of the two module streams have the same name and version. What distinguishes them is the release (or dist tag) part of the filename. The modular RPM files in the example were built by the tool module-build. The release part is defined by the tool or distribution pipeline used to build the RPM files.
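As a quick check of that observation, the sketch below splits two of the example filenames into their parts. It assumes the usual name-version-release.arch.rpm filename layout, in which the epoch does not appear and neither version nor release contains a hyphen. The function parse_rpm_filename is a hypothetical helper written for this illustration.

```python
def parse_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into its four parts."""
    if not filename.endswith(".rpm"):
        raise ValueError(f"not an RPM filename: {filename!r}")
    stem = filename[: -len(".rpm")]
    # The architecture follows the last dot.
    rest, arch = stem.rsplit(".", 1)
    # Release and version follow the last two hyphens; the name may contain hyphens.
    name_version, release = rest.rsplit("-", 1)
    name, version = name_version.rsplit("-", 1)
    return name, version, release, arch


devel = parse_rpm_filename(
    "flatpak-rpm-macros-35-4.module_fc35+devel+1234.x86_64.rpm"
)
common = parse_rpm_filename(
    "flatpak-rpm-macros-35-4.module_fc35+common+1234.x86_64.rpm"
)

print(devel)   # ('flatpak-rpm-macros', '35', '4.module_fc35+devel+1234', 'x86_64')
print(common)  # ('flatpak-rpm-macros', '35', '4.module_fc35+common+1234', 'x86_64')

# Name, version and architecture match; only the release differs.
print(devel[:2] == common[:2], devel[2] != common[2])  # True True
```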