Understanding the Fedora/CentOS bootc filesystem layout

This section briefly summarizes the upstream bootc filesystem documentation. Consult that documentation for more detail.

composefs

A first important difference between Fedora/CentOS bootc and other ostree-using variants in Fedora derivatives is that Fedora/CentOS bootc enables the use of composefs for the root filesystem by default.

However, this use of composefs is in an "unsigned" mode: when the target filesystem has fs-verity enabled, fs-verity is turned on, but default installations do not enforce signature verification.

Writable at build time, readonly at runtime

All directories are writable as part of a container build. Once deployed (onto a physical or virtual machine), bootc defaults to presenting the image content as read-only, with just the two exceptions below: /etc and /var.

This semantic is similar to that of podman run --read-only for application container images.

Filesystem bind mount: /etc

The /etc directory is persistent, mutable machine-local state by default. While it appears as a mount point, it is always part of the machine’s local root filesystem.

For example, if you inject static IP addresses via Anaconda kickstarts, they will persist here across upgrades.

A 3-way merge is applied across upgrades, with each "deployment" having its own copy of /etc.

Making /etc a distinct physical partition is not supported. However, it can be made explicitly "transient"; for more, see the transient-etc example.

Semantics of /etc with container images

When you change the contents of /etc in your derived container images, any added or removed files will be applied on upgrades. For example, if you add a new file to /etc/NetworkManager/conf.d/ in your container to configure networking, this will apply on upgrades.

However, according to the 3-way merge logic, any machine-local modified files will "win" by default. This can be a common issue with modifications to /etc/passwd in derived containers. For more on users and groups, see bootc users and groups.
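
The merge behavior described above can be illustrated with a small model. This is only a simplified sketch, not the actual ostree or bootc implementation: it treats each copy of /etc as a dictionary mapping a relative path to file content, and the names merge_etc, old_default, local and new_default are hypothetical. It also assumes that a file deleted locally stays deleted after the upgrade.

```python
def merge_etc(old_default, local, new_default):
    """Model a 3-way merge of /etc across an upgrade.

    old_default: /etc as shipped by the currently booted image
    local:       /etc as it exists on the machine right now
    new_default: /etc as shipped by the image being upgraded to
    """
    # Start from what the new image ships.
    merged = dict(new_default)
    for path in old_default.keys() | local.keys():
        before = old_default.get(path)
        current = local.get(path)
        if current == before:
            # Not touched locally: keep whatever the new image provides.
            continue
        if current is None:
            # Deleted locally (assumption of this sketch): stays deleted.
            merged.pop(path, None)
        else:
            # Added or modified locally: the machine-local copy wins.
            merged[path] = current
    return merged


old_default = {"passwd": "root:x:0:0", "motd": "hello"}
local = {"passwd": "root:x:0:0\nalice:x:1000:1000", "motd": "hello"}
new_default = {
    "passwd": "root:x:0:0\nsvc:x:990:990",
    "motd": "hello v2",
    "NetworkManager/conf.d/10-custom.conf": "[main]",
}

merged = merge_etc(old_default, local, new_default)
# merged["passwd"] is the local copy, so the "svc" user added by the
# new image is absent; "motd" and the new NetworkManager file come
# from the new image.
```

In this model the unmodified motd and the newly added NetworkManager file follow the image, while the locally edited passwd masks the image's change, which is the /etc/passwd pitfall mentioned above.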

Filesystem bind mount: /var

The /var mount point is also persistent and mutable machine-local state, but there is only one physical copy of it. This is the storage location for the following state:

  • Application container image state (/var/lib/containers) used by podman

  • System logs (/var/log)

  • User and root home directories (/var/home and /var/roothome respectively)

  • General host-bound application state such as the default /var/lib/postgresql database path

Semantics of /var and container images

Making the top-level /var a mount point is supported; however, it is generally recommended to instead make sub-paths of /var, such as /var/lib/custom-database, mount points, or to target a subdirectory of /mnt, for example.

In a container build, you can write to `/var`. However,
this will have a semantic similar to a Dockerfile `VOLUME` instruction;
the content from the container image is *only copied at initial install time*.
By default, subsequent system updates will not pick up new changes to that content.

It's recommended instead to use e.g. https://www.freedesktop.org/software/systemd/man/latest/tmpfiles.d.html[systemd tmpfiles.d]
as a way to ensure that newly added state "reconciles" across upgrades
as desired.
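
As a minimal sketch, a tmpfiles.d fragment shipped in the image could create a directory under `/var` on every boot if it is missing. The file name `/usr/lib/tmpfiles.d/custom-database.conf`, the directory path, and the mode and ownership are illustrative assumptions, not values taken from the bootc documentation.

```
# /usr/lib/tmpfiles.d/custom-database.conf (hypothetical file name)
# Type  Path                      Mode  User  Group  Age
d       /var/lib/custom-database  0750  root  root   -
```

Because the fragment lives under `/usr`, it is part of the image and is updated with it, while the directory it creates stays in the persistent `/var`.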