Understanding the Fedora/CentOS bootc filesystem layout

This section briefly summarizes the upstream bootc filesystem documentation. Please consult that for full details.

composefs

A first important difference between Fedora/CentOS bootc and other ostree-using variants in Fedora derivatives is that Fedora/CentOS bootc enables the use of composefs for the root filesystem by default.

However, composefs is used in an "unsigned" mode by default: fs-verity is turned on when the target filesystem has fs-verity enabled, but no signature verification is enforced for default installations.

Writable at build time, readonly at runtime

All directories are writable as part of a container build. Once deployed (onto a physical or virtual machine), bootc defaults to presenting the image content as read-only, with just the two exceptions described below.

This is similar to the semantics of podman run --read-only for application container images.

Filesystem bind mount: /etc

The /etc directory is persistent, mutable machine local state by default. While it appears as a mount point, it is always part of the machine’s local root filesystem.

For example, static IP addresses injected via Anaconda kickstarts are stored here and persist across upgrades.

A 3-way merge is applied across upgrades, with each "deployment" having its own copy of /etc.

Making /etc a distinct physical partition is not supported. However, it can be made explicitly "transient"; for more, see the transient-etc example.

Semantics of /etc with container images

When you change the contents of /etc in a derived container image, any added or removed files will be applied on upgrades. For example, if you add a new file to /etc/NetworkManager/conf.d/ in your container to configure networking, this will apply on upgrades.

However, according to the 3-way merge logic, any machine-local modified files will "win" by default. This can be a common issue with modifications to /etc/passwd in derived containers. For more on users and groups, see bootc users and groups.
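The "local modifications win" rule can be illustrated with a deliberately simplified, per-file model. This is only a sketch of the semantics described above, not the actual ostree merge implementation; the function name merge_etc_file and the sample file contents are hypothetical.

```shell
#!/bin/sh
set -eu

# Simplified per-file model of the /etc 3-way merge (illustration only).
#   $1 = file as shipped by the currently booted image (old default)
#   $2 = file as shipped by the new image (new default)
#   $3 = file as it exists in the machine's live /etc
# Prints the content the new deployment's /etc would receive.
merge_etc_file() {
    old_default=$1
    new_default=$2
    local_copy=$3
    if cmp -s "$old_default" "$local_copy"; then
        # Never touched locally: the new image's version is used.
        cat "$new_default"
    else
        # Modified locally: the machine-local version wins.
        cat "$local_copy"
    fi
}

work=$(mktemp -d)
printf 'setting=old-default\n' > "$work/old"
printf 'setting=new-default\n' > "$work/new"

# Case 1: the file was never edited on the machine.
cp "$work/old" "$work/local"
unmodified_result=$(merge_etc_file "$work/old" "$work/new" "$work/local")

# Case 2: an administrator edited the file on the machine.
printf 'setting=local-edit\n' > "$work/local"
modified_result=$(merge_etc_file "$work/old" "$work/new" "$work/local")

echo "unmodified locally -> $unmodified_result"
echo "modified locally   -> $modified_result"
rm -rf "$work"
```

In this model, a change to /etc/passwd shipped by a new image is ignored on any machine where /etc/passwd has already been modified locally, which is the common issue mentioned above.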

Filesystem bind mount: /var

The /var mount point is also persistent and mutable machine local state, but there is only one physical copy of it. This is the storage location for the following state:

  • Application container image state (/var/lib/containers) used by podman

  • System logs in /var/log

  • User and root home directories (/var/home and /var/roothome respectively)

  • General host-bound application state such as the default /var/lib/postgresql database path

Semantics of /var and container images

It is supported to make the toplevel /var a mount point; however it’s more generally recommended to make sub-paths of /var such as /var/lib/custom-database a mount point instead, or to target a subdirectory of /mnt for example.

In a container build, you can write to /var. However, this has semantics similar to a Dockerfile VOLUME instruction: the content from the container image is only copied at initial install time, and subsequent system updates will not by default see new changes.

It’s recommended instead to use e.g. systemd tmpfiles.d as a way to ensure that newly added state "reconciles" across upgrades as desired.
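As a minimal sketch of the tmpfiles.d approach, the following writes a tmpfiles.d fragment that asks systemd-tmpfiles to create a state directory if it is missing, instead of shipping that directory as content under /var. The ROOT staging prefix, the file name custom-database.conf, and the mode and ownership are assumptions chosen for illustration.

```shell
#!/bin/sh
set -eu

# ROOT is a hypothetical staging prefix so the sketch can be tried safely;
# point it at / when running inside a container build.
ROOT=${ROOT:-$(mktemp -d)}
tmpfiles_dir="$ROOT/usr/lib/tmpfiles.d"
mkdir -p "$tmpfiles_dir"

# Declare the directory rather than creating it under /var in the image.
# Fields: type path mode user group age argument
# "d" creates the directory only if it does not already exist.
cat > "$tmpfiles_dir/custom-database.conf" <<'EOF'
d /var/lib/custom-database 0750 root root - -
EOF

cat "$tmpfiles_dir/custom-database.conf"
```

Because the fragment lives under /usr, it is part of the image content and is updated with each upgrade, while the directory it describes stays in the machine-local /var.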

Dynamic mountpoints with transient-ro

The transient-ro option allows privileged users to create dynamic top-level mountpoints at runtime while keeping the filesystem read-only by default. This is particularly useful for applications that need to bind mount host paths that may be platform-specific or dynamic.

Use cases

This feature addresses scenarios where:

  • Applications need to bind mount host directories that match the host’s absolute paths

  • Platform-specific mountpoints are required (e.g., /Users on macOS)

  • Dynamic mountpoints need to be created after deployment but before application startup

  • The filesystem should remain read-only for regular processes

Configuration

To enable this feature, add the following to /usr/lib/ostree/prepare-root.conf:

[root]
transient-ro = true

When making changes to filesystem configuration, the initramfs also needs to be regenerated. For more information, see The initial RAM disk (initrd).
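One way to apply this setting from a build script is sketched below. It assumes the file has no existing [root] section (if it does, the key belongs in that section instead), and ROOT is a hypothetical staging prefix used so the snippet can be tried outside an image build.

```shell
#!/bin/sh
set -eu

# ROOT is a hypothetical staging prefix; point it at / inside a container build.
ROOT=${ROOT:-$(mktemp -d)}
conf="$ROOT/usr/lib/ostree/prepare-root.conf"

mkdir -p "$(dirname "$conf")"
touch "$conf"

# Append the setting only once so repeated runs stay idempotent.
# Assumption: no [root] section exists yet in this file.
if ! grep -q '^transient-ro[[:space:]]*=' "$conf"; then
    printf '[root]\ntransient-ro = true\n' >> "$conf"
fi

cat "$conf"

# The initramfs must be regenerated after this change; that step is
# covered in "The initial RAM disk (initrd)" and is not shown here.
```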

Due to a limitation in util-linux (util-linux issue #2283), the LIBMOUNT_FORCE_MOUNT2=always environment variable must be set when performing mount operations with this feature. The limitation affects the mount namespace functionality required by transient-ro.

For a complete end-to-end example demonstrating this feature, see the transient-root-ro example in the examples repository.

How it works

When transient-ro=true is set:

  1. The overlayfs upper directory is mounted read-only by default

  2. Privileged processes can mount it writable only in a new mount namespace, and perform arbitrary changes there, such as creating new toplevel mount points

  3. These mountpoints persist for the current boot but do not survive reboots or upgrades

  4. Regular processes continue to see a read-only filesystem
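The steps above might be driven by a privileged helper along the following lines. This is a hedged sketch, not the upstream procedure: the exact command sequence (unshare -m followed by a read-write remount of /) is an assumption based on the description above, and TARGET_DIR, DRY_RUN, and create_toplevel_dir are hypothetical names. The script defaults to a dry run so that it changes nothing unless explicitly asked to.

```shell
#!/bin/sh
set -eu

# Hypothetical top-level directory to create; /Users is only an example.
TARGET_DIR=${TARGET_DIR:-/Users}

# Assumed sequence (requires root and transient-ro = true):
#  - unshare -m runs the command in a new mount namespace
#  - the remount makes / writable only inside that namespace
#  - LIBMOUNT_FORCE_MOUNT2=always works around the util-linux limitation
create_toplevel_dir() {
    unshare -m -- env LIBMOUNT_FORCE_MOUNT2=always \
        sh -c 'mount -o remount,rw / && mkdir -p "$1"' sh "$1"
}

# DRY_RUN defaults to 1 so the sketch only reports what it would do.
if [ "${DRY_RUN:-1}" = 1 ]; then
    result="dry-run: would create $TARGET_DIR in a private mount namespace"
else
    create_toplevel_dir "$TARGET_DIR"
    result="created $TARGET_DIR"
fi
echo "$result"
```

The refer to the transient-root-ro example in the examples repository for the supported, end-to-end procedure; the sketch only illustrates why regular processes outside the private namespace continue to see a read-only filesystem.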

Example: Podman machine integration

A common use case is with podman machine on macOS, where the VM needs to bind mount host paths like /Users/username into the VM. With transient-ro, the system can:

  1. Create the /Users directory dynamically at runtime

  2. Bind mount the host’s /Users directory to the VM’s /Users

  3. Keep the rest of the filesystem read-only for security

Security considerations

  • Only privileged users (root) can create these mountpoints

  • The mountpoints are transient and don’t persist across reboots

  • The filesystem remains read-only for non-privileged processes

  • This feature should be used judiciously in production environments

For more information about this feature, see the upstream ostree documentation and the related discussion in the bootc project.