Fedora Minimization Objective
Second Phase Proposal
Objective lead: Adam Samalik (asamalik)
The Problem
While Fedora is well suited for traditional physical/virtual workstations and servers, it is often overlooked for use cases beyond traditional installations.
Some modern types of deployments, such as IoT and containers, are quite sensitive to size. For IoT, the reason is usually slow data connections (for updates/management); for cloud and containers, it is the massive scale.
A specific example is Systemd. Although it is very useful (Systemd is generally popular) and always present on physical systems, it is rarely needed in containers. So it was never a problem for packages to require Systemd just to create users via systemd-sysusers. In containers, however, this leads to a significant size increase.
In addition, basically all types of deployments benefit from a reduced size, because there is a direct relationship between the installation footprint and both the attack surface and the relevant CVEs.
Vision
Thousands of individuals and companies collaborate in the Fedora community to explore new problems and to build a fast-moving, modern operating system with a rich ecosystem that lets them experiment with modernizing their infrastructure.
Mission
We help open source developers, sysadmins, and Linux distribution maintainers focus on what is relevant to them.
Outcomes
Fedora is a popular platform because its ecosystem is both cutting-edge and well optimized for modern use cases such as IoT and containers. Therefore, many users consume Fedora instead of building their own components directly from the upstream projects. This takes pressure off open source developers, because users would otherwise demand quick fixes for specific security and other issues from them.
As a result:
- Open source developers can focus on feature development
- Sysadmins can easily use pre-built components that are also regularly updated
- Fedora contributors (vendors and individuals) can collaborate within the Fedora community to explore and develop open source solutions to future problems
Outputs
Specific use cases are defined in Fedora. The community then focuses on these use cases in terms of development and maintenance, optimization (e.g. minimization), and testing (e.g. CI and test environments). These use cases can be prioritized for infrastructure resources transparently and based on community interest.
Feedback Pipeline actively monitors each use case and records its size and the dependencies required for it to run. The data history is kept and displayed so that changes over time are visible. To keep things small over time, Feedback Pipeline also automatically detects size increases and can potentially open Bugzilla bugs automatically to track, fix, or justify such increases transparently.
An active focus on minimization means that our maintainers produce size-optimised content with the same or lower amount of effort. Tooling, services, and data help them to make the right decision about dependencies easily, and to keep things smaller over time.
Actions
Identify relevant use cases and allow the community (meaning not just the Minimization Team) to define their own. We think of a use case as a set of packages installed in a specific context and having a specific purpose, such as Apache HTTP Server Container. Define use cases at least for:
- httpd
- nginx
- MariaDB
- PostgreSQL
- Fedora IoT
- Python 3
We are also considering looking at container-based use cases, such as:
- Go for container apps
- Rust for container apps
- Quarkus
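The idea of a use case as a set of packages installed in a specific context can be sketched as a small data model. This is only an illustration, not how Feedback Pipeline stores its data: the class names, the `resolve` and `total_size` helpers, and all package sizes and dependencies below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """A package with an installed size and hard dependencies (made-up data)."""
    name: str
    size_mib: float
    requires: tuple[str, ...] = ()


@dataclass(frozen=True)
class UseCase:
    """A set of packages installed in a specific context for a specific purpose."""
    name: str
    context: str  # e.g. "container" or "iot"
    packages: tuple[str, ...]


def resolve(use_case: UseCase, repo: dict[str, Package]) -> set[str]:
    """Return the use case's packages plus all transitive hard dependencies."""
    installed: set[str] = set()
    pending = list(use_case.packages)
    while pending:
        name = pending.pop()
        if name in installed:
            continue
        installed.add(name)
        pending.extend(repo[name].requires)
    return installed


def total_size(use_case: UseCase, repo: dict[str, Package]) -> float:
    """Sum the installed size of everything the use case pulls in."""
    return sum(repo[name].size_mib for name in resolve(use_case, repo))


# Hypothetical sizes and dependencies, for illustration only.
REPO = {
    "httpd": Package("httpd", 4.0, ("httpd-filesystem", "glibc")),
    "httpd-filesystem": Package("httpd-filesystem", 0.5),
    "glibc": Package("glibc", 10.0),
}
HTTPD_CONTAINER = UseCase("Apache HTTP Server Container", "container", ("httpd",))

print(sorted(resolve(HTTPD_CONTAINER, REPO)))
print(total_size(HTTPD_CONTAINER, REPO))
```

Keeping the context on the use case itself makes it possible to evaluate the same package set differently for, say, a container and an IoT image.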
Collect specific use cases by talking to people at tech events, internet forums, and any other viable venues.
Extend monitoring services (Feedback Pipeline) that:
- Visualize dependencies and a total size for each use case
- Monitor size changes over time
- Auto-detect large size changes
- Notify maintainers about unexpected size increases
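Auto-detecting large size changes could work roughly as in the following sketch. It assumes a per-use-case history of total sizes and a relative threshold; the function name, the 5% default, and the sample data are assumptions for illustration and do not describe Feedback Pipeline's actual logic.

```python
def detect_size_increases(history, threshold_pct=5.0):
    """Flag measurements that grew by more than threshold_pct over the previous one.

    history is a chronologically ordered list of (label, size_mib) tuples.
    Returns a list of (label, previous_size, new_size, change_pct) tuples.
    """
    alerts = []
    for (_, prev_size), (label, size) in zip(history, history[1:]):
        if prev_size <= 0:
            continue  # cannot compute a relative change from an empty baseline
        change_pct = (size - prev_size) / prev_size * 100
        if change_pct > threshold_pct:
            alerts.append((label, prev_size, size, round(change_pct, 1)))
    return alerts


# Hypothetical measurements for one use case.
HISTORY = [("week 1", 100.0), ("week 2", 102.0), ("week 3", 153.0)]

for label, before, after, pct in detect_size_increases(HISTORY):
    print(f"{label}: {before} MiB -> {after} MiB (+{pct}%)")
```

Each returned alert carries enough context (previous size, new size, relative change) to be turned into a notification or a bug report for the maintainer.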
Beyond features, we also need to:
- write tests to significantly simplify contribution
- do performance optimizations for the service to scale well
- explore the use of CI and Rawhide Gating
Being able to see what’s going on is a prerequisite for implementing any changes. Seeing all the relevant opportunities helps us focus on the ones with the most impact, and transparent tracking helps us prove the usefulness of our work and further focus on the most impactful activities.
Minimize the installation size of the use cases by optimizing RPM dependencies, features, software architecture, and other factors. Specifically, look for:
- Unnecessary RPM dependencies (although there probably won’t be many)
- Multiple implementations of the same functionality required by various packages. Try to make them use the same one.
- Context-specific requirements. For example, requiring Systemd is fine on traditional deployments, but requiring it in containers means a significant size increase. Leverage weak dependencies in those cases (that might require code changes).
- Dependencies on large things while only using a fraction of the functionality, such as requiring the whole Perl stack to run a single script. Such a script can be rewritten in Python, which is present almost everywhere, mostly because of DNF.
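The effect of turning a hard requirement into a weak dependency can be illustrated with a toy resolver. This is a simplified sketch, not DNF's algorithm: the package names and dependency data are hypothetical, and the `install_weak_deps` parameter only mirrors the name of the DNF configuration option that controls whether weak dependencies are installed.

```python
def install_set(roots, requires, recommends, install_weak_deps):
    """Return the set of packages installed for the given root packages.

    Hard dependencies (requires) are always followed. Weak dependencies
    (recommends) are followed only when install_weak_deps is True.
    """
    installed = set()
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in installed:
            continue
        installed.add(name)
        pending.extend(requires.get(name, ()))
        if install_weak_deps:
            pending.extend(recommends.get(name, ()))
    return installed


# Hypothetical data: "myservice" only recommends systemd instead of requiring it.
REQUIRES = {"myservice": ["glibc"], "systemd": ["systemd-libs", "glibc"]}
RECOMMENDS = {"myservice": ["systemd"]}

traditional = install_set(["myservice"], REQUIRES, RECOMMENDS, install_weak_deps=True)
container = install_set(["myservice"], REQUIRES, RECOMMENDS, install_weak_deps=False)

print(sorted(traditional))
print(sorted(container))
```

With the weak dependency, a traditional installation still gets Systemd by default, while a container build that disables weak dependencies installs only what the package strictly needs.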
Engage with upstream developers regarding bigger changes in packaging and architecture. An example is Systemd and splitting out systemd-sysusers into its own package.
Implement process and policy changes reflecting bigger, more general changes. Again, a good example is using Systemd in containers, or the general issue of creating users in containers.
Provide guidance for the Fedora community in the form of blog posts, videos, and conference talks. Even though we might have guidelines and policies in place, spreading the word is always important.
Resources and Inputs
Cloud resources to prototype services. We are not going to change the existing Fedora infrastructure in any way until whatever we develop proves useful and worth the hassle of stabilization and of changing production.
No existing Fedora Infra or Release Engineering resources are needed at the moment. However, we might need help with setting up (or getting access to) the cloud resources.
Active support from our maintainers, the FPC, and other community members is definitely needed. This is obviously not something we can "request", but it’s still a necessary input.
Guiding Principles
Usefulness over size: There is a balance between usefulness and size. We keep that in mind and will not implement drastic changes that would prevent our users from using Fedora. However, nothing prevents us from producing additional, very specific and minimal artifacts.
Using RPM: We’re doing this with RPM. We’re not achieving minimization by deleting files after installation. This might be obvious, but still worth mentioning.
First Phase Accomplishments
See the status page for detailed info and historic weekly updates. Summary below.
Better understanding: We now have a much better understanding of the problem and a more specific idea of the next steps.
Feedback Pipeline: A service that monitors use cases for size and dependencies. It includes various views in tables and interactive dependency graphs.
Systemd and containers: We dug into the issue of Systemd vs. containers, especially for packages that require it just to create users in containers using systemd-sysusers. We are working with upstream on splitting that package out. We have thought about a wider policy around this but have not proposed one yet.
Policy thinking:
- A: If systemd is only needed to start services, a package should only "Recommend" systemd. This will allow containers to install the package without systemd.
- B: If a program is just using a library of systemd, only require systemd-libs. Example: libusb
- C: If a package wants to use systemd-sysusers to create users/groups, only require systemd-sysusers. (NOTE: This subpackage isn’t implemented yet)
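The three policy ideas above can be read as a simple decision rule. The helper below is a hypothetical sketch of that rule, not an existing Fedora tool; the function and parameter names are made up, and the systemd-sysusers subpackage it refers to is, as noted, not implemented yet.

```python
def suggested_systemd_deps(starts_services=False, links_libsystemd=False,
                           creates_users=False):
    """Map how a package uses systemd to the dependency lines policy ideas A-C suggest."""
    deps = []
    if links_libsystemd:
        # Idea B: only a systemd library is used.
        deps.append("Requires: systemd-libs")
    if creates_users:
        # Idea C: users/groups are created via systemd-sysusers
        # (this subpackage is a proposal and does not exist yet).
        deps.append("Requires: systemd-sysusers")
    if starts_services:
        # Idea A: systemd is only needed to start services, so a weak dependency suffices.
        deps.append("Recommends: systemd")
    return deps


print(suggested_systemd_deps(starts_services=True, creates_users=True))
```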
initial-setup: If an image is built without users, there needs to be some way to add a user at startup. initial-setup does a good job of that, but at the expense of size. It pulls in anaconda-tui and anaconda-core, and those two packages then pull in a lot of other, rather large, packages. This affects the IoT images as well as others. We do not have a recommendation yet, but it is being worked on.
Use pcre2 instead of pcre: The minimization effort is trying to trim things down to just one pcre implementation, and that is pcre2.
Polkit and mozjs60: Let’s explain this one with a terrible analogy! Polkit is this lovely person (.5M) that rings your doorbell and says they will wash the windows of your house. After you agree, they bring out their elephant (mozjs60, 30M) and use it to spray your windows with water. In other words, Polkit pulls in mozjs60, which is a rather large package. So, we’re trying to sort this one out, too.