Fedora is applying to be a GSoC mentoring organization.

If you are a student looking forward to participating in Google Summer of Code with Fedora, please feel free to browse this idea list. There may be additional ideas added during the application period.

Now please go read the What Can I do Today section of the main page. It answers common questions and tells you how to apply.

Do not hesitate to contact the mentors or contributors listed on this page with any questions or requests for clarification. You can find helpful people on the IRC channel, or use the mailing list. can be used for getting help with programming problems.

Supporting Mentors

The following contributors are available to provide general help and support for the GSoC program. If a specific project mentor is busy, you can contact one of the people below for short-term help on your project or task. (Supporting mentors: add yourselves and your wiki page.)

  • Sumantro Mukherjee (General development, general Linux, Fedora community, GSoC alumnus, questions about program, misc. advice)

  • Fernando F. Mancera (GSoC, general Linux, Fedora community, Mentoring, Networking)

Idea list

Ideas are subject to change as additional mentors are onboarded.

AI-Powered Log Triage and Security Alert Aggregator for Fedora

  • Difficulty : Easy

  • Type : 1 person full time 350hrs (12 weeks)

  • Technology : Python, Bash, scikit-learn, PyTorch, TensorFlow, security, AI, LLMs

  • Mentor : Huzaifa Sidhpurwala

  • Email : huzaifas@redhat.com

Description

This project aims to automatically parse, classify, and prioritize security-related logs on a Fedora system. The tool will aggregate logs from multiple sources (e.g., SELinux, systemd journal, audit logs) and apply basic machine learning (ML) or natural language processing (NLP) techniques to identify and prioritize potential security events. It will help administrators quickly spot critical alerts while reducing noise from routine messages.
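As a rough illustration of the collect, score, and prioritize flow described above, the following is a minimal sketch and not the project's design. It assumes entries arrive as JSON lines in the format produced by `journalctl -o json`, where `PRIORITY` holds the syslog level as a string (0 is most severe, 7 is debug). The keyword list and the two-level promotion are hypothetical placeholders for whatever ML or NLP model the intern ends up building.

```python
import json

# Hypothetical keyword list; a real tool would learn these signals from data.
SECURITY_KEYWORDS = ("denied", "authentication failure", "segfault")


def score(entry):
    """Return a sortable urgency score: lower means more urgent."""
    # journald stores the syslog priority (0 = emerg ... 7 = debug) as a string.
    priority = int(entry.get("PRIORITY", 6))
    message = str(entry.get("MESSAGE", "")).lower()
    # Promote messages that mention a security keyword by two levels.
    if any(keyword in message for keyword in SECURITY_KEYWORDS):
        priority = max(0, priority - 2)
    return priority


def triage(json_lines):
    """Parse journal JSON lines and return the entries sorted by urgency."""
    entries = [json.loads(line) for line in json_lines if line.strip()]
    return sorted(entries, key=score)
```

One way to try it is to pipe `journalctl -o json --since today` into a small script that passes `sys.stdin` to `triage()`. Audit and SELinux sources would need their own parsers that produce the same dictionary shape.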

Deliverables

As a GSoC intern, you will be responsible for the following:

  • Source Code Repository: A publicly accessible GitHub/GitLab project containing all scripts, models, and integration logic.

  • Packaged RPM: A Fedora-compliant RPM package that users can install to deploy the log triage tool.

  • Documentation: Concise instructions covering installation, usage, configuration, and development/contribution guidelines.

  • Demonstration/Prototype: A working setup (CLI or basic UI) showcasing how logs are collected, classified, and prioritized in real time.

  • Testing & Evaluation Results: A set of tests (unit/integration) plus any benchmarking or evaluation reports on model performance and accuracy.
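For the evaluation report, plain accuracy can be misleading when critical events are rare compared to routine messages, so per-class precision and recall are worth reporting as well. The sketch below shows one way to compute them with the standard library only; the label names used in the example call are hypothetical, and scikit-learn offers equivalent metrics if it is already a dependency.

```python
from collections import Counter


def precision_recall(y_true, y_pred, positive):
    """Compute precision and recall for a single class label."""
    counts = Counter()
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            counts["tp"] += 1  # correctly flagged
        elif guess == positive:
            counts["fp"] += 1  # flagged, but actually another class
        elif truth == positive:
            counts["fn"] += 1  # missed
    predicted = counts["tp"] + counts["fp"]
    actual = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted if predicted else 0.0
    recall = counts["tp"] / actual if actual else 0.0
    return precision, recall
```

For example, `precision_recall(truth_labels, model_labels, "critical")` tells you how many alerts flagged as critical were real (precision) and how many real critical events were caught (recall).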


Aggregating Essential SRPM Components Repository for LLM training

  • Difficulty : Easy

  • Type : 1 person full time 350hrs (12 weeks)

  • Technology : Python, Bash, RPM, Linux, Git

  • Mentor : Mohammadreza Hendiani

  • Email : man2dev@proton.me

Description

This project aims to develop an automated system that downloads, extracts, and normalizes the essential SRPM components, including spec files, patches, and build scripts, from multiple upstream repositories. Key sources include the official Fedora repositories and RPM Fusion; Azure Linux RPM, Terra, and COPR repositories remain optional. The resulting repository will serve as a high-quality dataset for training or fine-tuning large language models (LLMs) to automatically generate compliant RPM spec files and build instructions. The actual source archives (e.g. .tar.gz, .tar.bz2, .tar.xz) are omitted, since the spec file provides the URL needed to download them if required.
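To illustrate why the source archives can be left out, here is a minimal sketch that reads the `Name`, `Version`, `SourceN`, and `PatchN` tags from spec file text and expands only the `%{name}` and `%{version}` macros. It is a simplification under stated assumptions: real spec files use many more macros and conditionals, so a production pipeline would more likely rely on RPM's own tooling (for example `rpmspec`) for full expansion. The function name `parse_spec` is hypothetical.

```python
import re

# Match selected "Tag: value" lines anywhere in the spec text.
TAG_RE = re.compile(
    r"^(Name|Version|Source\d*|Patch\d*):[ \t]*(.+?)[ \t]*$", re.MULTILINE
)
MACRO_RE = re.compile(r"%\{(\w+)\}")


def parse_spec(text):
    """Collect selected tags and expand %{name}/%{version} in their values."""
    tags = {key: value for key, value in TAG_RE.findall(text)}
    macros = {"name": tags.get("Name", ""), "version": tags.get("Version", "")}

    def expand(value):
        # Macros this sketch does not know about are left untouched.
        return MACRO_RE.sub(lambda m: macros.get(m.group(1), m.group(0)), value)

    return {key: expand(value) for key, value in tags.items()}
```

Given a spec whose `Source0` is written as a URL ending in `%{name}-%{version}.tar.gz`, the returned dictionary contains the fully expanded URL, which is enough to fetch the archive again on demand.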

Deliverables

  • Research and Requirements Analysis: Survey the structure of SRPMs from Fedora and RPM Fusion and define the schema and normalization rules for spec files, patches and related build scripts.

  • Data Extraction and Normalization: Scripts to download SRPMs, then extract and normalize spec files (.spec), patch files (.patch), and other build scripts.

  • Repository Aggregation: Consolidate the normalized SRPMs into a mono-repository, maintaining commit histories and metadata.

  • Testing and Validation: Develop automated tests to verify that all expected components are correctly extracted and normalized.

  • Documentation and Community Outreach: Create comprehensive documentation describing the extraction process, normalization rules and repository structure. Engage with the RPM packaging community to refine the approach and gather feedback.
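The extraction and validation deliverables above could start from something like the sketch below. It assumes an SRPM has already been unpacked into a directory (commonly done with `rpm2cpio package.src.rpm | cpio -idm`), then groups the files and skips source archives. The suffix list, the function names, and the single validation rule are illustrative assumptions; the actual normalization rules are for the project to define.

```python
from pathlib import Path

# Suffix list is an assumption for illustration; the project defines the real rules.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tar.xz")


def select_components(extracted_dir):
    """Group the files of one unpacked SRPM, skipping source archives."""
    groups = {"spec": [], "patch": [], "other": []}
    for path in sorted(Path(extracted_dir).iterdir()):
        if not path.is_file() or path.name.endswith(ARCHIVE_SUFFIXES):
            continue
        if path.suffix == ".spec":
            groups["spec"].append(path.name)
        elif path.suffix == ".patch":
            groups["patch"].append(path.name)
        else:
            groups["other"].append(path.name)
    return groups


def validate(groups):
    """Return a list of problems; an empty list means the layout looks sane."""
    problems = []
    if len(groups["spec"]) != 1:
        problems.append(
            f"expected exactly one spec file, found {len(groups['spec'])}"
        )
    return problems
```

Running `validate(select_components(path))` over every unpacked package gives a simple automated check that can grow as more normalization rules are agreed with the packaging community.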