License audit tooling for Fedora packages
This page describes some of the tools that have been used to audit licensing of packages in Fedora Linux.
The following tools are specifically designed for use with Fedora Linux packages. They use the Fedora License Data as a source of data on valid licenses.
rpmlint is the standard tool for checking Fedora Linux packages for well-known issues that packagers should fix.
In the context of licensing, rpmlint evaluates the License: field in the spec file and ensures the values conform to the known set of allowed licenses.
This is packaged in Fedora Linux as
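The kind of License: check described above can be illustrated with a minimal Python sketch. This is not rpmlint's implementation: the `ALLOWED` set is a small hypothetical stand-in for the Fedora License Data, the helper names are invented for illustration, and the tokenizer only understands the `AND`, `OR` and `WITH` operators and parentheses, treating exception identifiers like any other identifier.

```python
import re

# Hypothetical stand-in for the allowed identifiers in the Fedora License Data.
ALLOWED = {"MIT", "Apache-2.0", "GPL-2.0-or-later", "BSD-3-Clause"}
OPERATORS = {"AND", "OR", "WITH"}


def license_field(spec_text):
    """Return the value of the first License: tag in a spec file, or None."""
    match = re.search(r"^License:[ \t]*(.+)$", spec_text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def unknown_identifiers(expression, allowed=ALLOWED):
    """List identifiers in a license expression that are not in `allowed`."""
    tokens = expression.replace("(", " ").replace(")", " ").split()
    return [t for t in tokens if t not in OPERATORS and t not in allowed]


# Made-up spec fragment; "Frobnicate-1.0" is deliberately not a real license.
spec = """Name: example
Version: 1.0
License: MIT AND (Apache-2.0 OR Frobnicate-1.0)
"""
print(unknown_identifiers(license_field(spec)))
```

Any identifier the function returns would need to be checked against the real allowed and not-allowed lists before the package is accepted.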
rpminspect is the tool used to evaluate Fedora Linux packages for policy compliance, differences as compared to previous builds, and common packaging errors as they are built in the Fedora Build System.
In the context of licensing, rpminspect evaluates the License: field in RPMs and ensures the values conform to the known set of allowed licenses.
This is packaged in Fedora Linux as
rpminspect. To use it, you need both
The following tools have been used by Fedora Project contributors to analyze the licensing of current and proposed Fedora Linux packages. All of these tools are distribution-agnostic.
Licensecheck is a tool used to analyze the licensing of source files. In Fedora, it is principally used during the initial review of packages proposed for inclusion in Fedora Linux. Licensecheck is run automatically as part of FedoraReview.
By default, licensecheck provides license reports with full license names, but can be used to produce output using any number of license identifier schemes.
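As a rough sketch of how a scan might be scripted from Python, the helper below builds a licensecheck command line and runs it only if the tool is installed. It assumes the `--recursive` and `--shortname-scheme` options described in the licensecheck manual page; check `licensecheck --help` for the version you have. The function names are hypothetical.

```python
import shutil
import subprocess


def licensecheck_command(path, scheme=None):
    """Build a licensecheck argument list for a recursive scan of `path`."""
    cmd = ["licensecheck", "--recursive"]
    if scheme:
        # Request short identifiers (for example "spdx") instead of full names.
        cmd.append(f"--shortname-scheme={scheme}")
    cmd.append(path)
    return cmd


def run_licensecheck(path, scheme=None):
    """Run licensecheck if it is installed; return its stdout, or None."""
    if shutil.which("licensecheck") is None:
        return None
    result = subprocess.run(
        licensecheck_command(path, scheme),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


print(licensecheck_command(".", scheme="spdx"))
```

Calling `run_licensecheck("path/to/unpacked/source", scheme="spdx")` would then return the report text, or `None` when licensecheck is not on the `PATH`.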
This is packaged in Fedora Linux as
SPDX-license-diff is a Firefox and Chromium/Chrome plugin that takes license text you highlight on a web page and attempts to find close matches to license identifiers or exception identifiers on the SPDX License List. If a match to an SPDX identifier is presented as less than 100%, SPDX-license-diff will display differences between your highlighted text and SPDX’s plain text rendition of the identifier.
SPDX-license-diff will obviously be inconvenient if there is no web interface to the upstream source repository of your package, or your workflow does not involve use of a web browser.
Another limitation of SPDX-license-diff is that it does not fully implement the SPDX matching guidelines. As a result, SPDX-license-diff will typically show textual differences in cases where the highlighted text actually is a match to the SPDX identifier. In cases of close matches, it is generally useful and often necessary to check the XML file for the SPDX identifier in the SPDX license-list-XML repository. For example, many SPDX identifier XML files make use of regular expressions. Bear in mind that the SPDX matching guidelines include rules which are not necessarily reflected in these XML files.
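To see why a plain textual diff can report differences for text that the matching guidelines treat as equivalent, consider the simplified Python sketch below. It applies only two normalizations, collapsing whitespace and ignoring case, which is far from the full set of SPDX matching rules, and it is not how SPDX-license-diff works internally.

```python
import difflib
import re


def normalize(text):
    """Lowercase the text and collapse every run of whitespace to one space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def similarity(a, b, normalized=True):
    """Return a 0..1 similarity ratio between two license texts."""
    if normalized:
        a, b = normalize(a), normalize(b)
    return difflib.SequenceMatcher(None, a, b).ratio()


# Same wording, but different line wrapping and capitalization.
upstream = "Permission is hereby granted,\n   free of charge, to any person"
reference = "PERMISSION IS HEREBY GRANTED, FREE OF CHARGE, TO ANY PERSON"

print(similarity(upstream, reference, normalized=False))  # below 1.0
print(similarity(upstream, reference))                    # 1.0
```

The raw comparison reports a mismatch purely because of wrapping and capitalization, while the normalized comparison treats the two texts as identical.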
If SPDX-license-diff identifies a license or exception text as a match to an SPDX identifier, you can then use the SPDX identifier to search in the allowed and not-allowed license lists for Fedora.
SPDX Check License is a web application (source code) that displays SPDX License List matches to a license or exception text pasted into a text box. As with SPDX-license-diff, the tool does not fully implement the SPDX matching guidelines. This tool may take more time to give an answer than SPDX-license-diff. It will say whether there is a match, or a close match, to an identifier, but it doesn’t display a diff.
Askalono, packaged in Fedora as askalono-cli, is a simple license scanning tool written in
Rust. It is most useful for quick analysis of packages coming out of
ecosystems featuring projects known to have (1) highly standardized
approaches to layout of license information (it specifically looks
only for files that are named LICENSE or COPYING or some obvious
variant on those), (2) generally simple license makeup, and (3)
cultural preferences for a highly limited set of licenses (for
example, Rust crates that don’t bundle legacy C code, Go modules, or
Node.js npm packages).
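The file-name restriction mentioned above can be pictured with a small Python sketch. The pattern is a hypothetical approximation of "LICENSE or COPYING or some obvious variant" and is not askalono's actual rule set.

```python
import re
from pathlib import Path

# Hypothetical pattern: LICENSE, LICENCE or COPYING, optionally followed by a
# suffix such as ".txt", ".md" or "-MIT". Not askalono's actual rule set.
LICENSE_NAME = re.compile(r"^(licen[sc]e|copying)([._-].*)?$", re.IGNORECASE)


def looks_like_license_file(name):
    """Return True if a bare file name resembles a top-level license file."""
    return LICENSE_NAME.match(name) is not None


def candidate_license_files(root):
    """Yield paths under `root` whose file names resemble license files."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and looks_like_license_file(path.name):
            yield path


for name in ["LICENSE", "COPYING.txt", "LICENSE-MIT", "README.md", "main.rs"]:
    print(name, looks_like_license_file(name))
```

A scanner that selects files this way never opens `README.md` or `main.rs`, so any license text or notice that lives only in those files goes unseen.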
Askalono has some significant shortcomings. It can’t recognize or understand: (1) license notices or license texts that are comments in source files, (2) license notices or license texts in README files, (3) license files that contain multiple license texts (at best it will recognize only the first of them), and (4) nonstandard, archaic, or legacy licenses (which covers most of the licenses being reviewed in issues in fedora-license-data).
FOSSology is a license compliance software system and toolkit that includes license scanning. The information here focuses on that aspect of the toolkit. It can be run locally and can also be set up as a hosted service. See Get Started for ways to install it and for a link to a test instance that anyone can use.
FOSSology is good for scanning an entire package for licenses or text that looks like licenses. Files can be viewed easily in the FOSSology interface. FOSSology has the ability to remember past license inspection decisions.
Tips on using FOSSology:
* In options: #5 - check "Ignore SCM files"; #7 - check Monk, Nomos, Ojo License Analysis and Package Analysis; #8 - check the first two options re: "Scanners matches…"
* Go to the License Browser view. Look for license matches that are suspicious or unexpected, such as results that are not an SPDX identifier or that are ambiguous. You can then view the files with those matches and inspect what was found to determine whether there is a license that needs to be recorded or whether it is a false match. Basic Workflow has some helpful information.
FOSSology is not packaged in Fedora.
ScanCode toolkit is a command-line Python tool and library for detecting licenses, copyrights, package metadata and related information in source and binary code. ScanCode detects licenses by doing a diff against a database of 37,000 license texts and notices.
ScanCode output includes SPDX and a variety of other formats including JSON, YAML, HTML, CSV, CycloneDX and Debian machine readable copyright files.
ScanCode reports detected license information using valid SPDX license expressions (and also ScanCode’s own extended list of license keys that includes proprietary licenses that are not accepted in the SPDX list).
ScanCode applies the SPDX matching guidelines when possible and provides strictly SPDX-conformant license expressions in its SPDX output, together with extensive details of the matched texts and license detections.
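A JSON report can be post-processed with a few lines of Python, for example to count files per detected SPDX expression. The sketch below works on a hypothetical, hand-written excerpt; the field names `files`, `path` and `detected_license_expression_spdx` are assumptions about the report layout and should be checked against the output of the ScanCode version in use.

```python
import json
from collections import Counter

# Hypothetical excerpt shaped like a ScanCode JSON report; the field names
# "files", "path" and "detected_license_expression_spdx" are assumptions.
SAMPLE_REPORT = """
{
  "files": [
    {"path": "src/main.c", "detected_license_expression_spdx": "MIT"},
    {"path": "src/util.c", "detected_license_expression_spdx": "MIT"},
    {"path": "vendor/z.c", "detected_license_expression_spdx": "Zlib"},
    {"path": "README.md", "detected_license_expression_spdx": null}
  ]
}
"""


def spdx_summary(report):
    """Count files per detected SPDX expression, skipping files with none."""
    counts = Counter()
    for entry in report.get("files", []):
        expression = entry.get("detected_license_expression_spdx")
        if expression:
            counts[expression] += 1
    return counts


summary = spdx_summary(json.loads(SAMPLE_REPORT))
for expression, count in summary.most_common():
    print(f"{expression}: {count}")
```

A summary like this gives a quick view of which expressions dominate a package and which rare ones deserve a closer look.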
ScanCode toolkit is also embedded in other FOSS tools such as FOSSology, tern, ORT and Fosslight.
ScanCode toolkit packaging in Fedora is in progress. It can be installed using pip or an app tarball.
ScanCode.io is a web-based application to run ScanCode toolkit pipelines on multiple projects.
A pipeline is a script to organize the code analysis of large codebases, binaries, containers (or single packages) with optional code matching against an index of pre-existing FOSS code.
ScanCode.io is good for scanning an entire package for license notices, license texts, and clues that look like licenses. Files can be viewed easily in the ScanCode.io interface. See Documentation to get started.
ScanCode.io is not yet packaged in Fedora. It can be installed using containers and podman.