feat(sbom): build purls with packageurl-python #1032

Open
mprpic wants to merge 1 commit into python-wheel-build:main from mprpic:granular-purl-config

@mprpic mprpic commented Apr 7, 2026

Pull Request Description

What

Use the packageurl-python library to construct purls instead of manual string building. Support two modes for purl specification in package settings:

  • Full purl string (purl field) used as-is, e.g. pkg:generic/my-fork@1.0.0
  • Individual field overrides (purl_type, purl_namespace, purl_name, purl_version) that override the default construction from global settings and package identity

The two modes are mutually exclusive, enforced by a model validator. A default purl (pkg:pypi/<name>@<version>) is now always generated. The global purl_type setting (default: pypi) controls the default type for all packages.
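To illustrate the mutual exclusion, here is a minimal sketch that uses a plain dataclass instead of the real settings model. `PackagePurlSettings` is a hypothetical name, and the check in `__post_init__` only stands in for the model validator; the field names are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PackagePurlSettings:
    """Hypothetical stand-in for the per-package settings model."""

    purl: Optional[str] = None
    purl_type: Optional[str] = None
    purl_namespace: Optional[str] = None
    purl_name: Optional[str] = None
    purl_version: Optional[str] = None

    def __post_init__(self) -> None:
        # Plays the role of the model validator: a full purl string
        # cannot be combined with individual field overrides.
        overrides = (
            self.purl_type,
            self.purl_namespace,
            self.purl_name,
            self.purl_version,
        )
        if self.purl is not None and any(v is not None for v in overrides):
            raise ValueError(
                "purl is mutually exclusive with purl_type, purl_namespace, "
                "purl_name and purl_version"
            )
```

Setting only `purl`, or only some of the `purl_*` fields, is accepted; setting both raises `ValueError`.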

Add repository_url to SbomSettings as a global purl qualifier (e.g. ?repository_url=https://packages.redhat.com) that is added to every downstream purl. Per-package repository_url overrides the global value.

Downstream purl construction cascades, highest precedence first:
per-package purl (full override) >
per-package field overrides (purl_type, etc.) >
global defaults from SbomSettings
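The cascade can be sketched roughly as follows. This is not the code in the PR: `GlobalSbomSettings`, `PackageOverrides`, `format_purl` and `build_downstream_purl` are hypothetical names, and `format_purl` is a dependency-free stand-in for packageurl-python that does no encoding or normalization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlobalSbomSettings:
    """Hypothetical subset of the global SBOM settings."""

    purl_type: str = "pypi"
    repository_url: Optional[str] = None


@dataclass
class PackageOverrides:
    """Hypothetical per-package overrides."""

    purl: Optional[str] = None
    purl_type: Optional[str] = None
    purl_namespace: Optional[str] = None
    purl_name: Optional[str] = None
    purl_version: Optional[str] = None
    repository_url: Optional[str] = None


def format_purl(type_, namespace, name, version, repository_url=None) -> str:
    # Simplified stand-in for packageurl-python (assumption): no
    # percent-encoding, no name normalization.
    path = f"{namespace}/{name}" if namespace else name
    purl = f"pkg:{type_}/{path}@{version}"
    if repository_url:
        purl += f"?repository_url={repository_url}"
    return purl


def build_downstream_purl(name, version, settings, overrides=None) -> str:
    ov = overrides or PackageOverrides()
    # 1. A full per-package purl wins and is used as-is.
    if ov.purl is not None:
        return ov.purl
    # 2. Individual field overrides, 3. falling back to global defaults
    #    and the package identity.
    return format_purl(
        ov.purl_type or settings.purl_type,
        ov.purl_namespace,
        ov.purl_name or name,
        ov.purl_version or version,
        ov.repository_url or settings.repository_url,
    )


settings = GlobalSbomSettings(repository_url="https://packages.redhat.com")
print(build_downstream_purl("markupsafe", "3.0.3", settings))
# pkg:pypi/markupsafe@3.0.3?repository_url=https://packages.redhat.com
```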

Why

This lets us define relationships between upstream and downstream components and gives us finer control over how the downstream purl is constructed (for cases where our wheel has a different name, version, or other attributes). See #1031 for more examples.

Closes #1031

@mprpic mprpic requested a review from a team as a code owner April 7, 2026 00:49
@mergify mergify bot added the ci label Apr 7, 2026
@mprpic mprpic force-pushed the granular-purl-config branch from 3b36a55 to c6a45bf Compare April 7, 2026 13:45
coderabbitai bot commented Apr 7, 2026

📝 Walkthrough

This pull request adds upstream source package identification and GENERATED_FROM relationships to SBOMs. It introduces a PurlConfig model for per-package purl component overrides, extends SbomSettings with global purl_type and repository_url, and refactors purl construction into _build_downstream_purl and _build_upstream_purl. generate_sbom now emits two SPDX package entries—a downstream wheel (may include repository_url qualifier) and an upstream source (no qualifier)—linked by a GENERATED_FROM relationship. Tests were updated to exercise structured and string purl overrides and repository qualifier behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely describes the main change: building purls using the packageurl-python library instead of manual string construction. |
| Description check | ✅ Passed | The description comprehensively explains what the PR does, why it matters, and covers both purl modes, cascade logic, and repository_url handling, all directly related to the changeset. |
| Linked Issues check | ✅ Passed | The PR implements all key requirements from #1031: packageurl-python integration, two purl-build functions with cascade logic, upstream/downstream package entries, GENERATED_FROM relationships, and repository_url qualifier support. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to implementing packageurl-python support and the upstream/downstream purl architecture; no unrelated modifications detected. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fromager/sbom.py`:
- Around line 62-86: The upstream derivation ignores downstream string purl
overrides because pc is None; modify _build_upstream_purl to detect when the
downstream purl was provided as a full string (use the same logic/path as
_build_downstream_purl to obtain the effective downstream purl), parse it into a
PackageURL, and derive the upstream from that parsed PackageURL (preserving
type/namespace/name/version but omitting qualifiers like repository_url) instead
of falling back to sbom_settings defaults; reference _build_upstream_purl,
_build_downstream_purl, pbi.purl_config, and PackageURL to locate where to parse
the downstream string and build the upstream.

In `@tests/test_sbom.py`:
- Around line 90-104: Update the test_generate_sbom_full_purl_override test to
also assert the upstream purl is derived from the override scheme rather than
defaulting to the pypi purl: after calling sbom.generate_sbom and selecting
wheel = doc["packages"][0], add an assertion that the upstream externalRef
referenceLocator equals "pkg:generic/test-pkg@1.0.0" (i.e. the override scheme +
the original package name/version). This targets the behavior in
sbom.generate_sbom when package_overrides contains a full purl string.
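For reference, the derivation asked for in the first finding could look roughly like the sketch below. It works on the purl string directly so that it stays dependency-free; the finding itself suggests parsing into a `PackageURL`. `strip_purl_qualifiers` is a hypothetical helper, and it assumes that dropping every qualifier (not only `repository_url`) is acceptable for the upstream purl.

```python
def strip_purl_qualifiers(purl: str) -> str:
    """Return the purl without its ?qualifiers part, keeping any #subpath."""
    # A purl has the shape scheme:type/namespace/name@version?qualifiers#subpath
    base, sep, subpath = purl.partition("#")
    base = base.split("?", 1)[0]
    return f"{base}{sep}{subpath}"


downstream = "pkg:generic/test-pkg@1.0.0?repository_url=https://packages.redhat.com"
print(strip_purl_qualifiers(downstream))  # pkg:generic/test-pkg@1.0.0
```

With this approach a full string override such as `pkg:generic/test-pkg@1.0.0` keeps its type, name and version in the upstream purl instead of falling back to the global defaults.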
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 75666aca-27d6-45e9-9049-6096c4edd405

📥 Commits

Reviewing files that changed from the base of the PR and between d86f938 and c6a45bf.

📒 Files selected for processing (7)
  • pyproject.toml
  • src/fromager/packagesettings/__init__.py
  • src/fromager/packagesettings/_models.py
  • src/fromager/packagesettings/_pbi.py
  • src/fromager/sbom.py
  • tests/conftest.py
  • tests/test_sbom.py

…e identification

Use the packageurl-python library to construct purls instead of manual
string building. Introduce a PurlConfig model that consolidates all
purl-related per-package settings into a single field. When set to a
string, it is used as the full downstream purl. When set to a PurlConfig
object, individual fields (type, namespace, name, version,
repository_url, upstream) override specific purl components
while defaulting the rest from global SbomSettings.

Add upstream source identification to the SBOM. Each document now
contains two package entries linked by a GENERATED_FROM relationship:
- SPDXRef-wheel: the downstream wheel with repository_url qualifier
- SPDXRef-upstream: the original source package without qualifiers

The upstream purl is auto-derived by stripping repository_url from the
downstream purl. For packages sourced from GitHub/GitLab, an explicit
upstream purl can be set via PurlConfig.upstream.

Add repository_url to SbomSettings as a global purl qualifier
(e.g. ?repository_url=https://packages.redhat.com) added to every
downstream purl. Per-package PurlConfig.repository_url overrides it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Martin Prpič <mprpic@redhat.com>
@mprpic mprpic force-pushed the granular-purl-config branch from c6a45bf to ad732c7 Compare April 7, 2026 18:28

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/test_sbom.py (1)

116-132: Consider asserting upstream purl for completeness.

This test verifies individual field overrides affect the downstream purl, but doesn't assert the upstream derivation. Adding one line would strengthen coverage:

     wheel = doc["packages"][0]
+    upstream = doc["packages"][1]
     assert wheel["externalRefs"][0]["referenceLocator"] == (
         "pkg:generic/custom-name@1.0.0"
     )
+    assert upstream["externalRefs"][0]["referenceLocator"] == (
+        "pkg:generic/custom-name@1.0.0"
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_sbom.py` around lines 116 - 132, Add an assertion in
test_generate_sbom_purl_field_overrides to also verify the upstream/original
purl is preserved; after obtaining wheel = doc["packages"][0], assert that the
upstream purl equals the expected original purl derived from
Requirement("test-pkg==1.0.0") (e.g., "pkg:pypi/test-pkg@1.0.0") so the test
checks both the overridden externalRefs value and the original upstream purl in
the generated SBOM.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bf06ffce-1dcc-4aee-b3df-656a23d42408

📥 Commits

Reviewing files that changed from the base of the PR and between c6a45bf and ad732c7.

📒 Files selected for processing (7)
  • pyproject.toml
  • src/fromager/packagesettings/__init__.py
  • src/fromager/packagesettings/_models.py
  • src/fromager/packagesettings/_pbi.py
  • src/fromager/sbom.py
  • tests/conftest.py
  • tests/test_sbom.py
✅ Files skipped from review due to trivial changes (1)
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/fromager/packagesettings/__init__.py
  • tests/conftest.py
  • src/fromager/packagesettings/_models.py

@mprpic mprpic requested a review from tiran April 7, 2026 20:14

mprpic commented Apr 7, 2026

For markupsafe in builder, this now generates:

{
  "spdxVersion": "SPDX-2.3",
  "dataLicense": "CC0-1.0",
  "SPDXID": "SPDXRef-DOCUMENT",
  "name": "markupsafe-3.0.3",
  "documentNamespace": "https://www.redhat.com/markupsafe-3.0.3.spdx.json",
  "creationInfo": {
    "created": "2026-04-07T20:12:54Z",
    "creators": [
      "Organization: Red Hat",
      "Tool: fromager-0.1.dev2074+gad732c737"
    ]
  },
  "packages": [
    {
      "SPDXID": "SPDXRef-wheel",
      "name": "markupsafe",
      "versionInfo": "3.0.3",
      "downloadLocation": "NOASSERTION",
      "supplier": "Organization: Red Hat",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:pypi/markupsafe@3.0.3?repository_url=https://packages.redhat.com"
        }
      ]
    },
    {
      "SPDXID": "SPDXRef-upstream",
      "name": "markupsafe",
      "versionInfo": "3.0.3",
      "downloadLocation": "NOASSERTION",
      "supplier": "NOASSERTION",
      "externalRefs": [
        {
          "referenceCategory": "PACKAGE-MANAGER",
          "referenceType": "purl",
          "referenceLocator": "pkg:pypi/markupsafe@3.0.3"
        }
      ]
    }
  ],
  "relationships": [
    {
      "spdxElementId": "SPDXRef-DOCUMENT",
      "relationshipType": "DESCRIBES",
      "relatedSpdxElement": "SPDXRef-wheel"
    },
    {
      "spdxElementId": "SPDXRef-wheel",
      "relationshipType": "GENERATED_FROM",
      "relatedSpdxElement": "SPDXRef-upstream"
    }
  ]
}

with this change to overrides/settings.yaml:

diff --git a/overrides/settings.yaml b/overrides/settings.yaml
index b452bc03..c1b75bed 100644
--- a/overrides/settings.yaml
+++ b/overrides/settings.yaml
@@ -1,4 +1,11 @@
 ---
+sbom:
+  supplier: "Organization: Red Hat"
+  namespace: "https://www.redhat.com"
+  creators:
+    - "Organization: Red Hat"
+  repository_url: "https://packages.redhat.com"
+
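The shape of the `packages` and `relationships` sections above can be reproduced with a short sketch. This is not the code in `sbom.py`; `build_packages_and_relationships` is a hypothetical helper that only mirrors the structure of the generated document shown in this comment.

```python
import json


def build_packages_and_relationships(
    name: str,
    version: str,
    supplier: str,
    downstream_purl: str,
    upstream_purl: str,
) -> dict:
    def package(spdx_id: str, pkg_supplier: str, purl: str) -> dict:
        return {
            "SPDXID": spdx_id,
            "name": name,
            "versionInfo": version,
            "downloadLocation": "NOASSERTION",
            "supplier": pkg_supplier,
            "externalRefs": [
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": purl,
                }
            ],
        }

    return {
        "packages": [
            # Downstream wheel: carries the configured supplier.
            package("SPDXRef-wheel", supplier, downstream_purl),
            # Upstream source: supplier is not asserted.
            package("SPDXRef-upstream", "NOASSERTION", upstream_purl),
        ],
        "relationships": [
            {
                "spdxElementId": "SPDXRef-DOCUMENT",
                "relationshipType": "DESCRIBES",
                "relatedSpdxElement": "SPDXRef-wheel",
            },
            {
                "spdxElementId": "SPDXRef-wheel",
                "relationshipType": "GENERATED_FROM",
                "relatedSpdxElement": "SPDXRef-upstream",
            },
        ],
    }


doc = build_packages_and_relationships(
    "markupsafe",
    "3.0.3",
    "Organization: Red Hat",
    "pkg:pypi/markupsafe@3.0.3?repository_url=https://packages.redhat.com",
    "pkg:pypi/markupsafe@3.0.3",
)
print(json.dumps(doc, indent=2))
```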


Development

Successfully merging this pull request may close these issues.

Add upstream source identification and GENERATED_FROM relationship to SBOM

2 participants