Feature Request
Problem
When a Python package is rebuilt from the same source (e.g., a CI rebuild), the new artifact is often not byte-for-byte identical to the original, so it has a different sha256 even though pip treats the two as the same package (same name and version).
Currently, uploading this rebuilt artifact succeeds and effectively replaces the previous content in the repository. This creates a serious problem:
- Consumers who pinned --hash=sha256:... (via pip's hash-checking mode or lock files like pip-tools/poetry.lock) will have their installs break without warning, because the sha256 they recorded no longer matches what Pulp serves.
- It undermines reproducibility guarantees that users depend on.
- It is indistinguishable from a supply chain substitution attack.
PyPI itself prevents this: once a file with a given filename is uploaded, re-uploading a file with the same name but different content is rejected with a 400 error. This is a deliberate security and reproducibility protection.
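To make the failure mode concrete, here is a minimal sketch that uses only the Python standard library. It builds the same archive contents twice with different file timestamps, which is one common reason rebuilds differ, and then checks the rebuild against a hash pinned from the original. The helper names and the example file path are illustrative assumptions, not part of pip or Pulp.

```python
import hashlib
import io
import zipfile


def build_archive(timestamp):
    """Build an in-memory zip with fixed contents and the given file mtime."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Only the timestamp varies between builds; the payload is identical.
        info = zipfile.ZipInfo("example_pkg/__init__.py", date_time=timestamp)
        zf.writestr(info, "VERSION = '1.0.0'\n")
    return buf.getvalue()


def sha256_hex(data):
    """Return the hex sha256 digest, the form used in --hash=sha256:... pins."""
    return hashlib.sha256(data).hexdigest()


original = build_archive((2024, 1, 1, 12, 0, 0))
rebuilt = build_archive((2024, 1, 2, 9, 30, 0))

# A consumer recorded the hash of the original artifact.
pinned = sha256_hex(original)

# The rebuild has the same payload, but its digest no longer matches the pin.
matches_pin = sha256_hex(rebuilt) == pinned
```

In this sketch `matches_pin` ends up false even though the packaged source is unchanged, which is the mismatch a hash-pinned install would run into.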
Proposed Solution
Add a per-repository setting (e.g., prevent_duplicate_filenames or disallow_content_substitution) that, when enabled, causes Pulp to reject any upload where:
- A file with the same filename already exists in the repository, and
- The sha256 of the new upload differs from the existing file.
When the sha256 matches (true duplicate), the upload should succeed idempotently (no-op), consistent with current behavior.
This should be opt-in and configurable at the PythonRepository level so repositories that intentionally allow re-publishing can continue to do so.
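The proposed rule can be sketched as a small decision function. This is an in-memory illustration only, not Pulp's actual upload code: `RepoSketch`, `handle_upload`, and `UploadOutcome` are hypothetical names, and the flag name is borrowed from the suggestion above.

```python
from dataclasses import dataclass, field
from enum import Enum


class UploadOutcome(Enum):
    ADDED = "added"        # new filename, content stored
    NOOP = "noop"          # same filename and sha256, nothing to do
    REPLACED = "replaced"  # protection off: new content replaces old
    REJECTED = "rejected"  # protection on: would map to an error such as HTTP 400


@dataclass
class RepoSketch:
    # Hypothetical flag name taken from the proposal; off by default.
    prevent_duplicate_filenames: bool = False
    # Maps filename -> sha256 hex digest of the content currently served.
    files: dict = field(default_factory=dict)


def handle_upload(repo, filename, sha256):
    """Decide what happens to an upload under the proposed setting."""
    existing = repo.files.get(filename)
    if existing is None:
        repo.files[filename] = sha256
        return UploadOutcome.ADDED
    if existing == sha256:
        # True duplicate: succeed idempotently.
        return UploadOutcome.NOOP
    if repo.prevent_duplicate_filenames:
        # Same filename, different content: refuse and keep the original.
        return UploadOutcome.REJECTED
    # Setting disabled: keep today's replace-on-upload behavior.
    repo.files[filename] = sha256
    return UploadOutcome.REPLACED
```

Keeping the digest comparison ahead of the flag check means an identical re-upload is a no-op whether or not the protection is enabled, so only the differing-content case changes behavior.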
Acceptance Criteria
- A new boolean field on PythonRepository controls this behavior, defaulting to False for backwards compatibility.
- When enabled, uploading a file whose filename already exists in the repository with a different sha256 returns an appropriate HTTP error (e.g., 400) with a clear message.
- When enabled, uploading a file whose filename and sha256 both match an existing file succeeds as a no-op.
- The setting is documented.
Context
This was identified as a concern in a hosted Pulp deployment where CI rebuilds of the same package version were inadvertently replacing published content, breaking downstream consumers relying on hash-pinned installs.