Skip to content

Avoid overwriting get_downloadable_url version with archive filename#268

Merged
soimkim merged 1 commit into
mainfrom
ver
Apr 22, 2026
Merged

Avoid overwriting get_downloadable_url version with archive filename#268
soimkim merged 1 commit into
mainfrom
ver

Conversation

@soimkim
Copy link
Copy Markdown
Contributor

@soimkim soimkim commented Apr 22, 2026

Description

  • Improved version extraction for Maven artifacts with classifier suffixes (e.g., -sources, -javadoc), ensuring accurate base version identification.
  • Enhanced version hint assignment logic to prevent unintended overrides during download operations.

@soimkim soimkim self-assigned this Apr 22, 2026
@soimkim soimkim added the bug fix [PR] Fix the bug label Apr 22, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 22, 2026

📝 Walkthrough

Walkthrough

Enhanced Maven version handling in the download utility by adding regex-based extraction of base version numbers from classifier-suffixed strings and adjusting version-hint assignment precedence to prevent filename-derived hints from overriding upstream values.

Changes

Cohort / File(s) Summary
Maven Classifier Suffix Handling
src/fosslight_util/download.py
Added _MAVEN_CLASSIFIER_SUFFIX_VERSION regex pattern to parse Maven classifier suffixes (e.g., -sources, -javadoc). Updated _version_string_from_archive_stem() to extract and return the base numeric version when such suffixes are detected. Modified download_wget() to assign oss_version only when it remains empty, preventing filename-derived hints from overriding previously determined versions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

chore

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: preventing version hints from the archive filename from overwriting the version determined by get_downloadable_url().
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ver

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fosslight_util/download.py`:
- Around line 324-327: The _MAVEN_CLASSIFIER_SUFFIX_VERSION regex only matches
when the classifier follows a purely numeric version and thus misses qualifier
segments (e.g., -RC1, -M5); update the pattern assigned to
_MAVEN_CLASSIFIER_SUFFIX_VERSION to accept optional qualifier tokens after the
numeric version (letters/digits/hyphens/periods) before the classifier, so
strings like "1.0.0-RC1-sources" and "3.0.0-M5-javadoc" are matched and stripped
the same as plain numeric versions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8275de1e-956b-40dd-9c9e-d7c223ad0f02

📥 Commits

Reviewing files that changed from the base of the PR and between 371c57d and 9c598c8.

📒 Files selected for processing (1)
  • src/fosslight_util/download.py

Comment on lines +324 to +327
_MAVEN_CLASSIFIER_SUFFIX_VERSION = re.compile(
r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
re.IGNORECASE,
)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import re

archive_tail = re.compile(
    r'-((?:\d+\.)+\d+(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)$',
    re.IGNORECASE,
)
classifier = re.compile(
    r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)

for stem in [
    "artifact-2.12-sources",
    "artifact-1.0.0-RC1-sources",
    "artifact-3.0.0-M5-javadoc",
]:
    m = archive_tail.search(stem)
    version = m.group(1) if m else stem
    c = classifier.match(version)
    print(f"{stem} -> {c.group(1) if c else version}")
PY

Repository: fosslight/fosslight_util

Length of output: 191


🏁 Script executed:

sed -n '320,385p' src/fosslight_util/download.py

Repository: fosslight/fosslight_util

Length of output: 2211


🏁 Script executed:

#!/bin/bash
python - <<'PY'
import re

# Current regex
current_classifier = re.compile(
    r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)

# Proposed regex
proposed_classifier = re.compile(
    r'^(\d+(?:\.\d+){1,3}(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)

# Archive tail regex (context)
archive_tail = re.compile(
    r'-((?:\d+\.)+\d+(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)$',
    re.IGNORECASE,
)

test_cases = [
    "artifact-2.12-sources",
    "artifact-1.0.0-RC1-sources",
    "artifact-3.0.0-M5-javadoc",
    "artifact-1.0.0-SNAPSHOT-sources",
    "artifact-2.5.0-beta1-javadoc",
]

print("CURRENT vs PROPOSED regex results:")
print("=" * 80)
for stem in test_cases:
    m = archive_tail.search(stem)
    version = m.group(1) if m else stem
    
    current_match = current_classifier.match(version)
    proposed_match = proposed_classifier.match(version)
    
    current_result = current_match.group(1) if current_match else version
    proposed_result = proposed_match.group(1) if proposed_match else version
    
    print(f"Input: {stem}")
    print(f"  Version extracted: {version}")
    print(f"  Current result:  {current_result}")
    print(f"  Proposed result: {proposed_result}")
    print()
PY

Repository: fosslight/fosslight_util

Length of output: 879


Fix Maven classifier regex to handle qualifier versions.

The current regex only matches classifiers immediately after numeric versions. Qualifier versions like 1.0.0-RC1-sources and 3.0.0-M5-javadoc are not stripped, allowing -sources/-javadoc to leak into version hints.

Regex fix
 _MAVEN_CLASSIFIER_SUFFIX_VERSION = re.compile(
-    r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
+    r'^(\d+(?:\.\d+){1,3}(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)-(?:sources?|src|javadoc)(?:[-.].*)?$',
     re.IGNORECASE,
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fosslight_util/download.py` around lines 324 - 327, The
_MAVEN_CLASSIFIER_SUFFIX_VERSION regex only matches when the classifier follows
a purely numeric version and thus misses qualifier segments (e.g., -RC1, -M5);
update the pattern assigned to _MAVEN_CLASSIFIER_SUFFIX_VERSION to accept
optional qualifier tokens after the numeric version
(letters/digits/hyphens/periods) before the classifier, so strings like
"1.0.0-RC1-sources" and "3.0.0-M5-javadoc" are matched and stripped the same as
plain numeric versions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor. 나중에 발견되면 수정

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soimkim, understood — acknowledged as a minor issue to revisit later. Would you like me to open a follow-up GitHub issue to track the _MAVEN_CLASSIFIER_SUFFIX_VERSION regex fix (handling qualifier versions like 1.0.0-RC1-sources, 3.0.0-M5-javadoc) so it isn't forgotten?


🧠 Learnings used
Learnt from: soimkim
Repo: fosslight/fosslight_util PR: 264
File: src/fosslight_util/download.py:345-350
Timestamp: 2026-04-09T02:38:06.970Z
Learning: In `src/fosslight_util/download.py`, the `_try_semver_checkout` function intentionally returns `(True, "", "")` when an exact semver patch ref is not found among the remote refs. This is a deliberate design decision: the empty ref signals `download_git_repository` to fall back and clone the default branch (HEAD) silently, rather than failing hard on a missing ref.

@soimkim soimkim merged commit 11a628f into main Apr 22, 2026
7 of 8 checks passed
@soimkim soimkim deleted the ver branch April 22, 2026 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix [PR] Fix the bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant