📝 Walkthrough
Enhanced Maven version handling in the download utility by adding regex-based extraction of base version numbers from classifier-suffixed strings and adjusting version-hint assignment precedence to prevent filename-derived hints from overriding upstream values.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fosslight_util/download.py`:
- Around line 324-327: The _MAVEN_CLASSIFIER_SUFFIX_VERSION regex only matches
when the classifier follows a purely numeric version and thus misses qualifier
segments (e.g., -RC1, -M5); update the pattern assigned to
_MAVEN_CLASSIFIER_SUFFIX_VERSION to accept optional qualifier tokens after the
numeric version (letters/digits/hyphens/periods) before the classifier, so
strings like "1.0.0-RC1-sources" and "3.0.0-M5-javadoc" are matched and stripped
the same as plain numeric versions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8275de1e-956b-40dd-9c9e-d7c223ad0f02
📒 Files selected for processing (1)
src/fosslight_util/download.py
_MAVEN_CLASSIFIER_SUFFIX_VERSION = re.compile(
    r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import re

archive_tail = re.compile(
    r'-((?:\d+\.)+\d+(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)$',
    re.IGNORECASE,
)
classifier = re.compile(
    r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)
for stem in [
    "artifact-2.12-sources",
    "artifact-1.0.0-RC1-sources",
    "artifact-3.0.0-M5-javadoc",
]:
    m = archive_tail.search(stem)
    version = m.group(1) if m else stem
    c = classifier.match(version)
    print(f"{stem} -> {c.group(1) if c else version}")
PY
Repository: fosslight/fosslight_util
Length of output: 191
🏁 Script executed:
sed -n '320,385p' src/fosslight_util/download.py
Repository: fosslight/fosslight_util
Length of output: 2211
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import re

# Current regex
current_classifier = re.compile(
    r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)
# Proposed regex
proposed_classifier = re.compile(
    r'^(\d+(?:\.\d+){1,3}(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)
# Archive tail regex (context)
archive_tail = re.compile(
    r'-((?:\d+\.)+\d+(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)$',
    re.IGNORECASE,
)
test_cases = [
    "artifact-2.12-sources",
    "artifact-1.0.0-RC1-sources",
    "artifact-3.0.0-M5-javadoc",
    "artifact-1.0.0-SNAPSHOT-sources",
    "artifact-2.5.0-beta1-javadoc",
]
print("CURRENT vs PROPOSED regex results:")
print("=" * 80)
for stem in test_cases:
    m = archive_tail.search(stem)
    version = m.group(1) if m else stem
    current_match = current_classifier.match(version)
    proposed_match = proposed_classifier.match(version)
    current_result = current_match.group(1) if current_match else version
    proposed_result = proposed_match.group(1) if proposed_match else version
    print(f"Input: {stem}")
    print(f"  Version extracted: {version}")
    print(f"  Current result: {current_result}")
    print(f"  Proposed result: {proposed_result}")
    print()
PY
Repository: fosslight/fosslight_util
Length of output: 879
Fix Maven classifier regex to handle qualifier versions.
The current regex only matches classifiers immediately after numeric versions. Qualifier versions like 1.0.0-RC1-sources and 3.0.0-M5-javadoc are not stripped, allowing -sources/-javadoc to leak into version hints.
Regex fix
_MAVEN_CLASSIFIER_SUFFIX_VERSION = re.compile(
- r'^(\d+(?:\.\d+){1,3})-(?:sources?|src|javadoc)(?:[-.].*)?$',
+ r'^(\d+(?:\.\d+){1,3}(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)-(?:sources?|src|javadoc)(?:[-.].*)?$',
re.IGNORECASE,
)
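As a rough check of the suggested pattern, the sketch below wraps it in a small helper and runs it on the version strings named in this comment. The pattern is copied from the suggestion; the names `_PROPOSED_CLASSIFIER_SUFFIX` and `strip_maven_classifier` are hypothetical and do not appear in `download.py`.

```python
import re

# Pattern copied from the suggested fix: the optional inner group admits a
# qualifier such as -RC1, -M5, or -SNAPSHOT before the classifier.
_PROPOSED_CLASSIFIER_SUFFIX = re.compile(
    r'^(\d+(?:\.\d+){1,3}(?:-[0-9A-Za-z][0-9A-Za-z.-]*)?)-(?:sources?|src|javadoc)(?:[-.].*)?$',
    re.IGNORECASE,
)


def strip_maven_classifier(version: str) -> str:
    """Hypothetical helper: return the base version, or the input unchanged
    when no classifier suffix is recognized."""
    match = _PROPOSED_CLASSIFIER_SUFFIX.match(version)
    return match.group(1) if match else version


for candidate in ("2.12-sources", "1.0.0-RC1-sources", "3.0.0-M5-javadoc", "1.0.0"):
    print(candidate, "->", strip_maven_classifier(candidate))
```

One caveat worth checking before adopting the pattern: the qualifier group is greedy, so a string with two stacked classifiers such as `2.12-sources-javadoc` would keep the first classifier as if it were a qualifier and yield `2.12-sources`.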
@soimkim, understood — acknowledged as a minor issue to revisit later. Would you like me to open a follow-up GitHub issue to track the _MAVEN_CLASSIFIER_SUFFIX_VERSION regex fix (handling qualifier versions like 1.0.0-RC1-sources, 3.0.0-M5-javadoc) so it isn't forgotten?
🧠 Learnings used
Learnt from: soimkim
Repo: fosslight/fosslight_util PR: 264
File: src/fosslight_util/download.py:345-350
Timestamp: 2026-04-09T02:38:06.970Z
Learning: In `src/fosslight_util/download.py`, the `_try_semver_checkout` function intentionally returns `(True, "", "")` when an exact semver patch ref is not found among the remote refs. This is a deliberate design decision: the empty ref signals `download_git_repository` to fall back and clone the default branch (HEAD) silently, rather than failing hard on a missing ref.
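The contract described in this learning can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `try_semver_checkout_sketch` and `choose_clone_target` are hypothetical stand-ins, the tag-matching rule is simplified, and the meaning given to the third tuple element is a guess. Only the `(True, "", "")` return value and the fall-back-to-HEAD behavior come from the learning itself.

```python
from typing import Iterable, Tuple


def try_semver_checkout_sketch(
    requested: str, remote_refs: Iterable[str]
) -> Tuple[bool, str, str]:
    """Hypothetical stand-in for _try_semver_checkout; the real signature
    and matching rules may differ."""
    for ref in remote_refs:
        # Simplified assumption: a tag matches if it equals the version,
        # with or without a leading "v".
        if ref in (requested, f"v{requested}"):
            return True, ref, requested
    # No exact ref: still report success, with an empty ref as the signal
    # that the caller should fall back instead of failing.
    return True, "", ""


def choose_clone_target(ref: str) -> str:
    """Hypothetical caller-side rule: an empty ref means the default branch."""
    return ref if ref else "HEAD"


ok, ref, _ = try_semver_checkout_sketch("9.9.9", ["v1.2.3", "v1.2.4"])
print(ok, repr(ref), choose_clone_target(ref))
```

The point of the design is that a missing ref is treated as a soft miss rather than an error, so the caller needs only one check on the returned ref.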
Description
-sources,-javadoc), ensuring accurate base version identification.