feat(scoring): drop density, swap base score to exponential saturation #1352
Open
seroperson wants to merge 1 commit into
Summary
Replaces the density-based base score with an exponential saturation curve on source token score, per the discussion and agreed shape in #1339.
The new curve is monotonic in `src_tok`, concave from the origin, and controlled by a single knob (`SRC_TOK_SATURATION_SCALE = 58`, overridable per repo).
The per-repo override range is `[10, 500]`, enforced by the loader and pinned by the live-config test in `tests/validator/test_load_weights.py`.
Requirements
- `base_score = 25 * (1 - exp(-src_tok / 58)) + min(total_score / 1500, 1) * 5`
- `MAX_CODE_DENSITY_MULTIPLIER`
- `MIN_TOKEN_SCORE_FOR_BASE_SCORE` from `scoring.py` (the cliff)
- `MIN_TOKEN_SCORE_FOR_BASE_SCORE` from `credibility.py` (`valid_merged_count` filter)
- `SRC_TOK_SATURATION_SCALE = 58`
- `SRC_TOK_SATURATION_SCALE` per-repo configurable via `master_repositories.json` scoring block
- `MAX_CONTRIBUTION_BONUS`: 25 → 5
- `MERGED_PR_BASE_SCORE = 25`
- `CONTRIBUTION_SCORE_FOR_FULL_BONUS = 1500`
- `TEST_FILE_CONTRIBUTION_WEIGHT = 0.05`
- `[10, 500]` range for `SRC_TOK_SATURATION_SCALE` via the load weights test suite
Behavior changes
New score bounds:
- `initial_base_score`: 28.75 -> 25
- `base_score`: 53.75 -> 30
Also:
- `MAX_CONTRIBUTION_BONUS`: 25 -> 5.
- Eligibility (`check_eligibility`) is now a pure count of merged PRs; the per-repo `min_token_score_for_base_score` field is removed.
- `_build_solving_pr_cache` no longer pre-filters by token score - every merged PR is reusable for issue-discovery lookups.
- `PullRequest.code_density` / `ScoredMirrorPR.code_density` default flips `0.0` -> `1.0`. The field is no longer populated by the scorer (density is gone), but the column survives in the DB and is read by downstream consumers as a neutral multiplier: `1.0` keeps it neutral, whereas `0.0` would zero anything multiplied by it.
Constants
- `MAX_CODE_DENSITY_MULTIPLIER`
- `MIN_TOKEN_SCORE_FOR_BASE_SCORE`
- `MAX_CONTRIBUTION_BONUS`
- `SRC_TOK_SATURATION_SCALE`
- `MERGED_PR_BASE_SCORE`
- `CONTRIBUTION_SCORE_FOR_FULL_BONUS`
- `TEST_FILE_CONTRIBUTION_WEIGHT`
The full multiplier chain (time decay, review quality, label, issue, spam, credibility) is untouched.
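For reviewers who want to check the numbers, the base-score formula from the Requirements list can be sketched in a few lines. This is a minimal sketch and not the code in `scoring.py`: the function name `base_score` and its signature are hypothetical, the constants are the values quoted in this description, and `src_tok` is assumed to be non-negative.

```python
import math

# Values quoted in this PR description.
MERGED_PR_BASE_SCORE = 25
SRC_TOK_SATURATION_SCALE = 58.0
MAX_CONTRIBUTION_BONUS = 5
CONTRIBUTION_SCORE_FOR_FULL_BONUS = 1500


def base_score(
    src_tok: float,
    total_score: float,
    scale: float = SRC_TOK_SATURATION_SCALE,
) -> float:
    """Hypothetical helper: saturating term on src_tok plus a capped bonus."""
    # Exponential saturation: 0 at src_tok = 0, approaching 25 as src_tok grows.
    initial = MERGED_PR_BASE_SCORE * (1.0 - math.exp(-src_tok / scale))
    # Contribution bonus: linear in total_score, capped at MAX_CONTRIBUTION_BONUS.
    bonus = (
        min(total_score / CONTRIBUTION_SCORE_FOR_FULL_BONUS, 1.0)
        * MAX_CONTRIBUTION_BONUS
    )
    return initial + bonus
```

Under these assumptions the saturating term tops out at 25 and the bonus at 5, which matches the new bounds listed under Behavior changes. At `src_tok = 58` (one scale length) the saturating term is `25 * (1 - 1/e)`, roughly 15.8.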
Per-repo `src_tok_saturation_scale`
Optional knob on each `master_repositories.json` `scoring` block.
Loader rejects values outside `[10, 500]`. No entry in `master_repositories.json` sets the knob today; the default is `SRC_TOK_SATURATION_SCALE = 58.0`.
Open concerns / follow-ups
- `code_density` column lingers in the DB. Density is gone from scoring, but the column survives on `pull_requests` (written by the upsert in `gittensor/validator/storage/queries.py`) and the field survives on `PullRequest` / `ScoredMirrorPR`. The default flip to `1.0` keeps downstream consumers safe, but I'm unsure how DB schema removal is handled in this repo. Open question: schema-level removal as a follow-up PR, or leave the column as a permanent neutral `1.0` write?
- `MIN_VALID_MERGED_PRS = 3`. A miner can clear the gate with three test-only / non-code merged PRs. Per direction (comment), eligibility is intentionally "a pure count of 3 merged PRs", but this is still worth tracking: should there be an opt-in per-repo `require_source_for_eligibility` knob, so that test-only / non-code-only PRs count toward credibility but not toward the minimum-merged gate?
Related Issues
Closes #1339.
Type of Change
Testing
`gitt miner score`
Checklist
test)
Test plan
New coverage:
- `TestPerRepoSaturationScaleOverride` - default scale + per-repo override reshapes the curve in both directions.
- `test_loader_rejects_out_of_range_saturation_scale` + `test_loader_accepts_saturation_scale_at_bounds` - parametrized `[10, 500]` bounds.
- `test_live_mirror_scoring_fields_have_valid_shape` - live `master_repositories.json` pinned to the same bound for every repo.
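The behavior these tests pin down can be illustrated with a small standalone sketch. It is not the loader in this repo: `resolve_saturation_scale` and `initial_base_score` are hypothetical names, the scoring block is assumed to be a plain dict keyed by `src_tok_saturation_scale`, and the bounds are assumed to be inclusive (which the at-bounds test name suggests).

```python
import math

DEFAULT_SCALE = 58.0          # SRC_TOK_SATURATION_SCALE from this PR
SCALE_BOUNDS = (10.0, 500.0)  # per-repo override range from this PR


def resolve_saturation_scale(scoring_block: dict) -> float:
    """Hypothetical loader step: the repo's scale, or the default if unset."""
    raw = scoring_block.get("src_tok_saturation_scale")
    if raw is None:
        return DEFAULT_SCALE
    scale = float(raw)
    low, high = SCALE_BOUNDS
    # Assumed inclusive bounds; NaN also fails this comparison and is rejected.
    if not low <= scale <= high:
        raise ValueError(
            f"src_tok_saturation_scale={scale} outside [{low}, {high}]"
        )
    return scale


def initial_base_score(src_tok: float, scale: float) -> float:
    """Saturating term only: 25 * (1 - exp(-src_tok / scale))."""
    return 25 * (1.0 - math.exp(-src_tok / scale))
```

With this sketch, a smaller scale saturates faster and a larger one slower: for the same `src_tok`, `initial_base_score(src_tok, 10)` is above the default-scale value and `initial_base_score(src_tok, 500)` is below it, which is the "both directions" reshaping the override test refers to.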