CI: fix milestone-tag-assistant race when labels change post-merge #67337
potiuk wants to merge 1 commit
The `milestone-tag-assistant.yml` workflow snapshots PR labels at the `get-pr-info` job (via `listPullRequestsAssociatedWithCommit`) and then spends ~1.5 minutes installing Breeze and running `breeze ci set-milestone`. If a maintainer adds and removes a backport label inside that window, the action commits to the stale-snapshot decision and sets the wrong milestone. See the incident on PR apache#67301, where a backport label that lived for 49 seconds caused an Airflow-3.2.3 milestone to be set on a `main`-only documentation PR.

Re-read `issue.labels` from the freshly-fetched issue before computing the milestone. If the labels changed since the snapshot:

- Honour any skip label that appeared after the snapshot.
- Re-run `_determine_milestone_version` with the current labels and use the fresh decision; if the decision flips to "no milestone", bail out before posting the comment.

Adds three regression tests covering the three race-window cases (backport label removed, replaced, skip label added) and updates two existing happy-path tests to populate `mock_issue.labels` so the re-read sees the same labels as the snapshot.
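The re-read described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `should_skip` and `determine_milestone` are hypothetical stand-ins for `_should_skip_milestone_tagging` and `_determine_milestone_version` (whose real signatures are not shown here), and the `backport-to-` prefix and the `skip-milestone-tagging` label name are assumptions made for the example.

```python
from typing import Iterable, Optional

BACKPORT_PREFIX = "backport-to-"       # assumption, modelled on `backport-to-v3-2-test`
SKIP_LABEL = "skip-milestone-tagging"  # hypothetical skip label name


def should_skip(labels: Iterable[str]) -> bool:
    """Hypothetical stand-in for `_should_skip_milestone_tagging`."""
    return SKIP_LABEL in set(labels)


def determine_milestone(labels: Iterable[str]) -> Optional[str]:
    """Hypothetical stand-in for `_determine_milestone_version`.

    Returns the branch part of the first backport label, or None if there is none.
    """
    for label in sorted(labels):
        if label.startswith(BACKPORT_PREFIX):
            return label[len(BACKPORT_PREFIX):]
    return None


def decide_milestone(
    snapshot_labels: Iterable[str], current_labels: Iterable[str]
) -> Optional[str]:
    """Decide on a milestone, preferring freshly re-read labels over the snapshot."""
    snapshot = set(snapshot_labels)
    current = set(current_labels)
    if should_skip(snapshot):
        return None
    if current != snapshot:
        # Labels changed inside the race window: the snapshot is stale.
        if should_skip(current):
            return None  # honour a skip label added after the snapshot
        return determine_milestone(current)  # may flip to "no milestone"
    return determine_milestone(snapshot)
```

A `None` result corresponds to bailing out before the comment is posted. For the three race-window cases, `decide_milestone(["backport-to-v3-2-test"], [])` yields `None`, a replaced backport label yields the decision for the new label, and a late skip label yields `None`.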
jason810496
left a comment
Thanks for catching the race condition! (I didn't foresee this situation when developing)
    console_print(f"[info]Snapshot labels: {sorted(labels)}[/]")
    console_print(f"[info]Current labels: {sorted(current_labels)}[/]")

    if _should_skip_milestone_tagging(current_labels):
If I understand the event correctly, we should add an additional check in `_should_skip_milestone_tagging`.
If a maintainer removes a backport label themselves, we need to skip the auto-tagging.
Even with the new `if set(current_labels) != set(labels):` check, I don't think the existing implementation of `_should_skip_milestone_tagging` would be enough.
`airflow/dev/breeze/src/airflow_breeze/global_constants.py`, lines 832 to 833 in d0f981c
`airflow/dev/breeze/src/airflow_breeze/commands/ci_commands.py`, lines 1142 to 1144 in d0f981c
Btw, I can help follow up on this issue if you don't have bandwidth at the moment.
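One way the check suggested in this comment might look is sketched below. It is only an illustration under stated assumptions: the function names are hypothetical, the `backport-to-` prefix is assumed from the `backport-to-v3-2-test` label mentioned in this PR, and nothing here reflects the actual contents of `global_constants.py` or `ci_commands.py`.

```python
from typing import Iterable, Set

BACKPORT_PREFIX = "backport-to-"  # assumption, modelled on `backport-to-v3-2-test`


def removed_backport_labels(
    snapshot_labels: Iterable[str], current_labels: Iterable[str]
) -> Set[str]:
    """Backport labels that were in the snapshot but are gone from the current labels."""
    snapshot_backports = {
        label for label in snapshot_labels if label.startswith(BACKPORT_PREFIX)
    }
    return snapshot_backports - set(current_labels)


def should_skip_after_manual_removal(
    snapshot_labels: Iterable[str], current_labels: Iterable[str]
) -> bool:
    """Hypothetical extra skip condition: a maintainer removed a backport label."""
    return bool(removed_backport_labels(snapshot_labels, current_labels))
```

Note that under this sketch a replaced backport label also counts as a removal, so it would skip tagging in the "replaced" case rather than recompute the milestone. Which behaviour is wanted is a design decision for the PR, not something this example settles.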
The `milestone-tag-assistant.yml` workflow snapshots PR labels at the `get-pr-info` job (via `listPullRequestsAssociatedWithCommit`) and then spends ~1.5 minutes installing Breeze and running `breeze ci set-milestone`. If a maintainer adds and removes a backport label inside that window, the action commits to the stale-snapshot decision and sets the wrong milestone.
This happened on #67301: a `backport-to-v3-2-test` label that lived for 49 seconds caused an `Airflow 3.2.3` milestone to be set on a `main`-only documentation PR. The workflow's comment justifying the decision was posted ~1.5 minutes after the label had already been removed.
Timeline (PR #67301)
Fix
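Re-read `issue.labels` from the freshly-fetched issue (which is already in scope after the existing `milestone is not None` check) before computing the milestone. If the labels changed since the snapshot:

- Honour any skip label that appeared after the snapshot.
- Re-run `_determine_milestone_version` with the current labels and use the fresh decision; if the decision flips to "no milestone", bail out before posting the comment.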
Adds three regression tests covering the three race-window cases (backport label removed, replaced, skip label added) and updates two existing happy-path tests to populate `mock_issue.labels` so the re-read sees the same labels as the snapshot.
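For readers unfamiliar with why the happy-path tests need `mock_issue.labels` populated, here is a small sketch of the idea. It assumes the re-read iterates label objects exposing a `.name` attribute (as PyGithub label objects do); `make_label` and `current_label_names` are hypothetical helpers for illustration, not functions from the Breeze test suite.

```python
from typing import List
from unittest.mock import MagicMock


def make_label(name: str) -> MagicMock:
    """Build a mock label whose `.name` attribute is a plain string."""
    label = MagicMock()
    # Assign after construction: MagicMock(name=...) names the mock itself
    # instead of creating a `.name` attribute holding the string.
    label.name = name
    return label


def current_label_names(issue) -> List[str]:
    """Hypothetical helper: label names as re-read from the fetched issue."""
    return [label.name for label in issue.labels]


# Happy path: the re-read sees the same labels as the snapshot.
snapshot = ["backport-to-v3-2-test"]
mock_issue = MagicMock()
mock_issue.labels = [make_label("backport-to-v3-2-test")]
labels_unchanged = set(current_label_names(mock_issue)) == set(snapshot)

# Race window: the backport label was removed after the snapshot.
raced_issue = MagicMock()
raced_issue.labels = []
labels_changed = set(current_label_names(raced_issue)) != set(snapshot)
```

Without the explicit assignment, `mock_issue.labels` would be an auto-created `MagicMock` rather than a list of labels, so a re-read could not match the snapshot and the happy-path tests would no longer exercise the unchanged-labels branch.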
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.7) following the guidelines