
Use timer unconditionally in BusyIndicator.showWhile(Future)#3167

Closed
akurtakov wants to merge 1 commit into eclipse-platform:master from akurtakov:busy

Conversation

@akurtakov
Member

Should prevent hangs like
#3044

@akurtakov akurtakov requested a review from laeubi April 1, 2026 08:03
@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Test Results

  170 files  ±0    170 suites  ±0   28m 3s ⏱️ + 1m 1s
4 679 tests ±0  4 658 ✅ ±0   21 💤 ±0  0 ❌ ±0 
6 592 runs  ±0  6 437 ✅ ±0  155 💤 ±0  0 ❌ ±0 

Results for commit 3d65ac2. ± Comparison against base commit e476728.

♻️ This comment has been updated with latest results.

Contributor

@HeikoKlare HeikoKlare left a comment


Thank you for proposing a solution for the BusyIndicator getting stuck. Having that fixed so that CI executions terminate regularly again will significantly improve the contributor experience!

This appears to be a reasonable solution for preventing the BusyIndicator from getting stuck. It does not add any considerable overhead, but it makes the implementation more robust against the possibility that the wake() call issued once the Future has finished does not have the intended effect.
I see two possible improvements but consider neither of them mandatory for further processing the PR:

  1. After exiting the display spinning loop (because the future is done and the display was woken up), the timerExec might be cancelled to avoid an unnecessary wakeup of the display (although that unnecessary wakeup should not be a problem).
  2. Using this timer even with a CompletionStage future might be limited to platforms on which we know about problems with wake() calls (i.e., in particular Linux). But as said above, the overhead is so minimal that it may be considered irrelevant, so making the behavior platform-specific may introduce unnecessary complexity.
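The approach under discussion can be sketched without SWT. The following is a simplified, hypothetical analogue (not the actual BusyIndicator code): a plain lock stands in for the display's sleep()/wake() mechanism, a `java.util.Timer` plays the role of the unconditional timerExec, and the lost wake() is simulated by a worker that completes the Future without ever notifying the loop directly.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CompletableFuture;

public class BusyWaitSketch {
    // Stands in for the display's sleep/wake mechanism (not real SWT code).
    private static final Object uiLock = new Object();

    public static String run() throws Exception {
        CompletableFuture<String> future = new CompletableFuture<>();

        // Worker completes the future but deliberately never notifies the loop,
        // simulating a wake() call that gets lost on some platforms.
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException e) { }
            future.complete("done");
        }).start();

        // Unconditional safety timer: wake the event loop periodically,
        // regardless of whether the future's own wake ever arrives.
        Timer timer = new Timer(true);
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() {
                synchronized (uiLock) { uiLock.notifyAll(); }
            }
        }, 100, 100);

        // Simplified event loop: sleep until woken, re-check the future each time.
        synchronized (uiLock) {
            while (!future.isDone()) {
                uiLock.wait(); // would hang forever here without the timer
            }
        }
        timer.cancel(); // improvement (1) above: avoid one extra wakeup
        return future.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("loop exited: " + run());
    }
}
```

The repeating timer guarantees the loop re-checks `future.isDone()` at a bounded interval even when the direct wake is lost, which is the essence of the proposed fix.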

This change might also be reverted once #3060 is addressed.

And there is a small chance that the underlying issue is addressed by #3161 (as also indicated in #3161 (review)), so we should wait for some builds with that fix and see if the BusyIndicator test still gets stuck.

@HeikoKlare
Contributor

Turns out that #3161 did not fix the issue of the BusyIndicator test getting stuck.

#3166 was just rebased on master including the mentioned change, but the current test execution got stuck at the BusyIndicator test again:
https://github.com/eclipse-platform/eclipse.platform.swt/actions/runs/23844044308/job/69506754851?pr=3166

Contributor

@laeubi laeubi left a comment


I don't think this is the right direction.

It changes production code that was never reported to be stuck instead of fixing the problem (we are just guessing here unless someone is able to reliably reproduce it), and it hides a possibly severe bug in SWT.

I would be fine if we instead enhance the test with an additional guard, e.g., start a (Java) timer that, when it fires, constantly tries to wake up the SWT thread and then makes the test fail an assertion.
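A test-only watchdog along these lines might look roughly as follows. This is a hypothetical sketch, again using a plain lock in place of the SWT display: a timer that, once it fires, wakes the event loop and records that it had to intervene, so the test can fail an assertion instead of hanging forever.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class WatchdogSketch {
    public static boolean run() throws InterruptedException {
        AtomicBoolean watchdogFired = new AtomicBoolean(false);
        Object loopLock = new Object(); // stands in for the SWT display's sleep/wake
        boolean[] done = {false};       // completion flag of the simulated operation

        // Watchdog: if the loop is still waiting when this fires, wake it up
        // and remember that the guard had to step in.
        Timer watchdog = new Timer(true);
        watchdog.schedule(new TimerTask() {
            @Override public void run() {
                watchdogFired.set(true);
                synchronized (loopLock) { loopLock.notifyAll(); }
            }
        }, 2000); // generous timeout; a real test would pick a safe bound

        // Simulated operation that finishes well within the timeout.
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { }
            synchronized (loopLock) { done[0] = true; loopLock.notifyAll(); }
        }).start();

        synchronized (loopLock) {
            while (!done[0] && !watchdogFired.get()) {
                loopLock.wait();
            }
        }
        watchdog.cancel();
        return watchdogFired.get(); // a real test would assert this is false
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run() ? "watchdog had to intervene" : "completed normally");
    }
}
```

The point of the guard is that it only touches test code: production behavior stays unchanged, and a hang turns into a diagnosable assertion failure.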

@laeubi
Contributor

laeubi commented Apr 1, 2026

@HeikoKlare if the problem is currently visible then I think the next easy fix would be to try out making it volatile:
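(The snippet originally attached to this comment is elided.) As a general illustration of what making such a field volatile buys, the following hypothetical sketch shows a completion flag shared between two threads; without volatile, the spinning reader could in principle never observe the writer's update.

```java
public class VolatileFlagSketch {
    // volatile guarantees the writer's update becomes visible to the spinning
    // reader thread; without it, the loop below could in principle spin forever.
    private static volatile boolean done = false;

    public static boolean run() throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!done) {
                Thread.onSpinWait(); // busy-wait purely for illustration
            }
        });
        reader.start();
        Thread.sleep(100);
        done = true; // visible to the reader thanks to volatile
        reader.join(2000);
        return !reader.isAlive(); // true: the reader observed the flag and exited
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run() ? "reader observed done" : "reader stuck");
    }
}
```

This only addresses cross-thread visibility of the flag; it would not help if the display simply never wakes up to re-check it.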

@HeikoKlare
Contributor

It changes production code that was never reported to be stuck instead of fixing the problem (we are just guessing here unless someone is able to reliably reproduce it), and it hides a possibly severe bug in SWT.

It's based on your own assessment that the issue might only be visible on test execution but not in productive usage because there are other events happening because of human input that wake up the display: #3044 (comment)

I think the next easy fix would be to try out making it volatile:

Yes, we could do that. But then we should proceed with that soon. We could also just revert this change once #3060 is addressed and proves to fix the issue. I just don't think we should wait further for other options that might fix the issue if no one is working on them while we have a proposed solution that seems to work and does no harm. The CI builds frequently getting stuck are quite annoying and limit productivity, so having a timely solution (even if it's a temporary one) would be great.

@akurtakov
Member Author

I fully agree that the current state cannot continue. I would say: if we don't have another fix in and tests are still stuck on Monday, let's merge this one, as it's already been 2 months in this state.

@laeubi
Contributor

laeubi commented Apr 1, 2026

I fully agree that the current state cannot continue. I would say: if we don't have another fix in and tests are still stuck on Monday, let's merge this one, as it's already been 2 months in this state.

I disagree here: The change is not backed by any proof that it will actually fix the stuck test; this is just guesswork. I don't see any indication that this is guaranteed to work, and it changes code I use every day that has never hung since it was introduced.

In the worst case I would rather disable the test than change the implementation because of the strange CI behavior we see here.

I already suggested a workaround that would only change test code and gave more insights... I will now prepare a PR with my proposal and then we can see.

@akurtakov
Member Author

There have been 2 full rebuilds of this PR without any of the verification jobs getting stuck. I've just rebased it for a third run. If nothing gets stuck again, that is proof enough for me, as there hasn't been any other PR lately that had even 2 good builds in a row.
Let's see what the state is on Monday.

@HeikoKlare
Contributor

CI builds get stuck so often because of this issue that we will easily see whether this fix has an impact once it has been merged, and we can, if necessary, simply revert it.
I have to admit that I do not fully understand why the root cause being in production code is challenged so heavily now, as I understood this comment as an assumption of exactly that, together with an explanation of why it does not produce any noticeable issue during productive use.

Anyway, I think we have a good proposal/agreement now: let's have a look at the next proposals, such as the upcoming one from Christoph, and see if we can make progress with them. If we do not find a more suitable solution by Monday, let's proceed with this one.

@laeubi
Contributor

laeubi commented Apr 1, 2026

There have been 2 full rebuilds of this PR without any of the verification jobs getting stuck. I've just rebased it for a third run. If nothing gets stuck again, that is proof enough for me, as there hasn't been any other PR lately that had even 2 good builds in a row.

I have builds with 20 reruns and without an issue, so that's not really proof. Anyway, I still think it is not the right thing to fix a test failure that is only ever reproducible on CI, and only at certain times.

@akurtakov
Member Author

Analysis has shown better ways to do this in the original issue.

@akurtakov akurtakov closed this Apr 2, 2026
@akurtakov akurtakov deleted the busy branch April 2, 2026 09:05