OCPP: clean up charge point lifecycle between subtests#29873
Merged
Conversation
The OCPP test suite shares a singleton CS server across subtests (ocpp.Instance()) and the drain goroutine for triggerC was never stopped — every startChargePoint call leaked one goroutine for the remainder of the test process. The triggerC buffer was also only 1, so concurrent handler invocations spawned extra send goroutines that piled up under CI CPU pressure. This is the structural race that the prior fixes (7a24dbd #29725, e07838b #29838) did not address: those made specific handler sends asynchronous to unblock the WebSocket read loop, but the leaked drain goroutines and the buffer-1 channel remained. The result is still intermittent TestOcpp/TestAutoStart 10s timeouts on CI runners. Increase the triggerC buffer to 16 and register a t.Cleanup that stops the WS client and signals the drain goroutine to exit. We cannot close triggerC directly because ocpp_test_handler.go dispatches sends via `go func() { triggerC <- … }()`, which would panic on a closed channel.
Contributor
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In the drain goroutine, consider using
case msg, ok := <-handler.triggerCand returning when!okso that a future change that closestriggerCdoesn’t result in an infinite loop continuously reading the zero value from a closed channel.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the drain goroutine, consider using `case msg, ok := <-handler.triggerC` and returning when `!ok` so that a future change that closes `triggerC` doesn’t result in an infinite loop continuously reading the zero value from a closed channel.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The OCPP test suite shares a singleton CS server across subtests via
ocpp.Instance(), and the drain goroutine started instartChargePointforhandler.triggerCwas never stopped — every call leaked one goroutine for the lifetime of the test process. ThetriggerCchannel was also onlymake(chan …, 1), so concurrent handler invocations spawned extra sender goroutines that piled up under CI CPU pressure.This is the structural race that previous OCPP test/test-adjacent fixes did not address:
OCPP: fix flaky test deadlock between trigger handler and WS read loop— switcheddefer triggerC <- …togo func() { triggerC <- … }()so handlers no longer block the WS read loop.OCPP: dispatch RemoteStartTransaction asynchronously to avoid WebSocket deadlock— fixed a related deadlock inOnStatusNotification→RemoteStartTransactionRequest.Both removed specific blocking sends inside handlers, but the leaked drain goroutines and the capacity-1 buffer remained — visible as the intermittent
TestOcpp/TestAutoStart10s timeout (followed byconnection refusedonTestConnect/TestTimeout) seen on otherwise unrelated PRs (e.g. #29861, plus recent Solax / Huawei EMMA branches).This PR:
triggerCbuffer from 1 to 16 — handlers dispatch viago func() { triggerC <- … }(), so a tight buffer just multiplies blocked-sender goroutines.t.CleanupperstartChargePointthat callscp.Stop()(if still connected) and signals the drain goroutine to exit via adonechannel.donechannel rather thanclose(triggerC), because the handler senders would otherwise panic withsend on closed channel.Test plan
go build ./charger/...go vet ./charger/go test -run TestOcpp -count=1 -timeout 60s ./charger/— 5 separate invocations, all pass (~26 s each)🤖 Generated with Claude Code