fix: bail from EnsureComponentByHash.Handle when newObj signals requeue#87
Merged
Conversation
A newObj factory may both signal a requeue (RequeueAPIErr/RequeueErr) and return a zero-value apply config, e.g. when a status patch fails and the caller wants the next reconciliation to retry. RequeueAPIErr cancels the handler context but does not unwind the goroutine, so the framework would dereference the zero value to attach a hash annotation and segfault. Check ctx.Err() after newObj(ctx) and return early. This matches the documented context-cancellation pattern used by state.Continue and queue.OnError. Adds a regression test that plumbs a real queue.Operations whose cancel func cancels the handler context, mimicking the manager's plumbing.
ecordell
approved these changes
May 8, 2026
Contributor
ecordell
left a comment
There was a problem hiding this comment.
LGTM
this is a fine addition, though generally the idea is to just not write newObj in a way that can fail. it should generally be a stateless fn that builds the object you want to apply
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
EnsureComponentByHash.Handlecalls the user-suppliednewObj(ctx)factory and then immediately dereferences the result viae.Hash(newObj)andnewObj.WithAnnotations(...). If the factory both signals a requeue (e.g.RequeueAPIErr) and returns a zero-value apply configuration, the framework segfaults on the deref.This change adds a
ctx.Err() != nilshort-circuit immediately afternewObj(ctx)and returns early. The check uses the canonical context-cancellation pattern already documented inqueue/handlers.goand used bystate.Continue,queue.OnError, and parallel execution.A regression test (
TestEnsureComponentByHashStopsWhenNewObjSignalsRequeue) plumbs a realqueue.Operationswhose cancel func cancels the handler context — mirroring the manager's plumbing — and asserts the handler does not panic, does not callapplyObject/deleteObject, records the requeue error, and exits cleanly.Why
We hit this in production downstream. Stack trace from the operator:
The receiver in
WithAnnotations(0x0, ...)confirmed the apply config was nil. The downstream factory had a path that calledRequeueAPIErr(when a status patch failed) and returnednil.RequeueAPIErrcancels the handler context via theOperations.cancelfunc but does not unwind the goroutine, so the nil reachedHandleand crashed the controller.Fixing the contract violation in the caller is necessary regardless, but the framework should also not segfault on a misbehaving factory. The
ctx.Err()check makes the framework honor the same "stop when the queue says stop" contract thatstate.ContinueandOnErroralready use.Test plan
(*ServiceApplyConfiguration).WithAnnotations(0x0, ...)atensure_component.go:59) and passes aftergo test -race ./...passes across all packagesgolangci-lint run ./component/...reports 0 issues