Skip to content

Revert "Clean up dropped resources in relayer construction (#1927)"#2026

Draft
ogtownsend wants to merge 1 commit intomainfrom
back-out-loop-resource-clean-up-on-main-branch
Draft

Revert "Clean up dropped resources in relayer construction (#1927)"#2026
ogtownsend wants to merge 1 commit intomainfrom
back-out-loop-resource-clean-up-on-main-branch

Conversation

@ogtownsend
Copy link
Copy Markdown
Contributor

This reverts commit c8e0d77.

This commit was reverted in 2.41.3-rc1 due to the keystore dropped socket connection
We need to make sure core develop branch stays in sync with the currently released CCIP version

Copilot AI review requested due to automatic review settings April 30, 2026 22:10
@ogtownsend ogtownsend requested review from a team as code owners April 30, 2026 22:10
@ogtownsend ogtownsend requested a review from jmank88 April 30, 2026 22:10
@github-actions
Copy link
Copy Markdown

👋 ogtownsend, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@github-actions
Copy link
Copy Markdown

📊 API Diff Results

No changes detected for module github.com/smartcontractkit/chainlink-common

View full report

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reverts commit c8e0d77d... (“Clean up dropped resources in relayer construction”) to keep develop aligned with the currently released CCIP version after a keystore socket drop regression in 2.41.3-rc1.

Changes:

  • Reverts relayer/CCIP provider construction to stop returning/propagating certain net.Resources dependency sets.
  • Adjusts several error paths in relayer construction to return nil deps.
  • Alters CSA keystore resource wiring in the relayer server’s NewRelayer path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return nil, net.ErrConnDial{Name: "CSAKeystore", ID: request.KeystoreCSAID, Err: err}
}
ksCSARes := net.Resource{Closer: ksCSAConn, Name: "CSAKeystore"}
ksCSARes := net.Resource{Closer: ksConn, Name: "CSAKeystore"}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ksCSARes is wrapping ksConn instead of ksCSAConn. This will cause the Keystore connection to be closed twice and will leak the CSA keystore connection (and any cleanup that relies on ksCSARes will act on the wrong connection). Use ksCSAConn as the Closer for the CSA resource.

Suggested change
ksCSARes := net.Resource{Closer: ksConn, Name: "CSAKeystore"}
ksCSARes := net.Resource{Closer: ksCSAConn, Name: "CSAKeystore"}

Copilot uses AI. Check for mistakes.
Comment on lines 72 to 74
if err != nil {
return 0, deps, fmt.Errorf("Failed to create relayer client: failed to serve CSA keystore: %w", err)
return 0, nil, fmt.Errorf("Failed to create relayer client: failed to serve CSA keystore: %w", err)
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this error path, the keystore server (ksRes) may already have been started and added to deps, but the function returns nil deps. Because clientConn.refresh() only closes the deps that are returned, this can leak the already-started server(s) when a later ServeNew/RPC step fails. Return the accumulated deps on error, or explicitly close them before returning.

Copilot uses AI. Check for mistakes.
Comment on lines 91 to 95
if err != nil {
return 0, deps, fmt.Errorf("Failed to create relayer client: failed request: %w", err)
return 0, nil, fmt.Errorf("Failed to create relayer client: failed request: %w", err)
}
return reply.RelayerID, deps, nil
return reply.RelayerID, nil, nil
})
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On success this returns nil deps even though several resources were added to deps. This means the keystore/CSA keystore/capability registry servers created via ServeNew are never tracked for cleanup by the clientConn lifecycle (refresh/teardown), which can leave long-lived servers running and accumulate resources if NewRelayer is called multiple times. Either return deps here, or move ownership/cleanup of these resources somewhere that is guaranteed to close them when the relayer is closed.

Copilot uses AI. Check for mistakes.
})
if err != nil {
return 0, deps, err
return 0, nil, err
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ExtraDataCodecBundle is successfully served and then NewCCIPProvider returns an error, this currently returns nil deps, so edcRes won’t be cleaned up by the clientConn lifecycle and the server can be left running. Return the accumulated deps on error (or close edcRes before returning).

Suggested change
return 0, nil, err
return 0, deps, err

Copilot uses AI. Check for mistakes.
@ogtownsend ogtownsend marked this pull request as draft May 4, 2026 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants