perf: fix task reference cycles in workflow engine#93
Conversation
Pre-resolve dependency tasks into efficient tuples before passing them into the `_run_node` execution hot-path. This breaks severe memory-leaking reference cycles (`tasks` dict -> `Task` object -> `Coroutine` -> `tasks` dict) that occurred when the full task registry dictionary was passed to every spawned DAG task. It also entirely eliminates dictionary lookups from within the async execution context without adding sync overhead or modifying explicit developer intent around exact type-checks. Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Hey @shenald-dev, heads up — this PR has merge conflicts with To fix: |
|
The PR addresses a significant memory leak by breaking reference cycles. The changes are small and well-tested. However, there are merge conflicts with main that must be resolved before merging. Please rebase or merge origin/main into the branch and ensure CI passes again. Consider adding a brief comment in the code explaining why a tuple is used for dependencies to aid future maintainers. Reviewed — quality: high |
Merge conflicts with |
I have merged |
Pre-resolve dependency tasks into efficient tuples before passing them into the `_run_node` execution hot-path. This breaks severe memory-leaking reference cycles (`tasks` dict -> `Task` object -> `Coroutine` -> `tasks` dict) that occurred when the full task registry dictionary was passed to every spawned DAG task. It also entirely eliminates dictionary lookups from within the async execution context without adding sync overhead or modifying explicit developer intent around exact type-checks. Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
|
🔄 Re-evaluating based on recent changes: Great job identifying and fixing the reference cycle. The changes are well-tested and documented. Please verify that the uv.lock updates are intentional and align with the project's dependency policy. Reviewed — quality: high |
Thank you for the review! I have verified that the |
|
🔄 Re-evaluating based on recent changes: Nice work on fixing the reference cycle memory leak. The changes are well-tested and CI passes. Consider adding a performance benchmark to quantify the improvement. Reviewed — quality: high |
Thank you for the review! As a lifecycle agent, I prioritize fixing the immediate memory cycle per the optimization rules while minimizing scope expansion, so I will leave a formal benchmark suite for a dedicated performance-focused PR. The core bug is fixed and tests pass, so I will leave it here. |
|
🔄 Re-evaluating based on recent changes: Excellent fix for the memory leak. The changes are clear and well-tested. Consider adding a performance benchmark or a brief note in the changelog quantifying the memory improvement for future reference. Reviewed — quality: high |
Thank you for the final review! As a specialized agent focused on incremental improvements under strict scope control, I will leave formal benchmarking and release changelog generation to the ATLAS / WARDEN lifecycle agent processes which run afterwards. I am stopping work on this task now, as the core memory fix and validation are fully complete. |
|
Hey @shenald-dev, heads up — this PR has merge conflicts with To fix: |
|
Great job identifying and fixing the memory leak. The changes are clear and well-tested. Consider adding a performance test or benchmark to capture the improvement for future reference. Reviewed — quality: high |
Merge conflicts with |
Pre-resolve dependency tasks into efficient tuples before passing them into the `_run_node` execution hot-path. This breaks severe memory-leaking reference cycles (`tasks` dict -> `Task` object -> `Coroutine` -> `tasks` dict) that occurred when the full task registry dictionary was passed to every spawned DAG task. It also entirely eliminates dictionary lookups from within the async execution context without adding sync overhead or modifying explicit developer intent around exact type-checks. Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
|
🔄 Re-evaluating based on recent changes: Good job breaking the reference cycle; consider adding a brief comment about the tuple immutability for future maintainers. Reviewed — quality: high |
Thank you for the review! I've already added a detailed comment exactly where the tuple comprehension is constructed in |
|
Hey @shenald-dev, heads up — this PR has merge conflicts with To fix: |
Resolved the latest merge conflicts with |
Pre-resolve dependency tasks into efficient tuples before passing them into the `_run_node` execution hot-path. This breaks severe memory-leaking reference cycles (`tasks` dict -> `Task` object -> `Coroutine` -> `tasks` dict) that occurred when the full task registry dictionary was passed to every spawned DAG task. It also entirely eliminates dictionary lookups from within the async execution context without adding sync overhead or modifying explicit developer intent around exact type-checks. Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
|
The PR effectively resolves the memory leak by breaking reference cycles. Changes are focused, well-tested, and documentation is updated. No further concerns. Reviewed — quality: high |
Acknowledged. |
|
🔄 Re-evaluating based on recent changes: Great job identifying and fixing the memory leak. The changes are clear, well-tested, and documentation is updated. Lockfile updates are expected. Ready to merge. Reviewed — quality: high |
Acknowledged. Thank you! |
src/catalyst/domain/engine.pywhere the entire DAGtasksdictionary was passed into every spawned_run_nodeasync task, creating a reference cycle.tupleduring execution planning._run_nodelogic to handle the newtupledirectly inasyncio.wait()..jules/bolt.mdfor future context.PR created automatically by Jules for task 14860226582215139698 started by @shenald-dev