Skip to content

perf(engine): Fix reference cycle memory leak in DAG node execution#101

Open
shenald-dev wants to merge 4 commits into
mainfrom
jules-6802429901203038322-3d54f261
Open

perf(engine): Fix reference cycle memory leak in DAG node execution#101
shenald-dev wants to merge 4 commits into
mainfrom
jules-6802429901203038322-3d54f261

Conversation

@shenald-dev
Copy link
Copy Markdown
Owner

Fix reference cycle memory leak in _run_node.


PR created automatically by Jules for task 6802429901203038322 started by @shenald-dev

In `_run_node`, passing the entire `tasks` dictionary as an argument into the coroutine—which is itself wrapped as an `asyncio.Task` and stored back into the exact same `tasks` dictionary—creates a heavy reference cycle.

This circular dependency (`tasks` dict -> `Task` object -> `Coroutine` -> `tasks` dict) prevents the Python garbage collector from rapidly reclaiming memory during intense DAG execution loads.

This commit breaks the cycle by synchronously extracting the required dependent tasks as an immutable `tuple` (since the tasks are guaranteed to exist due to the topological order evaluation sequence) and passing only this pre-resolved tuple into the execution context.

Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@shenald-dev
Copy link
Copy Markdown
Owner Author

Hey @shenald-dev, heads up — this PR has merge conflicts with main.

To fix:

git checkout jules-6802429901203038322-3d54f261
git fetch origin && git merge origin/main

@shenald-dev shenald-dev added the bug Something isn't working label May 12, 2026
@shenald-dev
Copy link
Copy Markdown
Owner Author

The fix addresses a reference cycle memory leak in _run_node, which is good. However, the PR currently conflicts with main. Please rebase or merge the latest main to resolve conflicts. After that, consider adding a test or benchmark to verify the leak is fixed, and ensure any changes to .jules/bolt.md are appropriate.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

The fix addresses a reference cycle memory leak in _run_node, which is good. However, the PR currently conflicts with main. Please rebase or merge the latest main to resolve conflicts. After that, consider adding a test or benchmark to verify the leak is fixed, and ensure any changes to .jules/bolt.md are appropriate.

Reviewed — quality: high

I have successfully merged the latest changes from main, resolved the merge conflicts, updated the .jules/bolt.md journal to retain both important insights, and added a specific unit test test_no_task_leak in tests/test_memory_leak.py to continuously verify that the reference cycle memory leak in _run_node remains fixed.

@google-labs-jules
Copy link
Copy Markdown
Contributor

Hey @shenald-dev, heads up — this PR has merge conflicts with main.

To fix:

git checkout jules-6802429901203038322-3d54f261
git fetch origin && git merge origin/main

Acknowledged. I have fetched and merged origin/main and resolved the merge conflicts manually.

This commit adds a test case to ensure the reference cycle memory leak in DAG node execution is resolved. It uses weak references to verify that the WorkflowEngine object and its internal tasks can be successfully garbage collected. Also resolved merge conflicts with main.

Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
@shenald-dev
Copy link
Copy Markdown
Owner Author

🔄 Re-evaluating based on recent changes:

Good fix with added test and changelog. Ensure the memory leak test is included in the CI pipeline and consider adding a performance benchmark to quantify the improvement.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

🔄 Re-evaluating based on recent changes:

Good fix with added test and changelog. Ensure the memory leak test is included in the CI pipeline and consider adding a performance benchmark to quantify the improvement.

Reviewed — quality: high

Acknowledged. The test is included in the test suite and runs with pytest. I'll consider adding performance benchmarks for this specific path in the future.

@shenald-dev
Copy link
Copy Markdown
Owner Author

🔄 Re-evaluating based on recent changes:

The fix looks solid and includes a new memory leak test. Consider adding a simple benchmark to quantify the memory improvement for future regression checks.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

🔄 Re-evaluating based on recent changes:

The fix looks solid and includes a new memory leak test. Consider adding a simple benchmark to quantify the memory improvement for future regression checks.

Reviewed — quality: high

Acknowledged. I'll consider adding performance benchmarks for this specific path in the future.

@shenald-dev
Copy link
Copy Markdown
Owner Author

🔄 Re-evaluating based on recent changes:

LGTM. Memory leak fix with test coverage. Ensure lockfile changes are intentional.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

🔄 Re-evaluating based on recent changes:

LGTM. Memory leak fix with test coverage. Ensure lockfile changes are intentional.

Reviewed — quality: high

Acknowledged. The lockfile changes were from the sync when tests were being run, and they are intentional. Thanks for the review.

@shenald-dev
Copy link
Copy Markdown
Owner Author

Hey @shenald-dev, heads up — this PR has merge conflicts with main.

To fix:

git checkout jules-6802429901203038322-3d54f261
git fetch origin && git merge origin/main

@google-labs-jules
Copy link
Copy Markdown
Contributor

Hey @shenald-dev, heads up — this PR has merge conflicts with main.

To fix:

git checkout jules-6802429901203038322-3d54f261
git fetch origin && git merge origin/main

I have successfully fetched and merged origin/main to resolve the latest merge conflicts. The branch is now up to date.

Resolved merge conflicts with `origin/main` to ensure the branch is up to date and clean.

Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
@shenald-dev
Copy link
Copy Markdown
Owner Author

Good fix for the memory leak. Consider adding a performance benchmark to quantify the improvement and ensure no regressions in node execution time.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

Good fix for the memory leak. Consider adding a performance benchmark to quantify the improvement and ensure no regressions in node execution time.

Reviewed — quality: high

Acknowledged. I'll consider adding performance benchmarks for this specific path in the future.

@shenald-dev
Copy link
Copy Markdown
Owner Author

🔄 Re-evaluating based on recent changes:

Thanks for fixing the memory leak. The changes look good and CI passes. Consider verifying that the uv.lock changes are only due to dependency updates and not accidental. Adding a performance benchmark or a test that asserts memory usage doesn't grow would increase confidence. Also ensure the changelog entry follows project conventions.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

🔄 Re-evaluating based on recent changes:

Thanks for fixing the memory leak. The changes look good and CI passes. Consider verifying that the uv.lock changes are only due to dependency updates and not accidental. Adding a performance benchmark or a test that asserts memory usage doesn't grow would increase confidence. Also ensure the changelog entry follows project conventions.

Reviewed — quality: high

Acknowledged. The uv.lock changes are solely due to the testing environment sync and they are intended. The new test test_no_task_leak ensures that garbage collection accurately runs through the DAG execution nodes after completion to confidently verify that memory does not leak. I will ensure any changes to CHANGELOG.md closely mirror standard project conventions.

@shenald-dev
Copy link
Copy Markdown
Owner Author

Hey @shenald-dev, heads up — this PR has merge conflicts with main.

To fix:

git checkout jules-6802429901203038322-3d54f261
git fetch origin && git merge origin/main

Resolved all outstanding merge conflicts and verified changes locally with pytest.

Co-authored-by: shenald-dev <245350826+shenald-dev@users.noreply.github.com>
@shenald-dev
Copy link
Copy Markdown
Owner Author

LGTM. The fix resolves the reference cycle, includes a test, updates documentation, and CI passes. Consider adding a performance benchmark in future to quantify improvement.

Reviewed — quality: high

@google-labs-jules
Copy link
Copy Markdown
Contributor

LGTM. The fix resolves the reference cycle, includes a test, updates documentation, and CI passes. Consider adding a performance benchmark in future to quantify improvement.

Reviewed — quality: high

Acknowledged. Thanks for the review.

@shenald-dev shenald-dev added the documentation Improvements or additions to documentation label May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working documentation Improvements or additions to documentation performance refactor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant