Record multi-file external fixture evidence #16
DeepSeekCode review of PR #16 (Record multi-file external fixture evidence)
Let me examine the PR diff by looking at the actual changes in the repository.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3c17d8852
input.observations.iter().any(|observation| {
    observation.tool_name == "apply_patch"
        && !observation.is_failure()
        && observation.summary.contains(&request.path)
})
Track pending explicit edits by replacement, not just file path
When multiple replace ... with ... in ... requests target the same file, the new pending-edit logic marks all of them complete after the first successful apply_patch, because edit_request_patch_succeeded only checks whether any successful patch summary contains the path. In that scenario the guardrail stops issuing subsequent required patches, so later replacements in the same file are silently skipped.
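The difference between a path-only check and a per-replacement check can be sketched as follows. This is a minimal illustration, not the PR's code: the `EditRequest` and `Observation` types, their fields, and the helper names are hypothetical, and the stricter check assumes that a successful `apply_patch` summary includes the inserted text, which the PR does not confirm.

```rust
// Hypothetical stand-ins for the PR's types; the real definitions may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct EditRequest {
    pub path: String,
    pub old_text: String,
    pub new_text: String,
}

#[derive(Debug, Clone)]
pub struct Observation {
    pub tool_name: String,
    pub failed: bool,
    pub summary: String,
}

impl Observation {
    pub fn is_failure(&self) -> bool {
        self.failed
    }
}

/// Path-only check, mirroring the reviewed snippet: any successful patch
/// that mentions the file marks every request for that file as done.
pub fn patch_succeeded_by_path(observations: &[Observation], request: &EditRequest) -> bool {
    observations.iter().any(|o| {
        o.tool_name == "apply_patch" && !o.is_failure() && o.summary.contains(&request.path)
    })
}

/// Per-replacement check. ASSUMPTION: the patch summary also contains the
/// inserted text, so a request is done only if its own replacement appears.
pub fn patch_succeeded_by_replacement(
    observations: &[Observation],
    request: &EditRequest,
) -> bool {
    observations.iter().any(|o| {
        o.tool_name == "apply_patch"
            && !o.is_failure()
            && o.summary.contains(&request.path)
            && o.summary.contains(&request.new_text)
    })
}

/// Requests that still need a patch under the per-replacement check.
pub fn pending_requests<'a>(
    observations: &[Observation],
    requests: &'a [EditRequest],
) -> Vec<&'a EditRequest> {
    requests
        .iter()
        .filter(|r| !patch_succeeded_by_replacement(observations, r))
        .collect()
}
```

With two requests against the same file and one successful patch, the path-only check reports both as complete, while the per-replacement check keeps the second one pending. If the summary does not carry the replacement text, another key (for example, a per-request identifier recorded alongside the observation) would be needed instead.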
Summary
- dogfood external-fixture evidence and require post_validation_passed in external evidence verification.
- replace ... with ... in ... requests in one task, and emit python3 in the multi-file fixture scaffold.
Validation
- cargo fmt --check
- git diff --check
- cargo test external_fixture --lib -- --test-threads=1
- cargo test multifile_edit --lib -- --test-threads=1
- cargo test derive_edit_requests_supports_multiple_file_replacements --lib -- --test-threads=1
- cargo test offline_planner_routes_git_blame_with_path_and_line --lib -- --test-threads=1
- cargo build
- scripts/create-multifile-external-fixture.sh /tmp/deepseek-external-fixtures/python-invoice-multifile-20260523-release
- target/debug/deepseek dogfood external-fixture --workdir /tmp/deepseek-external-fixtures/python-invoice-multifile-20260523-release --budget 12 --evidence-out .dscode/dogfood/external-fixture-python-invoice-multifile-evidence.json 'replace ... validate with python3 -m unittest discover -s tests'
- target/debug/deepseek dogfood external-evidence --file .dscode/dogfood/external-fixture-python-invoice-multifile-evidence.json --out .dscode/dogfood/external-fixture-python-invoice-multifile-verification.json --require-successful-external-fixtures 1 --json
- cargo test --lib -- --test-threads=1 (1635 passed)
- node scripts/check-secrets.js