Record multi-file external fixture evidence #16

Merged
willamhou merged 1 commit into main from record-multifile-external-evidence-20260523
May 23, 2026
Conversation

@willamhou
Owner

Summary

  • Add fail-closed post-validation to dogfood external-fixture evidence and require post_validation_passed in external evidence verification.
  • Teach explicit edit guardrails to process multiple replace ... with ... in ... requests in one task, and emit python3 in the multi-file fixture scaffold.
  • Record verified online Python invoice multi-file external fixture evidence and update README/status/spec gap wording.
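The fail-closed rule in the first bullet can be illustrated with a minimal sketch. It assumes a simplified, hypothetical `EvidenceRecord` type; the real evidence schema is not shown in this description, and only the `post_validation_passed` field name comes from the PR. The point is that a missing or false field never counts as success.

```rust
// Hypothetical, simplified evidence record. The real evidence JSON schema is
// not shown in this PR description; only `post_validation_passed` is named there.
#[derive(Debug, Clone)]
struct EvidenceRecord {
    task_succeeded: bool,
    // `None` models a record in which the field is absent.
    post_validation_passed: Option<bool>,
}

/// Fail-closed: only an explicit `Some(true)` counts as a pass.
fn is_verified_success(record: &EvidenceRecord) -> bool {
    record.task_succeeded && record.post_validation_passed == Some(true)
}

/// Loosely mirrors the idea behind `--require-successful-external-fixtures N`
/// (an assumption about its meaning, not the actual implementation).
fn meets_requirement(records: &[EvidenceRecord], required: usize) -> bool {
    records.iter().filter(|r| is_verified_success(r)).count() >= required
}

fn main() {
    let records = vec![
        // Field absent: treated as a failure, not as "unknown, so fine".
        EvidenceRecord { task_succeeded: true, post_validation_passed: None },
        EvidenceRecord { task_succeeded: true, post_validation_passed: Some(true) },
    ];
    println!("requirement of 1 met: {}", meets_requirement(&records, 1));
    println!("requirement of 2 met: {}", meets_requirement(&records, 2));
}
```

Modelling the field as `Option<bool>` makes the absent case explicit, so a fail-open default would have to be written deliberately rather than arising by accident.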

Validation

  • cargo fmt --check
  • git diff --check
  • cargo test external_fixture --lib -- --test-threads=1
  • cargo test multifile_edit --lib -- --test-threads=1
  • cargo test derive_edit_requests_supports_multiple_file_replacements --lib -- --test-threads=1
  • cargo test offline_planner_routes_git_blame_with_path_and_line --lib -- --test-threads=1
  • cargo build
  • scripts/create-multifile-external-fixture.sh /tmp/deepseek-external-fixtures/python-invoice-multifile-20260523-release
  • target/debug/deepseek dogfood external-fixture --workdir /tmp/deepseek-external-fixtures/python-invoice-multifile-20260523-release --budget 12 --evidence-out .dscode/dogfood/external-fixture-python-invoice-multifile-evidence.json 'replace ... validate with python3 -m unittest discover -s tests'
  • target/debug/deepseek dogfood external-evidence --file .dscode/dogfood/external-fixture-python-invoice-multifile-evidence.json --out .dscode/dogfood/external-fixture-python-invoice-multifile-verification.json --require-successful-external-fixtures 1 --json
  • cargo test --lib -- --test-threads=1 (1635 passed)
  • node scripts/check-secrets.js

@github-actions

DeepSeekCode review of PR #16 (Record multi-file external fixture evidence)

Let me examine the PR diff by looking at the actual changes in the repository.


@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d3c17d8852

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/model/deepseek.rs
Comment on lines +5279 to +5283
    input.observations.iter().any(|observation| {
        observation.tool_name == "apply_patch"
            && !observation.is_failure()
            && observation.summary.contains(&request.path)
    })
[P1] Track pending explicit edits by replacement, not just file path

When multiple replace ... with ... in ... requests target the same file, the new pending-edit logic marks all of them complete after the first successful apply_patch, because edit_request_patch_succeeded only checks whether any successful patch summary contains the path. In that scenario the guardrail stops issuing subsequent required patches, so later replacements in the same file are silently skipped.
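A minimal sketch of the failure mode and one possible remedy follows. `EditRequest`, `Observation`, and both helper functions are hypothetical stand-ins, since the real types in src/model/deepseek.rs are not shown in this thread. The counting variant is only one way to track completion per replacement; it assumes each successful `apply_patch` applies exactly one replacement, which may not hold in the real code.

```rust
// Hypothetical stand-ins for the types in src/model/deepseek.rs.
#[derive(Debug)]
struct EditRequest {
    path: String,
    old_text: String,
    new_text: String,
}

#[derive(Debug)]
struct Observation {
    tool_name: String,
    failed: bool,
    summary: String,
}

impl Observation {
    fn is_failure(&self) -> bool {
        self.failed
    }
}

/// True when the observation is a successful patch whose summary names `path`.
fn is_successful_patch_on(observation: &Observation, path: &str) -> bool {
    observation.tool_name == "apply_patch"
        && !observation.is_failure()
        && observation.summary.contains(path)
}

/// Path-only check, mirroring the snippet under review.
fn succeeded_by_path(observations: &[Observation], request: &EditRequest) -> bool {
    observations.iter().any(|o| is_successful_patch_on(o, &request.path))
}

/// Counting variant: the request at `index` is complete only when the number of
/// successful patches on its path exceeds the number of earlier requests for
/// that same path. Assumes one replacement per successful patch.
fn succeeded_by_position(
    observations: &[Observation],
    requests: &[EditRequest],
    index: usize,
) -> bool {
    let request = &requests[index];
    let earlier_same_path = requests[..index]
        .iter()
        .filter(|r| r.path == request.path)
        .count();
    let successful_patches = observations
        .iter()
        .filter(|o| is_successful_patch_on(o, &request.path))
        .count();
    successful_patches > earlier_same_path
}

fn main() {
    // Two replacements aimed at the same (illustrative) file, one patch so far.
    let requests = vec![
        EditRequest { path: "src/invoice.py".into(), old_text: "a".into(), new_text: "b".into() },
        EditRequest { path: "src/invoice.py".into(), old_text: "c".into(), new_text: "d".into() },
    ];
    let observations = vec![Observation {
        tool_name: "apply_patch".into(),
        failed: false,
        summary: "patched src/invoice.py".into(),
    }];
    for (index, request) in requests.iter().enumerate() {
        println!(
            "{} -> {}: by path = {}, by position = {}",
            request.old_text,
            request.new_text,
            succeeded_by_path(&observations, request),
            succeeded_by_position(&observations, &requests, index),
        );
    }
}
```

With a single successful patch, the path-only check reports both requests as complete, while the counting variant still treats the second as pending. Matching on the replacement text itself would be stricter, but it depends on what the patch summary records, which this thread does not show.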

@willamhou willamhou merged commit 3927b2d into main May 23, 2026
9 checks passed
@willamhou willamhou deleted the record-multifile-external-evidence-20260523 branch May 23, 2026 13:58