Skip to content

fix: slop regex tightening + quote-or-die whitespace normalisation#47

Open
avrabe wants to merge 1 commit intomainfrom
fix/slop-regex-and-quote-normalization
Open

fix: slop regex tightening + quote-or-die whitespace normalisation#47
avrabe wants to merge 1 commit intomainfrom
fix/slop-regex-and-quote-normalization

Conversation

@avrabe
Copy link
Copy Markdown
Contributor

@avrabe avrabe commented May 1, 2026

Fixes Bug #12 from docs/agent-fleet/bugs.md (wave-1 LLM prompt engineer).

Two coupled fixes:

  • Slop filter: bare-word \bcould\b etc. dropped real findings ("could panic on null deref at line 42"). Replaced with phrase patterns anchored to surrounding context.

  • Quote-or-die: CRLF, tabs, trailing whitespace defeated String.includes. New normaliseForQuote() strips/collapses on both sides before comparing.

  • 839 tests pass (+5)

  • eslint clean

🤖 Generated with Claude Code

## Why
Wave-1 LLM prompt engineer flagged this as Bug #12 in
`docs/agent-fleet/bugs.md`. Two related problems with the slop filter:

1. **Bare-word patterns** (`\bcould\b`, `\bshould\b`, `\bmay\b`,
   `\bmight\b`, `\bconsider\s+/`) were too greedy. They dropped
   legitimate findings like "this could panic on null deref at line 42",
   "function should not be called from async context", "parseInt without
   radix might return NaN for hex strings". Real concrete claims that
   contained any of those bare words were silently filtered out.

2. **Quote-or-die was whitespace-fragile.** `String.includes(needle)`
   matched only on exact byte equality. Models routinely paraphrase
   whitespace — CRLF↔LF mismatches, tab→spaces normalisation, trailing
   whitespace stripped. Real findings were dropped because the model's
   `quoted_line` had a different leading-whitespace shape than the diff.

## What

### Slop filter rewritten as phrase patterns
Bare-word triggers replaced with phrase shapes that distinguish actual
filler from concrete claims:

  Before:  /\bcould\b/i               (kills "could panic at line 42")
  After:   /\bit\s+(?:might|could)\s+be\s+(?:wise|good|...)/i
           /\b(?:we|you)\s+should\s+(?:consider|...)/i
           /\bconsider\s+\w+ing\b/i  ("consider validating" gerund form)
           ...and a dozen more anchored to surrounding context.

Plus added explicit phrase patterns for: "in general", "in some cases",
"worth noting", "suggests that", "it would be wise", "improve
maintainability", "needs more testing", and bare hedging openers like
"it seems"/"arguably"/"likely".

### Quote-or-die now normalises before lookup
New `normaliseForQuote(s)` exported function:
- CRLF → LF
- strips per-line trailing whitespace
- collapses runs of internal whitespace within a line
- drops empty lines

Both the diff and the model's `quoted_line` are normalised before
`String.includes`, so paraphrased whitespace no longer drops real
findings. Semantic content is what matters.

## Source
Bug #12, wave-1 LLM prompt engineer (`docs/agent-fleet/bugs.md`).

## Test plan
- [x] 839 tests pass (was 834; net +5: added "it would be wise" /
      "worth noting" slop coverage; added "could panic at line 42" /
      "parseInt may return NaN" / "function should not be called from
      async context" non-slop coverage; added "this might lead to issues"
      removed from slop list as deliberately too generic to catch
      reliably)
- [x] eslint clean
- [ ] After deploy: a non-trivial AI review on a real PR should retain
      findings that previously got dropped because the model paraphrased
      whitespace or used "could"/"should"/"may" in a concrete claim.

## Risk & rollout
- Risk: medium-low. The slop filter now passes some content it previously
  rejected. Net effect is *more* findings surfaced, including some that
  may be soft. The strict-JSON contract + verdict-from-findings still
  bound output quality.
- Rollout: self-update on merge.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant