Add createJob(unique: true) for fan-out dedup #498

Merged: dereuromark merged 1 commit into master from feature/create-job-unique-dedup on May 13, 2026

@dereuromark
Owner

Why

createJob() has no built-in dedup. When a cron-driven dispatcher enqueues per-tenant work on every tick, it cheerfully inserts a new row each time — even if last tick's job for the same tenant is still pending or stuck. The result is a slow accumulation of identical pending jobs.

In one observed case (multi-tenant CakePHP app), a single decommissioned tenant's per-account task threw on every cron tick, leaving 5876 stuck pending rows in queued_jobs. The admin dashboard could no longer render the queue list: it ran out of memory, which is the OOM that was just capped in #497.

isQueued() already exists for exactly this case. But every caller currently has to wrap each createJob() call in two lines of manual guard. This PR folds that guard into createJob() itself as an opt-in flag.

Usage

$queuedJobsTable->createJob(
    'VolunteerCheckOutReminder',
    ['account_uuid' => $accountUuid],
    [
        'reference' => 'volunteer_check_out:' . $accountUuid,
        'unique' => true,
    ],
);
  • First call: inserts a new pending job, returns the new entity.
  • Subsequent calls while that job is still pending: return the existing entity, no insert.
  • Once the original job completes: the next call inserts a fresh row, so scheduled runs continue to enqueue normally.
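
The three cases above can be modelled without a database. The following is a minimal sketch, not the plugin's implementation: `InMemoryQueue` and its array-shaped rows are hypothetical stand-ins for `QueuedJobsTable` and its entities, and the only behaviour taken from this PR is the lookup on pending rows by reference and job task.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the queued_jobs table (illustration only).
final class InMemoryQueue
{
    /** @var array<int, array{id: int, job_task: string, reference: ?string, completed: bool}> */
    private array $rows = [];

    private int $nextId = 1;

    /**
     * @param array{reference?: string, unique?: bool} $config
     * @return array{id: int, job_task: string, reference: ?string, completed: bool}
     */
    public function createJob(string $jobTask, array $config = []): array
    {
        $reference = $config['reference'] ?? null;
        $unique = $config['unique'] ?? false;

        if ($unique) {
            if ($reference === null) {
                // Mirrors the fail-fast rule: unique without a reference is a programming error.
                throw new InvalidArgumentException('unique requires a reference');
            }
            foreach ($this->rows as $row) {
                $samePair = $row['reference'] === $reference && $row['job_task'] === $jobTask;
                if ($samePair && !$row['completed']) {
                    return $row; // dedup hit: hand back the pending job, insert nothing
                }
            }
        }

        $row = [
            'id' => $this->nextId++,
            'job_task' => $jobTask,
            'reference' => $reference,
            'completed' => false,
        ];
        $this->rows[$row['id']] = $row;

        return $row;
    }

    public function complete(int $id): void
    {
        $this->rows[$id]['completed'] = true;
    }

    public function count(): int
    {
        return count($this->rows);
    }
}

$queue = new InMemoryQueue();
$config = ['reference' => 'volunteer_check_out:abc', 'unique' => true];

$first = $queue->createJob('VolunteerCheckOutReminder', $config);  // inserts
$second = $queue->createJob('VolunteerCheckOutReminder', $config); // returns $first's row
$queue->complete($first['id']);
$third = $queue->createJob('VolunteerCheckOutReminder', $config);  // inserts again
```

In this model the second call returns the row created by the first, and only the call made after `complete()` adds a second row.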

Design choices

  • Return the existing entity; do not return null and do not throw. The caller opted in to skipping, so throwing would force try/catch on the expected path, and a nullable return would break the QueuedJob return type. Returning the live entity keeps the signature stable and lets callers log or use the existing job id. Same mental model as findOrCreate.
  • Log at info, not warning. Dedup firing is the system working as intended, not an anomaly. Info level still surfaces it in dashboards without being noisy.
  • unique: true without reference throws InvalidArgumentException. Programming error — fail fast at the call site rather than silently insert.
  • Flag lives on JobConfig outside _keyMap. It's a request-time concern, not persisted state — won't leak into toArray() output or the queued_jobs row.
  • Default false, fully BC. Existing callers see no behavior change.
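
To illustrate the "outside _keyMap" choice, here is a hedged sketch of a config object that plucks a request-time flag before its strict key lookup. `SketchJobConfig`, its `KEY_MAP` constant and the two mapped fields are assumptions made for this example; the real `Queue\Config\JobConfig` has more fields and its internals may differ.

```php
<?php
declare(strict_types=1);

// Hypothetical, simplified config object (not the plugin's JobConfig).
final class SketchJobConfig
{
    /** Persisted fields: public key => column name (assumed mapping). */
    private const KEY_MAP = [
        'reference' => 'reference',
        'priority' => 'priority',
    ];

    /** @var array<string, mixed> */
    private array $fields = [];

    /** Request-time flag, deliberately kept out of $fields. */
    private bool $unique = false;

    /**
     * @param array<string, mixed> $data
     */
    public static function fromArray(array $data): self
    {
        $config = new self();

        // Pluck the flag first so the strict lookup below never sees it.
        if (array_key_exists('unique', $data)) {
            $config->unique = (bool)$data['unique'];
            unset($data['unique']);
        }

        foreach ($data as $key => $value) {
            if (!isset(self::KEY_MAP[$key])) {
                throw new InvalidArgumentException("Unknown config key: $key");
            }
            $config->fields[self::KEY_MAP[$key]] = $value;
        }

        return $config;
    }

    public function isUnique(): bool
    {
        return $this->unique;
    }

    /**
     * @return array<string, mixed> Only the persisted fields.
     */
    public function toArray(): array
    {
        return $this->fields;
    }
}

$config = SketchJobConfig::fromArray([
    'reference' => 'volunteer_check_out:abc',
    'unique' => true,
]);
$persisted = $config->toArray();
```

Because the flag is stored in its own property, `toArray()` in this sketch contains only the reference, which is the property the design bullet relies on.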

Race window

Two ticks that run close enough together can both pass the isQueued() check before either has inserted, and then both insert. A DB-level unique constraint would close that race, but it requires a migration and a decision on how callers opt in per-table, so it is out of scope here. The guard is not airtight, but its 99% effectiveness already kills the slow-buildup scenario this is built for, and the existing isQueued() doc page has the same caveat.
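
The interleaving behind that race can be replayed deterministically. This is an illustrative sketch only: the `$pending` array and the two closures are hypothetical stand-ins for the table and its check and insert steps, and real ticks would be separate processes rather than sequential statements.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory list of pending job references (illustration only).
$pending = [];

// Stand-in for the "is a pending job already queued?" check.
$isQueued = static function (string $reference) use (&$pending): bool {
    return in_array($reference, $pending, true);
};

// Stand-in for the insert that follows a negative check.
$insert = static function (string $reference) use (&$pending): void {
    $pending[] = $reference;
};

$reference = 'volunteer_check_out:abc';

// Both ticks run their check before either one has inserted.
$tickASeesJob = $isQueued($reference);
$tickBSeesJob = $isQueued($reference);

if (!$tickASeesJob) {
    $insert($reference);
}
if (!$tickBSeesJob) {
    $insert($reference);
}

$duplicates = count($pending);

// A later tick finds the pending rows and would skip its insert.
$tickCSeesJob = $isQueued($reference);
```

Both early ticks insert because neither check saw the other's row, while the later tick is deduplicated. That matches the claim above: the guard stops slow accumulation across ticks but not two inserts that overlap.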

Changes

  • Queue\Config\JobConfig::setUnique() / isUnique() — new request-time flag.
  • JobConfig::fromArray() — accepts 'unique' key, plucks it out before the _keyMap loop so it doesn't trip the strict field lookup.
  • QueuedJobsTable::createJob() — when unique is set, runs the dedup query and returns the existing entity if found.
  • docs/guide/queueing-jobs.md — new section under "Avoiding parallel (re)queueing".
  • 5 new tests covering the dedup-hit, dedup-after-completion, scoped-by-job-task, BC-without-unique, and missing-reference paths.

How verified locally

  • vendor/bin/phpunit — 193 tests, all green (pre-existing deprecations from ExecuteTaskTest only).
  • vendor/bin/phpstan analyze — no errors.
  • vendor/bin/phpcs --parallel=16 — no errors.

When a fan-out dispatcher enqueues per-tenant work on every cron tick,
plain `createJob()` happily inserts a duplicate row even if the previous
tick's job for the same tenant is still pending or stuck. The result is
a slow accumulation of identical pending jobs -- in one observed case,
5876 stuck `VolunteerCheckOutReminder` rows from a single tenant whose
DB connection had been decommissioned.

`isQueued()` already exists for this, but every caller has to wrap each
`createJob()` in a two-line manual guard. This adds the guard to
`createJob()` itself as an opt-in `unique` flag:

    $queuedJobsTable->createJob(
        'VolunteerCheckOutReminder',
        ['account_uuid' => $accountUuid],
        [
            'reference' => 'volunteer_check_out:' . $accountUuid,
            'unique' => true,
        ],
    );

When `unique` is set and a pending (`completed IS NULL`) job exists for
the same `(reference, resolved job_task)` pair, the existing entity is
returned and no new row is inserted. The dedup hit is logged at info
level. `unique` without a `reference` throws `InvalidArgumentException`
at the call site -- failing fast beats silently inserting an undeduped
row.

The flag lives on `JobConfig` as a request-time property outside
`_keyMap`, so it never leaks into `toArray()` output or the
`queued_jobs` row. Default is `false`, fully BC.

Race window: two ticks landing simultaneously can both pass the
`isQueued()` check and both insert. A DB-level unique constraint would
close that, but it requires a migration and a decision on how callers
should opt in per-table -- out of scope for this PR. The 99%
effectiveness already kills the slow-buildup scenario this is built
for.

codecov-commenter commented May 13, 2026

Codecov Report

❌ Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.34%. Comparing base (c575000) to head (c93d80e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/Config/JobConfig.php | 62.50% | 3 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #498      +/-   ##
============================================
+ Coverage     78.11%   78.34%   +0.23%     
- Complexity      975      981       +6     
============================================
  Files            45       45              
  Lines          3303     3330      +27     
============================================
+ Hits           2580     2609      +29     
+ Misses          723      721       -2     

@dereuromark dereuromark merged commit 06abf17 into master May 13, 2026
16 checks passed
@dereuromark dereuromark deleted the feature/create-job-unique-dedup branch May 13, 2026 17:17