Skip to content

Add red 'stalled' status to admin dashboard banner#499

Merged
dereuromark merged 1 commit into
masterfrom
admin-dashboard-stalled-status-banner
May 14, 2026
Merged

Add red 'stalled' status to admin dashboard banner#499
dereuromark merged 1 commit into
masterfrom
admin-dashboard-stalled-status-banner

Conversation

@dereuromark
Copy link
Copy Markdown
Owner

Summary

The dashboard banner gains a third state so the operator sees a clear "action required" signal when work piles up and nothing is processing it.

state when
green Queue Running heartbeat < Queue.dashboardIdleAfter (60s default)
yellow Queue Idle heartbeat ≥ Queue.dashboardIdleAfter, OR no heartbeat & no backlog
red Queue Stalled heartbeat ≥ Queue.dashboardStalledAfter (120s default) with a pending backlog and no in-flight job; OR no worker has reported at all and there's a pending backlog or a stuck fetched job

When red, the banner expands with a small diagnostic grid (last activity absolute + relative, workers / servers, pending count) and a one-line cause hint that names the likely fault and the relevant CLI command. See screenshots below.

What red replaces

Two cases were previously ambiguous or hidden:

  1. Pending jobs piling up with workers registered but not picking them up — the banner used to say "Queue Idle" indefinitely, regardless of how stale the heartbeat got.
  2. Total cron outage — when every queue_processes row aged past Queue.defaultRequeueTimeout, QueueProcessesTable::status() returns empty and the previous template fell through to a muted info notice ("No queue status available"). That's the worst case and is now the loudest red.

Conditions that keep red from false-positiving

  • workers == 0 alone does not trigger red. In cron-driven mode that's the normal idle state for a quiet system, so red also requires pendingJobs > 0.
  • runningJobs > 0 keeps a busy worker out of red. Heartbeats fire at the top of the worker loop, not during runJob(), so a long-running task (>2 min) with more pending behind it would otherwise look stalled by heartbeat age alone. runningJobs is already computed in the controller (fetched IS NOT NULL AND completed IS NULL).

Config

Two new keys in the example config:

'dashboardIdleAfter' => 60,    // seconds
'dashboardStalledAfter' => 120, // seconds

Defaults are deliberate UI policy — human-perceptible 1-min / 2-min boundaries — not derived from queue mechanics. The existing knobs (workerLifetime, defaultRequeueTimeout, sleeptime) describe exit policy / job-reassignment safety / idle sleep cadence respectively; none of them actually mean "how stale before the dashboard should nag the admin." Installations with unusual cron cadence (e.g. slow exitwhennothingtodo cron) should raise dashboardStalledAfter past the cron interval to avoid false-red between ticks.

Files

  • templates/Admin/Queue/index.php — three-state banner, conditional red, no-status path also red on backlog/in-flight, diagnostic grid + cause hint
  • templates/layout/queue.php.status-banner.status-stalled CSS (red gradient + 4 px left accent)
  • config/app.example.php — new Queue.dashboardIdleAfter and Queue.dashboardStalledAfter keys with explanation

The dashboard banner now distinguishes three states instead of two:

- green Queue Running  (heartbeat < dashboardIdleAfter, 60s default)
- yellow Queue Idle    (heartbeat >= dashboardIdleAfter, or no heartbeat
                        and no backlog)
- red Queue Stalled    (heartbeat >= dashboardStalledAfter, 120s default,
                        with a pending backlog AND no in-flight job; OR
                        no worker has reported at all and there is a
                        pending backlog or stuck in-flight job)

Red surfaces the case the old "Queue Idle" yellow blurred away: jobs are
piling up and nothing is processing them. The previous banner also showed
a muted info notice ("No queue status available") when every worker row
had aged past Queue.defaultRequeueTimeout — the worst-case full cron
outage. That path now produces the red banner too.

The conditions are tuned to avoid false positives:

- workers == 0 alone does not trigger red. In cron-driven mode that is
  the normal idle state for a quiet system.
- runningJobs > 0 keeps a busy worker out of red. Heartbeats fire at the
  top of each loop, not during runJob(), so a long-running task (>2 min)
  with more pending behind it would otherwise look stalled by heartbeat
  age alone.

When red, the banner expands with a diagnostic grid (last activity
absolute + relative, workers, pending) and a context-specific cause hint
that names the most likely fault and the relevant CLI command.

Thresholds are exposed as two new config keys:

- Queue.dashboardIdleAfter    (default 60, seconds)
- Queue.dashboardStalledAfter (default 120, seconds)

Defaults are deliberate UI policy — human-perceptible 1-min / 2-min
boundaries — not derived from queue mechanics. None of the existing
config knobs (workerLifetime, defaultRequeueTimeout, sleeptime) actually
mean "dashboard heartbeat freshness", so deriving from them coupled the
banner to unrelated semantics. Installations with unusual cron cadence
(e.g. slow exitwhennothingtodo cron) should raise dashboardStalledAfter
past the cron interval to avoid false-red between ticks.
@dereuromark dereuromark added this to the 8.x milestone May 14, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 14, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.34%. Comparing base (06abf17) to head (6e5602a).
⚠️ Report is 1 commits behind head on master.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff            @@
##             master     #499   +/-   ##
=========================================
  Coverage     78.34%   78.34%           
  Complexity      981      981           
=========================================
  Files            45       45           
  Lines          3330     3330           
=========================================
  Hits           2609     2609           
  Misses          721      721           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dereuromark dereuromark merged commit d777481 into master May 14, 2026
16 checks passed
@dereuromark dereuromark deleted the admin-dashboard-stalled-status-banner branch May 14, 2026 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants