Addition of diagnostic collectors and bug fixes #6

Merged
vyruss merged 19 commits into main from feat/additions_202605 on May 8, 2026
Conversation

@vyruss (Collaborator) commented May 2, 2026

Summary

Adds 5 new diagnostic collectors and extends 5 existing collectors with columns supporting deeper analysis: bloat, vacuum health, wraparound headroom, index validity/usage, sequence headroom, SSL audit, replication slot spill, extension version metadata. Also fixes the read_ahead system collector and splits the hostname collector to disambiguate kernel hostname from resolver-canonical FQDN.

New collectors:

  • stat_ssl (postgresql/) - per-backend SSL/TLS state from pg_stat_ssl
  • stat_replication_slots (postgresql/) - logical replication slot spill counters from pg_stat_replication_slots (PG14+)
  • sequences (databases/{db}/) - sequences with last/min/max values from pg_sequences
  • bloat (databases/{db}/) - heuristic table bloat estimate from pg_stats
  • pgstattuple (databases/{db}/) - authoritative bloat via pgstattuple_approx() when the extension is installed; disabled by default, run with -include pgstattuple

Extended collectors:

  • databases - datfrozenxid, datminmxid, datconnlimit, datistemplate, datallowconn
  • running_activity_maxage - adds max_lock_wait_age (longest current lock wait, NULL when nobody is waiting on a lock)
  • tables - rebuilt on pg_class + pg_stat_all_tables for vacuum/analyze timestamps and ages in seconds, dead-tup counters, reloptions, relpersistence, relpages, heap size, table size, reltoastrelid/toast size; capped at top 1000 by size
  • indexes - rebuilt on pg_index + pg_stat_all_indexes for semantic-key columns, indisvalid, idx_scan/idx_tup_read/idx_tup_fetch, relpages, index size; capped at top 1000 by size
  • available_extensions - now sourced from pg_available_extension_versions for per-version installed/trusted/superuser/relocatable/requires metadata

Bug fixes:

  • read_ahead system collector now handles virtio (/dev/vd*) and xen (/dev/xvd*) block devices and gracefully skips unmatched globs; output is now labeled per device.
  • hostname system collector split: system/hostname.out now contains plain hostname (kernel hostname / gethostname()), and a new system/hostname_fqdn.out contains hostname -f (resolver canonical name).
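The read_ahead fix boils down to "iterate candidate devices, skip ones that don't exist, label each value". A minimal pure-Go sketch of that shape (the real collector shells out to blockdev --getra; here the getter is injected so the example runs without root or real /dev entries):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// collectReadAhead mirrors the fixed read_ahead behaviour: walk candidate
// block-device names, silently skip devices the getter cannot answer for
// (the analogue of an unmatched glob), and label each read-ahead value per
// device. The getra func stands in for `blockdev --getra <dev>`.
func collectReadAhead(devices []string, getra func(string) (int, bool)) string {
	sort.Strings(devices)
	var b strings.Builder
	for _, d := range devices {
		if ra, ok := getra(d); ok { // absent/unsupported devices are skipped
			fmt.Fprintf(&b, "%s: %d\n", d, ra)
		}
	}
	return b.String()
}

func main() {
	// Fake device table covering SCSI, virtio, and xen names; nvme0n1 is
	// "unmatched" and must be skipped without error.
	fake := map[string]int{"/dev/sda": 256, "/dev/vda": 512, "/dev/xvda": 128}
	out := collectReadAhead(
		[]string{"/dev/sda", "/dev/vda", "/dev/xvda", "/dev/nvme0n1"},
		func(d string) (int, bool) { ra, ok := fake[d]; return ra, ok },
	)
	fmt.Print(out)
}
```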

Collection control:

  • New -exclude name1,name2 flag skips named tasks (works for both instance-level and per-database tasks).
  • New -include name1,name2 flag enables default-disabled tasks. Currently the only default-disabled task is pgstattuple, because pgstattuple_approx() reads heap pages of every user table and shouldn't auto-fire just because the extension is present.
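The precedence and matching rules above can be sketched in a few lines. This is an assumption-laden reconstruction from the review notes (the real shouldRunTask in radar.go may differ in names and details): -include overrides -exclude, which overrides the default-disabled list, and matching is on the trailing segment of a task name, so "pgstattuple" also matches "databases/mydb/pgstattuple":

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRunTask decides whether a named task runs. Precedence:
// include > exclude > default-disabled > run by default.
func shouldRunTask(name string, include, exclude, defaultDisabled []string) bool {
	match := func(list []string) bool {
		for _, n := range list {
			// Exact name, or trailing path segment (per-database tasks).
			if name == n || strings.HasSuffix(name, "/"+n) {
				return true
			}
		}
		return false
	}
	if match(include) {
		return true
	}
	if match(exclude) || match(defaultDisabled) {
		return false
	}
	return true
}

func main() {
	dd := []string{"pgstattuple"}
	fmt.Println(shouldRunTask("databases/app/pgstattuple", nil, nil, dd))                      // disabled by default
	fmt.Println(shouldRunTask("databases/app/pgstattuple", []string{"pgstattuple"}, nil, dd)) // -include wins
	fmt.Println(shouldRunTask("tables", nil, []string{"tables"}, dd))                          // excluded
}
```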

Archive metadata:

  • radar.out (archive root) - identifies the radar binary that produced the archive: version (from the build-time -X main.version stamp) and commit (short git SHA).
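A minimal sketch of how such a metadata entry can be produced in Go, assuming the ldflags-stamped version variable and the standard runtime/debug build info (the actual writeRadarMeta field names and layout are assumptions from this description):

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// version is stamped at build time via -ldflags "-X main.version=v1.2.3";
// it falls back to "dev" for plain local builds.
var version = "dev"

// radarMeta approximates what a writeRadarMeta-style helper could emit into
// radar.out: the binary version plus a short commit SHA taken from the
// vcs.revision build setting embedded by the Go toolchain.
func radarMeta() string {
	commit := "unknown"
	if bi, ok := debug.ReadBuildInfo(); ok {
		for _, s := range bi.Settings {
			if s.Key == "vcs.revision" && len(s.Value) >= 7 {
				commit = s.Value[:7] // truncate to a short SHA
			}
		}
	}
	return fmt.Sprintf("version: %s\ncommit: %s\n", version, commit)
}

func main() {
	fmt.Print(radarMeta())
}
```

Builds made outside a git checkout carry no vcs.revision setting, which is why the sketch keeps an explicit "unknown" fallback.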

Checklist

  • Unit tests (TestQueryTaskColumnsCoverage asserts new column tokens and LIMIT 1000 stay in the queries; TestPostgreSQLCollectors covers stat_ssl and stat_replication_slots)
  • Regression tests (existing tests still pass)
  • Docs/README updated (README.md, docs/index.md, DATA.md, docs/data.md)
  • Integration test impact (all 6 scenarios pass on PG18; pgstattuple correctly skipped when extension absent; stat_replication_slots PG14+ guarded by existing 42P01 SkipError path)
  • Real-world validation (collected against a PG18 instance — SSL backends enumerated, sequences enumerated, bloat detected on the largest table, top-by-size ordering verified, pgstattuple correctly skipped when extension absent)
  • Security checks (no new attack surfaces, no new dependencies)
  • PR links to a tracking issue/ticket

@coderabbitai Bot commented May 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Docs, CLI, and collection logic updated: archive now includes a radar.out build/commit entry; new -exclude/-include task filters added; PostgreSQL collectors expanded (bloat/pgstattuple/sequences, stat_replication_slots, stat_ssl, richer tables/indexes); system hostname and read-ahead collection broadened; tests added for task selection and query coverage.

Changes

PostgreSQL Data Collection Expansion

  • Data Shape / Documentation (DATA.md, README.md, docs/data.md, docs/index.md): Docs updated: pg_available_extension_versions replaces pg_available_extensions; pg_stat_ssl and pg_stat_replication_slots documented; per-database outputs add bloat.tsv, pgstattuple.tsv (conditional), sequences.tsv; tables/indexes documentation now references pg_class+pg_stat_all_tables and pg_index+pg_stat_all_indexes with expanded size/stat/vacuum details.
  • Core Implementation — Query Tasks (postgres_tasks.go): explicit column select from pg_available_extension_versions; databases adds frozenxid_age/minmxid_age; running_activity_maxage adds max lock-wait age; new instance task stat_replication_slots; per-database tasks added/replaced: bloat, pgstattuple, sequences; indexes and tables replaced with catalog+stat joins returning richer metrics.
  • Wiring / Collector Registration (postgres_test.go): TestPostgreSQLCollectors expected list updated to include stat_replication_slots and stat_ssl, with stat_slru reordered.
  • Tests / Validation (radar_test.go): TestQueryTaskColumnsCoverage added to assert required SQL tokens/subquery fragments exist in postgresQueryTasks and perDatabaseQueryTasks queries.

System Tasks & Hostname Reporting

  • Command/Task Behavior (system_tasks_linux.go): read_ahead task changed from a single blockdev --getra call to a shell loop over /dev/sd*, /dev/nvme*, /dev/vd*, /dev/xvd*, emitting per-device non-empty RA values.
  • Command/Task Addition (system_tasks_darwin.go): added hostname_fqdn task that runs hostname -f and writes system/hostname_fqdn.out on macOS.
  • Documentation / Data Shape (DATA.md, docs/data.md): hostname docs split: system/hostname.out from hostname (kernel/gethostname) and new system/hostname_fqdn.out from hostname -f (Linux/macOS).

Task Selection, Archive Metadata & CLI

  • Config / Flags (radar.go): added -exclude and -include flags; Config gains Exclude []string and Include []string; default-disabled task list added (e.g., pgstattuple disabled unless included).
  • Selection Logic (radar.go): added shouldRunTask logic matching trailing task-name segments; -include takes precedence over -exclude; collection now skips tasks that fail the filter (logged under very-verbose).
  • Archive Identity (radar.go): added writeRadarMeta to write radar.out at ZIP root with version (ldflags) and truncated vcs.revision from build info; called before collection.
  • Tests / Validation (radar_test.go): TestShouldRunTask added to validate precedence rules between default-disabled tasks and -include/-exclude filters across task name variants.

Poem

🐰 I hopped through rows and catalog caves,
Fetching sizes, slots, and SSL waves.
Bloat and sequences, queries trimmed neat,
Hostnames split, and devices all meet.
Little rabbit cheers: data grows complete.

🚥 Pre-merge checks: ✅ 5 passed
  • Title check — Passed: the title 'Addition of diagnostic collectors and bug fixes' accurately reflects the main changes: new collectors, extended collectors, bug fixes, and archive metadata enhancements.
  • Docstring Coverage — Passed: docstring coverage is 100.00%, above the required 80.00% threshold.
  • Linked Issues check — Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — Passed: check skipped because no linked issues were found for this pull request.
  • Description check — Passed: the description clearly details the changes across new collectors, extended collectors, bug fixes, collection control, and archive metadata, all aligned with the changeset.



@codacy-production Bot commented May 2, 2026

Up to standards ✅

  • 🟢 Issues: 1 new issue (Complexity: 1 medium)
  • 🟢 Metrics: 0 duplication


@vyruss changed the title from "Add diagnostic collectors (bloat, sequences, SSL, slot stats) and extend existing TSVs" to "Add diagnostic collectors" on May 2, 2026

@coderabbitai Bot left a comment

🧹 Nitpick comments (1)
radar_test.go (1)

532-559: ⚡ Quick win

Expand coverage checks for changed available_extensions and age fields.

This test is a great guardrail; add assertions for the available_extensions source token and the new computed database age aliases so future query edits don’t silently regress those additions.

🔧 Suggested test additions
  checks := []struct {
      taskList    string
      taskName    string
      mustContain []string
  }{
+     {"postgres", "available_extensions", []string{
+         "pg_available_extension_versions", "installed", "trusted", "requires",
+     }},
      {"postgres", "databases", []string{
          "datfrozenxid", "datminmxid", "datconnlimit",
-         "datistemplate", "datallowconn",
+         "datistemplate", "datallowconn", "frozenxid_age", "minmxid_age",
      }},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 532 - 559, Update the test expectations so the
databases coverage includes the new available_extensions source token and the
computed database age aliases: add "available_extensions" to the
{"postgres","databases",...} expected fields and add the age-related alias names
used by the query (the computed database age column(s) labeled in the query,
e.g., "age" and any alias variant) to the appropriate databases/perDB
expectations (look for the {"postgres","databases",...} and
{"perDB","tables"/"indexes"/...} entries and append the new field names).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d9e9d5c8-0d3c-4b73-a361-42f1a8dac090

📥 Commits

Reviewing files that changed from the base of the PR and between 5ae14db and d893c6a.

📒 Files selected for processing (7)
  • DATA.md
  • README.md
  • docs/data.md
  • docs/index.md
  • postgres_tasks.go
  • postgres_test.go
  • radar_test.go

@vyruss changed the title from "Add diagnostic collectors" to "Add diagnostic collectors (bloat, sequences, SSL, slot stats), extend existing TSVs, and fix read_ahead" on May 2, 2026
@vyruss changed the title from "Add diagnostic collectors (bloat, sequences, SSL, slot stats), extend existing TSVs, and fix read_ahead" to "Add diagnostic collectors and fix read_ahead" on May 2, 2026
@coderabbitai Bot left a comment

🧹 Nitpick comments (3)
system_tasks_linux.go (1)

338-338: ⚡ Quick win

Consider adding /dev/md* (software RAID) to the device glob list.

Enterprise PostgreSQL deployments frequently sit on software RAID arrays (mdadm), and blockdev --getra is a legitimate read-ahead tuning surface for those devices. The same [ -b "$d" ] guard already handles no-match silently.

♻️ Proposed addition
-Args: []string{"-c", "for d in /dev/sd* /dev/nvme* /dev/vd* /dev/xvd*; do [ -b \"$d\" ] && ra=$(blockdev --getra \"$d\" 2>/dev/null) && [ -n \"$ra\" ] && echo \"$d: $ra\"; done; exit 0"},
+Args: []string{"-c", "for d in /dev/sd* /dev/nvme* /dev/vd* /dev/xvd* /dev/md*; do [ -b \"$d\" ] && ra=$(blockdev --getra \"$d\" 2>/dev/null) && [ -n \"$ra\" ] && echo \"$d: $ra\"; done; exit 0"},

Note: /dev/dm-* (LVM/device-mapper) read-ahead is technically valid too, but its behaviour at the DM layer is less commonly tuned and the output can be voluminous on systems with many LVs — leaving it out is a reasonable call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@system_tasks_linux.go` at line 338, The device glob used in the command
assigned to Args currently iterates /dev/sd*, /dev/nvme*, /dev/vd*, /dev/xvd*
but omits software RAID devices; update the command string (the Args value in
system_tasks_linux.go where Args: []string{"-c", "..."} is defined) to include
/dev/md* in the glob (e.g. add /dev/md* to the list) so blockdev --getra will
also query mdadm devices while keeping the existing [ -b "$d" ] guard and the
rest of the pipeline intact.
radar_test.go (2)

532-559: ⚡ Quick win

available_extensions is missing from the coverage checks despite being reworked in this PR.

The test's stated goal is to "catch accidental column removal during future edits." The available_extensions task was explicitly reworked in this PR to use pg_available_extension_versions, adding installed, trusted, relocatable, requires, and superuser — yet none of those new columns are guarded here.

✅ Suggested addition
 		{"postgres", "databases", []string{
 			"datfrozenxid", "datminmxid", "datconnlimit",
 			"datistemplate", "datallowconn",
 		}},
+		{"postgres", "available_extensions", []string{
+			"pg_available_extension_versions", "installed", "trusted",
+			"relocatable", "requires", "superuser",
+		}},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 532 - 559, The test coverage list is missing the
reworked available_extensions check; add a new entry for the
"available_extensions" task in the coverage slice (similar to other {"postgres",
"...", []string{...}} entries) and include the new columns added in this PR:
"installed", "trusted", "relocatable", "requires", and "superuser" (alongside
any existing name/version columns used by available_extensions) so the test will
catch accidental removals; locate the slice in radar_test.go where other
{"postgres", "...", []string{...}} entries are defined and append the
{"postgres", "available_extensions", []string{ "installed", "trusted",
"relocatable", "requires", "superuser", /* include existing columns like
name/version as appropriate */ }} entry.

573-578: ⚡ Quick win

Add a default case to fail clearly on unrecognized taskList values.

Without a default, a typo in taskList (e.g., "Postgres" instead of "postgres") leaves task as nil, and the failure message at line 580 says task "X" not found in Postgres tasks — suggesting the task itself is missing rather than that the list identifier is wrong.

🛡️ Proposed fix
 		switch c.taskList {
 		case "postgres":
 			task = taskByName(postgresQueryTasks, c.taskName)
 		case "perDB":
 			task = taskByName(perDatabaseQueryTasks, c.taskName)
+		default:
+			t.Errorf("unknown taskList %q for task %q", c.taskList, c.taskName)
+			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 573 - 578, The switch on c.taskList that assigns
task (using taskByName with postgresQueryTasks or perDatabaseQueryTasks) needs a
default branch to fail fast when c.taskList is unrecognized; add a default case
in that switch which calls the test failure helper (e.g., t.Fatalf or
equivalent) with a clear message stating the invalid taskList value and the
allowed options so typos (like "Postgres") surface as an invalid list rather
than a missing task; ensure the error references c.taskList and the expected
names to aid debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df0d4cf9-354b-4f46-8328-2672125843fe

📥 Commits

Reviewing files that changed from the base of the PR and between d893c6a and 8fd7d28.

📒 Files selected for processing (5)
  • DATA.md
  • docs/data.md
  • postgres_tasks.go
  • radar_test.go
  • system_tasks_linux.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • DATA.md
  • postgres_tasks.go

@vyruss changed the title from "Add diagnostic collectors and fix read_ahead" to "Addition of diagnostic collectors and bug fixes" on May 2, 2026
coderabbitai[bot]

This comment was marked as outdated.

@vyruss vyruss requested a review from bonesmoses May 2, 2026 09:07
@coderabbitai Bot left a comment

🧹 Nitpick comments (1)
radar_test.go (1)

523-562: ⚡ Quick win

Expand query-token coverage for all newly changed SQL surfaces.

TestQueryTaskColumnsCoverage is a great guardrail, but it currently misses tokens for available_extensions, running_activity_maxage, and the newly added frozenxid_age / minmxid_age aliases in databases. Adding these keeps the test aligned with this PR’s query changes.

Suggested patch
 	{"postgres", "databases", []string{
 			"datfrozenxid", "datminmxid", "datconnlimit",
-			"datistemplate", "datallowconn",
+			"datistemplate", "datallowconn",
+			"frozenxid_age", "minmxid_age",
 		}},
+		{"postgres", "available_extensions", []string{
+			"pg_available_extension_versions", "installed", "trusted", "relocatable",
+		}},
+		{"postgres", "running_activity_maxage", []string{
+			"max_query_age", "max_xact_age", "max_backend_age", "max_lock_wait_age",
+			"wait_event_type = 'Lock'",
+		}},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@radar_test.go` around lines 523 - 562, Update TestQueryTaskColumnsCoverage's
checks to include the missing SQL token names introduced in the queries: add
"available_extensions" and "running_activity_maxage" plus the new aliases
"frozenxid_age" and "minmxid_age" to the mustContain list for the
{"postgres","databases"} case in the checks slice so the test asserts those
tokens are present.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 45e0aab1-443f-4368-a6f3-90b7a5f864fe

📥 Commits

Reviewing files that changed from the base of the PR and between 53e2d77 and b27bb57.

📒 Files selected for processing (4)
  • DATA.md
  • docs/data.md
  • postgres_tasks.go
  • radar_test.go
✅ Files skipped from review due to trivial changes (2)
  • DATA.md
  • docs/data.md

@vyruss force-pushed the feat/additions_202605 branch from f5ff3c1 to 333f6a2 on May 4, 2026 15:53
@pgEdge pgEdge deleted a comment from coderabbitai Bot May 4, 2026
@bonesmoses (Member) left a comment

IMO, if you're going to include pgstattuple collection, even with pgstattuple_approx to avoid full table scans, there needs to be a way to disable or skip specific modules in a certain forensics suite. The mere presence of the extension should not be interpreted as tacit permission to potentially disk-scan every table on the server.

@coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/index.md (1)

193-219: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Output structure example omits radar.out at the ZIP root

writeRadarMeta always writes radar.out to the archive root, but the directory tree here doesn't show it. Readers consulting this section won't know to expect the file.

📝 Proposed addition
 radar-hostname-20260115-133700.zip
+├── radar.out            (binary version + commit SHA)
 ├── system/              (Linux system data)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/index.md` around lines 193 - 219, The example output tree omits the
radar.out file actually written by writeRadarMeta; update the docs' example to
show radar.out at the ZIP root and mention that writeRadarMeta always creates
this file. Locate the documentation block showing the archive tree and add a
top-level entry "radar.out" (with a short note that writeRadarMeta writes
metadata there) so readers know to expect it when they inspect archives created
by writeRadarMeta.
🧹 Nitpick comments (1)
radar_test.go (1)

553-621: ⚡ Quick win

running_activity_maxage extended column not guarded by coverage test

The PR objectives state that running_activity_maxage gains max_lock_wait_age, but TestQueryTaskColumnsCoverage has no entry for it. Future edits could silently drop the column with no test failure.

🛡️ Suggested addition
 checks := []struct {
     taskList    string
     taskName    string
     mustContain []string
 }{
+    {"postgres", "running_activity_maxage", []string{
+        "max_lock_wait_age",
+    }},
     {"postgres", "databases", []string{
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radar_test.go` around lines 553 - 621, Add a new entry to the checks slice in
TestQueryTaskColumnsCoverage to assert the query for the SimpleQueryTask named
"running_activity_maxage" contains the new token "max_lock_wait_age"; find the
checks array in TestQueryTaskColumnsCoverage and add {"<taskList>",
"running_activity_maxage", []string{"max_lock_wait_age"}} where <taskList>
matches whether that task is defined in postgresQueryTasks or
perDatabaseQueryTasks so the test locates the task by taskByName and fails if
the token is missing.
🤖 Prompt for AI Agents (inline comment)
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@radar.go`:
- Around line 361-366: The include/exclude parsing uses strings.Split on includeRaw/excludeRaw which preserves whitespace, so tokens like " stat_ssl" won't match shouldRunTask; update the parsing for cfg.Include and cfg.Exclude (where includeRaw/excludeRaw are handled) to split on ',' then trim surrounding whitespace for each token (and ignore empty tokens) before assigning to cfg.Include/cfg.Exclude, so comparisons in shouldRunTask match correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 84a5b0ef-9802-456d-ab2e-5e6356e6983a

📥 Commits

Reviewing files that changed from the base of the PR and between b27bb57 and 49c1838.

📒 Files selected for processing (6)
  • DATA.md
  • README.md
  • docs/data.md
  • docs/index.md
  • radar.go
  • radar_test.go
✅ Files skipped from review due to trivial changes (1)
  • docs/data.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • README.md
  • DATA.md

Comment thread on radar.go (marked outdated)
@vyruss (Collaborator, Author) commented May 5, 2026

> IMO, if you're going to include pgstattuple collection, even with pgstattuple_approx to avoid full table scans, there needs to be a way to disable or skip specific modules in a certain forensics suite. The mere presence of the extension should not be interpreted as tacit permission to potentially disk-scan every table on the server.

@bonesmoses Completely fair, made it disabled by default!

@vyruss force-pushed the feat/additions_202605 branch from 49c1838 to a7c6196 on May 5, 2026 12:20
@vyruss vyruss requested a review from bonesmoses May 5, 2026 12:20
@bonesmoses (Member) left a comment

I'm not exactly enthusiastic about the potential O(n-squared) resolution of the include/exclude functionality for the existing task lists in shouldRunTask, but these lists are likely to be relatively short, so there's little risk here. I'd probably just bootstrap the exclude list rather than create a separate default list, but that's a minor nitpick.

Everything else in this looks fairly straight-forward, though I might urge some caution on calling functions like pg_relation_size and pg_table_size rather than using relpages by default. I've seen particularly large clusters with many thousands of tables caused delays due to the underlying fstat calls these functions use. That's another reason I urged the opt-out; in an emergency, modules that could derail a radar snapshot can now be excluded.

@vyruss force-pushed the feat/additions_202605 branch from a7c6196 to eaba614 on May 8, 2026 09:47
@vyruss force-pushed the feat/additions_202605 branch from 2a5f80b to 2b4b4ba on May 8, 2026 10:08
@vyruss (Collaborator, Author) commented May 8, 2026

> I'm not exactly enthusiastic about the potential O(n-squared) resolution of the include/exclude functionality for the existing task lists in shouldRunTask, but these lists are likely to be relatively short, so there's little risk here. I'd probably just bootstrap the exclude list rather than create a separate default list, but that's a minor nitpick.
>
> Everything else in this looks fairly straight-forward, though I might urge some caution on calling functions like pg_relation_size and pg_table_size rather than using relpages by default. I've seen particularly large clusters with many thousands of tables caused delays due to the underlying fstat calls these functions use. That's another reason I urged the opt-out; in an emergency, modules that could derail a radar snapshot can now be excluded.

@bonesmoses Yeah you're right about shouldRunTask, computationally it's pretty insignificant, but the code WAS a bit embarrassing :) Now I'm doing a slice membership check only once at the beginning (for readability).

About table & index sizes: Although I think the -exclude caveat can help users get over it, the extremely long time to fstat 100K tables would be surprising and probably unacceptable. With relpages, the problem is that, if a table has never been ANALYZEd since creation, it will be inaccurate. However, in most other cases it should be pretty close to the truth. So I've fixed it by: if > 1000 tables, sort by relpages and use that instead of fstat. If the table hasn't been analyzed, then do fstat. We are enforcing a hard 1000 table limit in the output anyway. Thanks!

@vyruss merged commit d7f21f5 into main on May 8, 2026
10 checks passed