Addition of diagnostic collectors and bug fixes#6
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughDocs, CLI, and collection logic updated: archive now includes a ChangesPostgreSQL Data Collection Expansion
System Tasks & Hostname Reporting
Task Selection, Archive Metadata & CLI
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
| Category | Results |
|---|---|
| Complexity | 1 medium |
🟢 Metrics 0 duplication
Metric Results Duplication 0
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
radar_test.go (1)
532-559: ⚡ Quick winExpand coverage checks for changed
available_extensionsand age fields.This test is a great guardrail; add assertions for the
available_extensionssource token and the new computed database age aliases so future query edits don’t silently regress those additions.🔧 Suggested test additions
checks := []struct { taskList string taskName string mustContain []string }{ + {"postgres", "available_extensions", []string{ + "pg_available_extension_versions", "installed", "trusted", "requires", + }}, {"postgres", "databases", []string{ "datfrozenxid", "datminmxid", "datconnlimit", - "datistemplate", "datallowconn", + "datistemplate", "datallowconn", "frozenxid_age", "minmxid_age", }},🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@radar_test.go` around lines 532 - 559, Update the test expectations so the databases coverage includes the new available_extensions source token and the computed database age aliases: add "available_extensions" to the {"postgres","databases",...} expected fields and add the age-related alias names used by the query (the computed database age column(s) labeled in the query, e.g., "age" and any alias variant) to the appropriate databases/perDB expectations (look for the {"postgres","databases",...} and {"perDB","tables"/"indexes"/...} entries and append the new field names).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@radar_test.go`:
- Around line 532-559: Update the test expectations so the databases coverage
includes the new available_extensions source token and the computed database age
aliases: add "available_extensions" to the {"postgres","databases",...} expected
fields and add the age-related alias names used by the query (the computed
database age column(s) labeled in the query, e.g., "age" and any alias variant)
to the appropriate databases/perDB expectations (look for the
{"postgres","databases",...} and {"perDB","tables"/"indexes"/...} entries and
append the new field names).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d9e9d5c8-0d3c-4b73-a361-42f1a8dac090
📒 Files selected for processing (7)
DATA.mdREADME.mddocs/data.mddocs/index.mdpostgres_tasks.gopostgres_test.goradar_test.go
There was a problem hiding this comment.
🧹 Nitpick comments (3)
system_tasks_linux.go (1)
338-338: ⚡ Quick winConsider adding
/dev/md*(software RAID) to the device glob list.Enterprise PostgreSQL deployments frequently sit on software RAID arrays (
mdadm), andblockdev --getrais a legitimate read-ahead tuning surface for those devices. The same[ -b "$d" ]guard already handles no-match silently.♻️ Proposed addition
-Args: []string{"-c", "for d in /dev/sd* /dev/nvme* /dev/vd* /dev/xvd*; do [ -b \"$d\" ] && ra=$(blockdev --getra \"$d\" 2>/dev/null) && [ -n \"$ra\" ] && echo \"$d: $ra\"; done; exit 0"}, +Args: []string{"-c", "for d in /dev/sd* /dev/nvme* /dev/vd* /dev/xvd* /dev/md*; do [ -b \"$d\" ] && ra=$(blockdev --getra \"$d\" 2>/dev/null) && [ -n \"$ra\" ] && echo \"$d: $ra\"; done; exit 0"},Note:
/dev/dm-*(LVM/device-mapper) read-ahead is technically valid too, but its behaviour at the DM layer is less commonly tuned and the output can be voluminous on systems with many LVs — leaving it out is a reasonable call.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@system_tasks_linux.go` at line 338, The device glob used in the command assigned to Args currently iterates /dev/sd*, /dev/nvme*, /dev/vd*, /dev/xvd* but omits software RAID devices; update the command string (the Args value in system_tasks_linux.go where Args: []string{"-c", "..."} is defined) to include /dev/md* in the glob (e.g. add /dev/md* to the list) so blockdev --getra will also query mdadm devices while keeping the existing [ -b "$d" ] guard and the rest of the pipeline intact.radar_test.go (2)
532-559: ⚡ Quick win
available_extensionsis missing from the coverage checks despite being reworked in this PR.The test's stated goal is to "catch accidental column removal during future edits." The
available_extensionstask was explicitly reworked in this PR to usepg_available_extension_versions, addinginstalled,trusted,relocatable,requires, andsuperuser— yet none of those new columns are guarded here.✅ Suggested addition
{"postgres", "databases", []string{ "datfrozenxid", "datminmxid", "datconnlimit", "datistemplate", "datallowconn", }}, + {"postgres", "available_extensions", []string{ + "pg_available_extension_versions", "installed", "trusted", + "relocatable", "requires", "superuser", + }},🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@radar_test.go` around lines 532 - 559, The test coverage list is missing the reworked available_extensions check; add a new entry for the "available_extensions" task in the coverage slice (similar to other {"postgres", "...", []string{...}} entries) and include the new columns added in this PR: "installed", "trusted", "relocatable", "requires", and "superuser" (alongside any existing name/version columns used by available_extensions) so the test will catch accidental removals; locate the slice in radar_test.go where other {"postgres", "...", []string{...}} entries are defined and append the {"postgres", "available_extensions", []string{ "installed", "trusted", "relocatable", "requires", "superuser", /* include existing columns like name/version as appropriate */ }} entry.
573-578: ⚡ Quick winAdd a
defaultcase to fail clearly on unrecognizedtaskListvalues.Without a
default, a typo intaskList(e.g.,"Postgres"instead of"postgres") leavestaskasnil, and the failure message at line 580 saystask "X" not found in Postgres tasks— suggesting the task itself is missing rather than that the list identifier is wrong.🛡️ Proposed fix
switch c.taskList { case "postgres": task = taskByName(postgresQueryTasks, c.taskName) case "perDB": task = taskByName(perDatabaseQueryTasks, c.taskName) + default: + t.Errorf("unknown taskList %q for task %q", c.taskList, c.taskName) + continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@radar_test.go` around lines 573 - 578, The switch on c.taskList that assigns task (using taskByName with postgresQueryTasks or perDatabaseQueryTasks) needs a default branch to fail fast when c.taskList is unrecognized; add a default case in that switch which calls the test failure helper (e.g., t.Fatalf or equivalent) with a clear message stating the invalid taskList value and the allowed options so typos (like "Postgres") surface as an invalid list rather than a missing task; ensure the error references c.taskList and the expected names to aid debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@radar_test.go`:
- Around line 532-559: The test coverage list is missing the reworked
available_extensions check; add a new entry for the "available_extensions" task
in the coverage slice (similar to other {"postgres", "...", []string{...}}
entries) and include the new columns added in this PR: "installed", "trusted",
"relocatable", "requires", and "superuser" (alongside any existing name/version
columns used by available_extensions) so the test will catch accidental
removals; locate the slice in radar_test.go where other {"postgres", "...",
[]string{...}} entries are defined and append the {"postgres",
"available_extensions", []string{ "installed", "trusted", "relocatable",
"requires", "superuser", /* include existing columns like name/version as
appropriate */ }} entry.
- Around line 573-578: The switch on c.taskList that assigns task (using
taskByName with postgresQueryTasks or perDatabaseQueryTasks) needs a default
branch to fail fast when c.taskList is unrecognized; add a default case in that
switch which calls the test failure helper (e.g., t.Fatalf or equivalent) with a
clear message stating the invalid taskList value and the allowed options so
typos (like "Postgres") surface as an invalid list rather than a missing task;
ensure the error references c.taskList and the expected names to aid debugging.
In `@system_tasks_linux.go`:
- Line 338: The device glob used in the command assigned to Args currently
iterates /dev/sd*, /dev/nvme*, /dev/vd*, /dev/xvd* but omits software RAID
devices; update the command string (the Args value in system_tasks_linux.go
where Args: []string{"-c", "..."} is defined) to include /dev/md* in the glob
(e.g. add /dev/md* to the list) so blockdev --getra will also query mdadm
devices while keeping the existing [ -b "$d" ] guard and the rest of the
pipeline intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: df0d4cf9-354b-4f46-8328-2672125843fe
📒 Files selected for processing (5)
DATA.mddocs/data.mdpostgres_tasks.goradar_test.gosystem_tasks_linux.go
🚧 Files skipped from review as they are similar to previous changes (2)
- DATA.md
- postgres_tasks.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
radar_test.go (1)
523-562: ⚡ Quick winExpand query-token coverage for all newly changed SQL surfaces.
TestQueryTaskColumnsCoverageis a great guardrail, but it currently misses tokens foravailable_extensions,running_activity_maxage, and the newly addedfrozenxid_age/minmxid_agealiases indatabases. Adding these keeps the test aligned with this PR’s query changes.Suggested patch
{"postgres", "databases", []string{ "datfrozenxid", "datminmxid", "datconnlimit", - "datistemplate", "datallowconn", + "datistemplate", "datallowconn", + "frozenxid_age", "minmxid_age", }}, + {"postgres", "available_extensions", []string{ + "pg_available_extension_versions", "installed", "trusted", "relocatable", + }}, + {"postgres", "running_activity_maxage", []string{ + "max_query_age", "max_xact_age", "max_backend_age", "max_lock_wait_age", + "wait_event_type = 'Lock'", + }},🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@radar_test.go` around lines 523 - 562, Update TestQueryTaskColumnsCoverage's checks to include the missing SQL token names introduced in the queries: add "available_extensions" and "running_activity_maxage" plus the new aliases "frozenxid_age" and "minmxid_age" to the mustContain list for the {"postgres","databases"} case in the checks slice so the test asserts those tokens are present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@radar_test.go`:
- Around line 523-562: Update TestQueryTaskColumnsCoverage's checks to include
the missing SQL token names introduced in the queries: add
"available_extensions" and "running_activity_maxage" plus the new aliases
"frozenxid_age" and "minmxid_age" to the mustContain list for the
{"postgres","databases"} case in the checks slice so the test asserts those
tokens are present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 45e0aab1-443f-4368-a6f3-90b7a5f864fe
📒 Files selected for processing (4)
DATA.mddocs/data.mdpostgres_tasks.goradar_test.go
✅ Files skipped from review due to trivial changes (2)
- DATA.md
- docs/data.md
f5ff3c1 to
333f6a2
Compare
bonesmoses
left a comment
There was a problem hiding this comment.
IMO, if you're going to include pgstattuple collection, even with pgstattuple_approx to avoid full table scans, there needs to be a way to disable or skip specific modules in a certain forensics suite. The mere presence of the extension should not be interpreted as tacit permission to potentially disk-scan every table on the server.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/index.md (1)
193-219:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winOutput structure example omits
radar.outat the ZIP root
writeRadarMetaalways writesradar.outto the archive root, but the directory tree here doesn't show it. Readers consulting this section won't know to expect the file.📝 Proposed addition
radar-hostname-20260115-133700.zip +├── radar.out (binary version + commit SHA) ├── system/ (Linux system data)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/index.md` around lines 193 - 219, The example output tree omits the radar.out file actually written by writeRadarMeta; update the docs' example to show radar.out at the ZIP root and mention that writeRadarMeta always creates this file. Locate the documentation block showing the archive tree and add a top-level entry "radar.out" (with a short note that writeRadarMeta writes metadata there) so readers know to expect it when they inspect archives created by writeRadarMeta.
🧹 Nitpick comments (1)
radar_test.go (1)
553-621: ⚡ Quick win
running_activity_maxageextended column not guarded by coverage testThe PR objectives state that
running_activity_maxagegainsmax_lock_wait_age, butTestQueryTaskColumnsCoveragehas no entry for it. Future edits could silently drop the column with no test failure.🛡️ Suggested addition
checks := []struct { taskList string taskName string mustContain []string }{ + {"postgres", "running_activity_maxage", []string{ + "max_lock_wait_age", + }}, {"postgres", "databases", []string{🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@radar_test.go` around lines 553 - 621, Add a new entry to the checks slice in TestQueryTaskColumnsCoverage to assert the query for the SimpleQueryTask named "running_activity_maxage" contains the new token "max_lock_wait_age"; find the checks array in TestQueryTaskColumnsCoverage and add {"<taskList>", "running_activity_maxage", []string{"max_lock_wait_age"}} where <taskList> matches whether that task is defined in postgresQueryTasks or perDatabaseQueryTasks so the test locates the task by taskByName and fails if the token is missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@radar.go`:
- Around line 361-366: The include/exclude parsing uses strings.Split on
includeRaw/excludeRaw which preserves whitespace, so tokens like " stat_ssl"
won't match shouldRunTask; update the parsing for cfg.Include and cfg.Exclude
(where includeRaw/excludeRaw are handled) to split on ',' then trim surrounding
whitespace for each token (and ignore empty tokens) before assigning to
cfg.Include/cfg.Exclude, so comparisons in shouldRunTask match correctly.
---
Outside diff comments:
In `@docs/index.md`:
- Around line 193-219: The example output tree omits the radar.out file actually
written by writeRadarMeta; update the docs' example to show radar.out at the ZIP
root and mention that writeRadarMeta always creates this file. Locate the
documentation block showing the archive tree and add a top-level entry
"radar.out" (with a short note that writeRadarMeta writes metadata there) so
readers know to expect it when they inspect archives created by writeRadarMeta.
---
Nitpick comments:
In `@radar_test.go`:
- Around line 553-621: Add a new entry to the checks slice in
TestQueryTaskColumnsCoverage to assert the query for the SimpleQueryTask named
"running_activity_maxage" contains the new token "max_lock_wait_age"; find the
checks array in TestQueryTaskColumnsCoverage and add {"<taskList>",
"running_activity_maxage", []string{"max_lock_wait_age"}} where <taskList>
matches whether that task is defined in postgresQueryTasks or
perDatabaseQueryTasks so the test locates the task by taskByName and fails if
the token is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 84a5b0ef-9802-456d-ab2e-5e6356e6983a
📒 Files selected for processing (6)
DATA.mdREADME.mddocs/data.mddocs/index.mdradar.goradar_test.go
✅ Files skipped from review due to trivial changes (1)
- docs/data.md
🚧 Files skipped from review as they are similar to previous changes (2)
- README.md
- DATA.md
@bonesmoses Completely fair, made it disabled by default! |
49c1838 to
a7c6196
Compare
bonesmoses
left a comment
There was a problem hiding this comment.
I'm not exactly enthusiastic about the potential O(n-squared) resolution of the include/exclude functionality for the existing task lists in shouldRunTask, but these lists are likely to be relatively short, so there's little risk here. I'd probably just bootstrap the exclude list rather than create a separate default list, but that's a minor nitpick.
Everything else in this looks fairly straight-forward, though I might urge some caution on calling functions like pg_relation_size and pg_table_size rather than using relpages by default. I've seen particularly large clusters with many thousands of tables caused delays due to the underlying fstat calls these functions use. That's another reason I urged the opt-out; in an emergency, modules that could derail a radar snapshot can now be excluded.
a7c6196 to
eaba614
Compare
2a5f80b to
2b4b4ba
Compare
@bonesmoses Yeah you're right about About table & index sizes: Although I think the |
Summary
Adds 5 new diagnostic collectors and extends 5 existing collectors with columns supporting deeper analysis: bloat, vacuum health, wraparound headroom, index validity/usage, sequence headroom, SSL audit, replication slot spill, extension version metadata. Also fixes the
read_aheadsystem collector and splits thehostnamecollector to disambiguate kernel hostname from resolver-canonical FQDN.New collectors:
stat_ssl(postgresql/) - per-backend SSL/TLS state frompg_stat_sslstat_replication_slots(postgresql/) - logical replication slot spill counters frompg_stat_replication_slots(PG14+)sequences(databases/{db}/) - sequences with last/min/max values frompg_sequencesbloat(databases/{db}/) - heuristic table bloat estimate frompg_statspgstattuple(databases/{db}/) - authoritative bloat viapgstattuple_approx()when the extension is installed; disabled by default, run with-include pgstattupleExtended collectors:
databases-datfrozenxid,datminmxid,datconnlimit,datistemplate,datallowconnrunning_activity_maxage- addsmax_lock_wait_age(longest current lock wait, NULL when nobody is waiting on a lock)tables- rebuilt onpg_class+pg_stat_all_tablesfor vacuum/analyze timestamps and ages in seconds, dead-tup counters,reloptions,relpersistence,relpages, heap size, table size,reltoastrelid/toast size; capped at top 1000 by sizeindexes- rebuilt onpg_index+pg_stat_all_indexesfor semantic-key columns,indisvalid,idx_scan/idx_tup_read/idx_tup_fetch,relpages, index size; capped at top 1000 by sizeavailable_extensions- now sourced frompg_available_extension_versionsfor per-versioninstalled/trusted/superuser/relocatable/requiresmetadataBug fixes:
read_aheadsystem collector now handles virtio (/dev/vd*) and xen (/dev/xvd*) block devices and gracefully skips unmatched globs; output is now labeled per device.hostnamesystem collector split:system/hostname.outnow contains plainhostname(kernel hostname /gethostname()), and a newsystem/hostname_fqdn.outcontainshostname -f(resolver canonical name).Collection control:
-exclude name1,name2flag skips named tasks (works for both instance-level and per-database tasks).-include name1,name2flag enables default-disabled tasks. Currently the only default-disabled task ispgstattuple, becausepgstattuple_approx()reads heap pages of every user table and shouldn't auto-fire just because the extension is present.Archive metadata:
radar.out(archive root) - identifies the radar binary that produced the archive:version(from the build-time-X main.versionstamp) andcommit(short git SHA).Checklist
TestQueryTaskColumnsCoverageasserts new column tokens andLIMIT 1000stay in the queries;TestPostgreSQLCollectorscoversstat_sslandstat_replication_slots)README.md,docs/index.md,DATA.md,docs/data.md)pgstattuplecorrectly skipped when extension absent;stat_replication_slotsPG14+ guarded by existing42P01SkipError path)