feat(igc): add on-demand netdump observability #685
ytakano wants to merge 3 commits into tier4:main
Pull request overview
Adds an on-demand network device “netdump” observability path by introducing a NetDevice::debug_dump() hook, exposing it via awkernel_lib::net::debug_dump_interface(), and wiring it into the Awkernel shell as (netdump interface_id).
Changes:
- Add a default `debug_dump()` method to the `NetDevice` trait.
- Add `awkernel_lib::net::debug_dump_interface(interface_id)` to trigger a device dump for a specific interface.
- Add a `netdump` shell command and FFI plumbing (with new bigint conversion deps) to invoke the interface dump.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| awkernel_lib/src/net/net_device.rs | Introduces a default NetDevice::debug_dump() hook. |
| awkernel_lib/src/net.rs | Adds debug_dump_interface() entry point in the net manager layer. |
| awkernel_drivers/src/pcie/intel/igc.rs | Wires Igc’s NetDevice::debug_dump() to existing inner dump logic. |
| applications/awkernel_shell/src/lib.rs | Adds (netdump interface_id) BLisp export and embedded FFI handler. |
| applications/awkernel_shell/Cargo.toml | Adds num-bigint / num-traits dependencies for the new FFI argument type. |
Clone the selected network device while holding the net manager read lock, then release the lock before invoking the debug dump path. This keeps potentially slow diagnostic dumping outside the shared manager lock and matches the existing interface operation pattern.
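The locking pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration using `std::sync::RwLock`, not the Awkernel implementation: `NetManager`, its `interfaces` map, `CountingDevice`, and the explicit `manager` parameter are hypothetical stand-ins, and only the names `NetDevice::debug_dump`, `debug_dump_interface`, and `NetManagerError::InvalidInterfaceID` come from the PR.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the NetDevice trait; only the dump hook is modeled.
pub trait NetDevice: Send + Sync {
    fn debug_dump(&self) {}
}

#[derive(Debug, PartialEq, Eq)]
pub enum NetManagerError {
    InvalidInterfaceID,
}

/// Hypothetical manager: maps interface IDs to shared device handles.
pub struct NetManager {
    pub interfaces: BTreeMap<u64, Arc<dyn NetDevice>>,
}

/// Hypothetical device that counts how often it was asked to dump.
pub struct CountingDevice {
    pub dumps: AtomicUsize,
}

impl NetDevice for CountingDevice {
    fn debug_dump(&self) {
        self.dumps.fetch_add(1, Ordering::SeqCst);
    }
}

pub fn debug_dump_interface(
    manager: &RwLock<NetManager>,
    interface_id: u64,
) -> Result<(), NetManagerError> {
    // Hold the read lock only long enough to look up and clone the handle.
    let guard = manager.read().unwrap();
    let device = guard.interfaces.get(&interface_id).cloned();
    drop(guard);

    // The potentially slow dump runs with the manager lock already released.
    let device = device.ok_or(NetManagerError::InvalidInterfaceID)?;
    device.debug_dump();
    Ok(())
}
```

Cloning an `Arc` is cheap, so the manager lock is held for a map lookup rather than for the duration of the dump.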
    }

    fn debug_dump(&self) {
        self.inner.read().dump();
debug_dump() holds self.inner.read() for the entire dump, which performs many MMIO register reads, an O(N²) format!("{msg}...") build, and a console-locked log write. Concurrent inner.write() callers (up, down, add_multicast_addr, the LSC/poll-link path in intr) block for that whole time, and on writer-preferring RwLocks subsequent read() callers in the datapath (tick/recv/send/can_send) also stall — which contradicts the PR description's claim of not affecting interrupt/queue behavior.
Consider capturing the small subset of state needed under the lock and releasing it before the format/log work, or exposing the fields actually used (info, hw.mac.addr) for lock-free access so the dump matches the read-only intent.
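One way to read this suggestion is a "snapshot, then format" split. The sketch below is only an illustration under assumptions: `Inner`, `DumpSnapshot`, and their fields are hypothetical and do not reflect the real IGC driver state, and `std::sync::RwLock` stands in for the kernel's lock. It copies a small plain-data snapshot under the read lock, then builds the text afterwards by appending to one `String` instead of rebuilding it with `format!` on every line.

```rust
use std::fmt::Write as _;
use std::sync::RwLock;

/// Hypothetical stand-in for the driver's inner state.
pub struct Inner {
    pub mac_addr: [u8; 6],
    pub link_up: bool,
    pub rx_head: u32,
    pub rx_tail: u32,
}

/// Hypothetical plain-data copy of the values a dump needs.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DumpSnapshot {
    pub mac_addr: [u8; 6],
    pub link_up: bool,
    pub rx_head: u32,
    pub rx_tail: u32,
}

impl Inner {
    pub fn snapshot(&self) -> DumpSnapshot {
        DumpSnapshot {
            mac_addr: self.mac_addr,
            link_up: self.link_up,
            rx_head: self.rx_head,
            rx_tail: self.rx_tail,
        }
    }
}

/// Format a snapshot; no lock is needed here.
pub fn render(snap: &DumpSnapshot) -> String {
    let mut out = String::new();
    let m = snap.mac_addr;
    // writeln! appends in place, so earlier lines are not copied again.
    let _ = writeln!(
        out,
        "mac={:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}",
        m[0], m[1], m[2], m[3], m[4], m[5]
    );
    let _ = writeln!(out, "link_up={}", snap.link_up);
    let _ = writeln!(out, "rx head={} tail={}", snap.rx_head, snap.rx_tail);
    out
}

pub fn debug_dump(inner: &RwLock<Inner>) -> String {
    // The read guard is a temporary that is dropped at the end of this statement.
    let snap = inner.read().unwrap().snapshot();
    render(&snap)
}
```

In a real driver the register reads themselves may still need the device handle, so how much can move outside the lock depends on what the dump actually touches.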
    }

    /// Dump device-specific debug state on demand.
    fn debug_dump(&self) {}
Defaulting to an empty body means (netdump <id>) silently returns Ok(()) for every NetDevice that hasn't overridden this — VirtioNet, Genet, Igb, Ixgbe, E1000eExample. The shell user sees no output and no indication the operation was unsupported, which is misleading observability behavior.
Consider returning Result<(), NetDevError> with a default of Err(NetDevError::Unsupported) so debug_dump_interface can surface it, or have the default emit a clear "debug_dump not implemented for " log line via device_short_name().
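The first option in this comment could look roughly like the sketch below. It is an illustration under assumptions: the trait is reduced to two methods, `PlainDevice`, `DumpingDevice`, and the `netdump` helper are hypothetical, and the `device_short_name()` signature shown here is a guess rather than the one used in Awkernel.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum NetDevError {
    Unsupported,
}

/// Reduced, hypothetical version of the NetDevice trait.
pub trait NetDevice {
    /// Assumed signature; the real method may return a different type.
    fn device_short_name(&self) -> String;

    /// Dump device-specific debug state on demand.
    /// Drivers that do not override this report `Unsupported`.
    fn debug_dump(&self) -> Result<(), NetDevError> {
        Err(NetDevError::Unsupported)
    }
}

/// Hypothetical driver that keeps the default.
pub struct PlainDevice;

impl NetDevice for PlainDevice {
    fn device_short_name(&self) -> String {
        "plain".to_string()
    }
}

/// Hypothetical driver that implements the dump.
pub struct DumpingDevice;

impl NetDevice for DumpingDevice {
    fn device_short_name(&self) -> String {
        "dumping".to_string()
    }

    fn debug_dump(&self) -> Result<(), NetDevError> {
        // A real driver would emit its register and ring state here.
        Ok(())
    }
}

/// Hypothetical shell-side helper that surfaces the outcome to the user.
pub fn netdump(dev: &dyn NetDevice) -> String {
    match dev.debug_dump() {
        Ok(()) => format!("{}: dump emitted", dev.device_short_name()),
        Err(NetDevError::Unsupported) => {
            format!("{}: debug_dump not supported", dev.device_short_name())
        }
    }
}
```

With this shape the caller can tell "nothing to report" apart from "this driver has no dump", which an empty default body cannot express.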
Print a warning when debug_dump is not supported, as follows.

    log::warn!("debug_dump not implemented for this device");
        Ok(if_status)
    }

    pub fn debug_dump_interface(interface_id: u64) -> Result<(), NetManagerError> {
This new public API has no doc comment, while other pub fns in this module (get_interface, up, down, tick_interface, ...) document behavior and error conditions. A caller cannot tell where the dump is emitted (the IGC implementation uses log::debug!, but that is not part of the contract), what error variants can be returned, or whether it is safe to invoke during active TX/RX.
Please add a /// comment specifying the output channel, the returned errors (today only InvalidInterfaceID), and any timing constraints.
Add a doc comment as follows.
/// Emit debug state for the interface identified by `interface_id` via `log::debug!`.
///
/// Returns [`NetManagerError::InvalidInterfaceID`] if no interface with that ID exists.
/// The NET_MANAGER read lock is held only to look up and clone the device reference;
/// the device dump runs outside that lock.
Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>
Description
This PR adds on-demand IGC observability without changing datapath behavior.
It introduces a default NetDevice::debug_dump() hook, adds awkernel_lib::net::debug_dump_interface(), and wires the shell netdump(interface_id) command to the IGC driver’s existing register and ring dump logic.
Related links
How was this PR tested?
Notes for reviewers
This PR is intentionally limited to read-only observability. It does not change queueing, DMA, interrupt handling, or RX buffering behavior.
The shell addition is only netdump(interface_id). Write-path helpers such as add_ipv4, arping4, and set_gateway4 are not included here.