feat(vm): cocoon vm net --nics N — runtime NIC resize#42
Merged
Conversation
Add takes variadic AddSpec (per-NIC index + optional Existing for recovery). Remove tears down individual NIC plumbing by index, preserves netns. Extracted cniDel helper shared by Add (rollback+recovery), Remove, delNICs.
CNI: tearDownNICs(records, deleteTAP, bestEffort) → ids; both Delete and Remove go through it. deleteRecords sweeps DB by id in one Update. delNICs stays for gc.Collect (different lock context). Bridge: tearDownTAPs(indices, bestEffort) shared by Remove and CleanupTAPs.
NetResize dispatches to per-NIC add or LIFO remove. Add path: plumbing.Add → vm.add-net with deterministic cocoon-net-<mac> id → DB append; rolls back the current NIC's host plumbing if vm.add-net fails. Remove path: build MAC→id map from vm.info → vm.remove-device → optional plumbing.Remove (skipped via --keep-host-on-remove) → DB truncate. Persists per NIC so a mid-op failure leaves the record matching CH state.
CLI for resizing a running VM's NIC count. Wires the resolved Resizer (currently CH only) with the matching host plumbing. Refactors resolveAttacher to also return conf + hypervisor.Hypervisor so NetResize can call Inspect to discover the active backend without re-resolving. Hoists "bridge" / "cni" backend literals to types.BackendBridge / types.BackendCNI so cmd, network/cni, and network/bridge share one identifier.
After the remove-device loop, re-fetch vm.info and warn for any old ID still present in info.Config.Nets. The warning lets us trace bad clone states without failing the flow (the eject pending state is a CH-side condition the user accepted). Also gitignore /scratch/ for local test scripts.
Signal isn't useful in the paused-clone window: vm.info reflects device_tree which only updates when guest writes B0EJ via SCI; during pause B0EJ never fires, so the check would always warn. The genuine remove-failure case (unknown id) already errors out of removeDeviceVM.
Flag had no proper lifecycle: host TAP/veth/CNI would leak, then a later resize-up at the same index would collide. Closing it cleanly needs orphan tracking (pending-cleanup state on the VM record). Out of scope for v1 — drop the flag entirely and always tear down host plumbing after vm.remove-device. Bring it back later with full pending-state tracking if a real user needs it.
There was a problem hiding this comment.
Pull request overview
Adds runtime NIC hot-resize support for running Cloud Hypervisor VMs via a new cocoon vm net --nics N VM command, and refactors the network provider interface to support per-NIC add/remove operations needed by hot-plug workflows.
Changes:
- Introduces
vm netCLI +extend/netresizeinterface and implements CH-side resize logic (vm.add-net/vm.remove-device) with incremental DB persistence. - Refactors
network.Networkfrom bulkConfig(...)to per-NICAdd(...)/Remove(...), updating CNI and bridge backends accordingly. - Documents operational limitations/risks (pending eject in CH; guest must quiesce NICs) in README + KNOWN_ISSUES.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| types/network.go | Adds typed constants for persisted network backend identifiers. |
| README.md | Documents the new vm net command and its operational caveats. |
| network/network.go | Refactors network provider interface to per-NIC Add/Remove and adds helpers for ranges/recovery. |
| network/cni/create.go | Reworks CNI setup into Add(specs...) and adds Remove(indices...). |
| network/cni/create_linux.go | Adds helper to delete TAPs inside a netns (for per-NIC removal). |
| network/cni/create_darwin.go | Adds darwin stub for TAP deletion helper. |
| network/cni/cni.go | Refactors Delete flow to share teardown helper and adds record deletion helper. |
| network/bridge/bridge_other.go | Updates non-linux bridge implementation to new interface (unsupported stubs). |
| network/bridge/bridge_linux.go | Implements bridge Add/Remove and refactors cleanup helpers. |
| KNOWN_ISSUES.md | Documents CH pending-eject behavior and implications during NIC hot-remove. |
| hypervisor/cloudhypervisor/netresize.go | Implements CH runtime NIC resize orchestration + persistence. |
| hypervisor/cloudhypervisor/extend.go | Registers CH as implementing the new netresize.Resizer interface. |
| hypervisor/cloudhypervisor/args.go | Adds deterministic CH device ID helper for cocoon-managed NICs. |
| hypervisor/cloudhypervisor/api.go | Extends vm.info config struct to include net devices. |
| extend/netresize/netresize.go | Introduces netresize API (Spec/Result/Plumbing/Resizer). |
| cmd/vm/run.go | Updates VM creation networking path to use Network.Add helpers. |
| cmd/vm/netresize.go | Implements cocoon vm net command handler and provider selection. |
| cmd/vm/lifecycle.go | Updates network recovery and provider selection to new interface/constants. |
| cmd/vm/commands.go | Wires new vm net subcommand and --nics flag. |
| cmd/vm/attach.go | Refactors shared resolveAttacher to also return config + hypervisor for vm net. |
| .gitignore | Ignores local /scratch/ directory. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…vider lookup - runningVMClientWithRecord exposes (hc, vmID, rec) so NetResize avoids the duplicate ResolveRef+LoadRecord that inspectRunning already triggered. - plumbingForVM delegates to providerForVM (now lazy-inits CNI when nil), eliminating the duplicated backend switch / BridgeDev check. - hotSwapNets sets the deterministic cocoonNetID(MAC) so clone-added NICs match netresize's id convention (both still resolved by MAC, but consistent in vm.info output). - Trim netresize package comment; the operational caveats already live in README + KNOWN_ISSUES.
- netResizeAdd: if appendNetworkConfig fails after addNetVM succeeds, roll back vm.remove-device + host plumbing. Without this, the live NIC with cocoon-net-<mac> id stays in CH while cocoon's record drops, and the next resize collides on the deterministic id. - netResizeRemove: once removeDeviceVM succeeds the CH eject is irrevocable, so truncate the cocoon record even when plumbing.Remove fails (leaked TAP/CNI is gc'd later). Skipping truncate left the next resize unable to resolve the stale NIC's MAC against vm.info.
- cni.tearDownNICs: when cniConf is nil and bestEffort, return all record ids so deleteVM still sweeps the DB instead of stranding them. In the non-bestEffort path (Remove), CNI DEL / TAP delete failures now propagate — silently logging would leak IPAM and report a misleading success. - bridge.Add: track successfully-created TAPs and tear them down on any later failure. Reachable from boot (AddRange(0, nics)) where a partial failure previously leaked the earlier-created TAPs on the bridge.
"gc will reclaim" is misleading — running-VM mid-state leaks aren't in the GC orphan scan; only stop+restart sweeps them. Adjust the warn message accordingly.
- CNI Add: rollback defer now deletes TAPs left behind by setupTCRedirect when the netns was pre-existing (createdNetns=false). Without this, a partial failure inside the persistent netns leaked tap<prefix>-<idx>, and the next retry hit "file exists" in createTAP. - NetResize: defer getVMInfo to the remove branch only; the add path doesn't need vm.info, so an add resize no longer depends on it.
Result gains Warnings []string. netResizeRemove appends a leak message when plumbing.Remove fails after CH accepted the eject; the CLI logs each warning and JSON output carries them. Without this, host plumbing leaks were only visible in the daemon log.
After CNI.Remove was introduced for hot-resize, a VM resized down to 0 NICs ends up with empty DB records but its netns intact (Remove preserves the netns by design). vm rm then short-circuited on len(records)==0 and left /var/run/netns/cocoon-<vmID> behind until gc ran. Drop the early return so deleteNetns runs unconditionally (idempotent: ignores ENOENT).
vm stop/start does not run network teardown; only cocoon vm rm (via Network.Delete) or gc reclaim the host plumbing. Replace the misleading 'stop+restart will reclaim' hint.
tearDownNICs now attempts every cleanup step per record (CNI DEL + TAP delete) instead of bailing on the first failure, so the leaked TAP no longer blocks a later resize-up at the same index. CNI.Remove sweeps DB records for every picked index regardless of teardown outcome, matching netResizeRemove's "CH eject is irrevocable, truncate even if plumbing leaks" invariant. The teardown error still propagates so the caller can surface the IPAM leak via Result.Warnings.
errors.Join surfaces a deleteRecords failure even when tearDownNICs also failed — the two indicate different problems (host teardown vs DB divergence) and shouldn't mask each other.
deleteVM previously relied on tearDownNICs' returned ids, so a record whose persisted conflist name no longer resolves was logged-and-skipped inside tearDownNICs and never deleted from the DB. Sweep all records of this VM after teardown instead. Drops the unused []string return from tearDownNICs since CNI.Remove also pre-collects pickedIDs.
Per project rule "never dereference pointers without nil check". Both call sites do nc.MAC; a nil entry in rec.NetworkConfigs would panic. Surface a clear error instead.
After vm.remove-device CH only flags pending eject; the device stays in device_tree until the guest writes B0EJ via ACPI/SCI. Pause/snapshot iterate device_tree and error out with "receiving on a closed channel" (VMM-thread response channel collapse) when called too soon. Poll vm.info's device_tree for absence of the chID up to 15 s before teardown. Linux acks in < 1 s; Windows can take a few seconds. The wait is scoped to netresize remove — clone's hotSwapNets runs paused (guest can't ACK) and we accept the existing CH limitation there.
…0s timeout Two findings from final closure review: 1. netResizeAdd's persist-failure rollback called removeDeviceVM then plumbing.Remove with no wait between. Symmetric to the resize-down race that a3d41cc fixed: a fast plumbing.Remove can yank the TAP while CH's virtio-net ioeventfd is still attached. Mirror the same waitDeviceEjected bracket so the rollback can't recreate the channel-closed condition. 2. 5 s ejectWaitTimeout was tight for Windows. Bump to 10 s — Linux still acks in < 1 s; Windows NDIS can take a few.
Collapsed 8 multi-liners I added (resolveAttacher, plumbingForVM, providerForVM, Result, addCocoonNIC, ejectWaitTimeout, the symmetric- rollback inline, tearDownNICs, Remove) into single lines. Pre-existing multi-line comments left alone.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
cocoon vm net --nics N <vm>resizes the NIC count on a running VM (CH only). To add, cocoon allocates fresh TAP/CNI/bridge plumbing and callsvm.add-net; to remove, it pops NICs from the tail (LIFO) viavm.remove-deviceand tears down host plumbing.network.Networkinterface refactor:Config(numNICs, existing...)→ variadicAdd(specs ...AddSpec)/Remove(indices ...int)for symmetric NIC-level ops;DeleteandRemoveshare atearDownhelper.Design decisions
vm.remove-devicereturns — CH's HTTP API has no reliable eject signal (the slot/ioeventfd cleanup only happens when the guest writes B0EJ via ACPI/SCI). User accepts in-guest driver hang risk and is responsible for quiescing the NIC before reducing the count.netresize.ErrUnsupportedBackend.cocoon-net-<mac-no-colons>) — not persisted intypes.NetworkConfig; remove path looks up live IDs viavm.info+ MAC match so boot-time NICs (CH auto_netN) and cocoon-added NICs both work.clone.hotSwapNetsuses the same id convention via the sharedaddCocoonNIChelper.netResizeAddrolls back CH + plumbing on persistence failure;netResizeRemovealways truncates the cocoon record once CH accepts the eject (leaked host plumbing is reclaimed by gc /cocoon vm rm) so the next resize can converge.Test plan
make lint0 issues on linux/amd64 + darwin/amd64go test ./...all greenvm net --nics 1returnsErrUnsupportedBackend