2 changes: 1 addition & 1 deletion architecture/compute-runtimes.md
@@ -23,7 +23,7 @@ Each runtime receives a sandbox spec from the gateway and is responsible for:
| Docker | Local development with Docker available. | Container plus nested sandbox namespace. | Uses host networking so loopback gateway endpoints work from the supervisor. |
| Podman | Rootless or single-machine deployments. | Container plus nested sandbox namespace. | Uses the Podman REST API, OCI image volumes, and CDI GPU devices when available. |
| Kubernetes | Cluster deployment through Helm. | Pod plus nested sandbox namespace. | Uses Kubernetes API objects, service accounts, secrets, PVC-backed workspace storage, and GPU resources. |
| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Gateway spawns `openshell-driver-vm` as a subprocess over a private, state-local Unix socket. |
| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Gateway spawns `openshell-driver-vm` as a subprocess over a private, state-local Unix socket. The VM driver caches a prepared `rootfs.ext4` per source image and copies it per sandbox, so guest ownership metadata lives inside the ext4 filesystem instead of host directory entries. |

VM runtime state paths are derived only from driver-validated sandbox IDs
matching `[A-Za-z0-9._-]{1,128}`. The gateway-owned VM driver socket uses a
17 changes: 13 additions & 4 deletions crates/openshell-driver-vm/README.md
@@ -2,7 +2,7 @@

> Status: Experimental. The VM compute driver is under active development and the interface still has VM-specific plumbing that will be generalized.

Standalone libkrun-backed [`ComputeDriver`](../../proto/compute_driver.proto) for OpenShell. The gateway spawns this binary as a subprocess, talks to it over a Unix domain socket with the `openshell.compute.v1.ComputeDriver` gRPC surface, and lets it manage per-sandbox microVMs. The runtime (libkrun + libkrunfw + gvproxy) and the sandbox supervisor are embedded directly in the binary; each sandbox guest rootfs is derived from a configured container image at create time.
Standalone libkrun-backed [`ComputeDriver`](../../proto/compute_driver.proto) for OpenShell. The gateway spawns this binary as a subprocess, talks to it over a Unix domain socket with the `openshell.compute.v1.ComputeDriver` gRPC surface, and lets it manage per-sandbox microVMs. The runtime (libkrun + libkrunfw + gvproxy) and the sandbox supervisor are embedded directly in the binary; each sandbox boots from a copied ext4 root disk derived from the configured container image.

## How it fits together

@@ -42,7 +42,7 @@ By default `mise run gateway:vm`:
- Listens on plaintext HTTP at `127.0.0.1:18081`.
- Registers the CLI gateway `vm-dev` by writing `~/.config/openshell/gateways/vm-dev/metadata.json`. It does not modify the workspace `.env`.
- Persists the gateway SQLite DB under `.cache/gateway-vm/gateway.db`.
- Places the VM driver state (per-sandbox rootfs plus `run/compute-driver.sock`) under `/tmp/openshell-vm-driver-$USER-vm-dev/` so the AF_UNIX socket path stays under macOS `SUN_LEN`.
- Places the VM driver state (per-sandbox `rootfs.ext4`, image cache, and `run/compute-driver.sock`) under `/tmp/openshell-vm-driver-$USER-vm-dev/` so the AF_UNIX socket path stays under macOS `SUN_LEN`.
- Passes `--driver-dir $PWD/target/debug` so the freshly built `openshell-driver-vm` is used instead of an older installed copy from `~/.local/libexec/openshell`, `/usr/libexec/openshell`, or `/usr/local/libexec`.
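The fallback locations in that last bullet can be sketched as a lookup helper. This is illustrative only — `find_driver` is an invented name, not part of the gateway; only the directory paths come from the bullet above:

```shell
# Hypothetical helper mirroring the documented search locations for the
# driver binary. The real resolution logic lives in the gateway itself.
find_driver() {
  for dir in "${OPENSHELL_DRIVER_DIR:-}" \
             "$HOME/.local/libexec/openshell" \
             /usr/libexec/openshell \
             /usr/local/libexec; do
    if [ -n "$dir" ] && [ -x "$dir/openshell-driver-vm" ]; then
      echo "$dir/openshell-driver-vm"
      return 0
    fi
  done
  return 1
}

find_driver || echo "openshell-driver-vm not found on this host"
```

Setting `OPENSHELL_DRIVER_DIR` (or passing `--driver-dir`) wins over the installed copies, which is what `mise run gateway:vm` relies on.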

For GPU passthrough (VFIO), pass `-- --gpu` and run with root privileges:
@@ -124,7 +124,7 @@ The gateway resolves `openshell-driver-vm` in this order: `--driver-dir`, conven
|---|---|---|---|
| `--drivers vm` | `OPENSHELL_DRIVERS` | `kubernetes` | Select the VM compute driver. |
| `--grpc-endpoint URL` | `OPENSHELL_GRPC_ENDPOINT` | — | Required. URL the sandbox guest dials to reach the gateway. Use `http://host.containers.internal:<port>` (or `host.docker.internal` / `host.openshell.internal`) so traffic flows through gvproxy's host-loopback NAT (HostIP `192.168.127.254` → host `127.0.0.1`). Loopback URLs like `http://127.0.0.1:<port>` are rewritten automatically by the driver. The bare gateway IP (`192.168.127.1`) only carries gvproxy's own services and will not reach host-bound ports. |
| `--vm-driver-state-dir DIR` | `OPENSHELL_VM_DRIVER_STATE_DIR` | `target/openshell-vm-driver` | Per-sandbox rootfs, console logs, image cache, and private `run/compute-driver.sock` UDS. |
| `--vm-driver-state-dir DIR` | `OPENSHELL_VM_DRIVER_STATE_DIR` | `target/openshell-vm-driver` | Per-sandbox root disk images, console logs, image cache, and private `run/compute-driver.sock` UDS. |
| `--driver-dir DIR` | `OPENSHELL_DRIVER_DIR` | unset | Override the directory searched for `openshell-driver-vm`. |
| `--vm-driver-vcpus N` | `OPENSHELL_VM_DRIVER_VCPUS` | `2` | vCPUs per sandbox. |
| `--vm-driver-mem-mib N` | `OPENSHELL_VM_DRIVER_MEM_MIB` | `2048` | Memory per sandbox, in MiB. |
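The loopback rewrite described in the `--grpc-endpoint` row can be illustrated with a one-liner. The `sed` expression here is ours, not the driver's actual Rust logic; the host-loopback NAT address `192.168.127.254` comes from the table above:

```shell
# Illustrative only: how a loopback gateway URL maps to gvproxy's
# host-loopback address so guest traffic reaches host-bound ports.
GW_URL="http://127.0.0.1:18081"
GUEST_URL=$(printf '%s' "$GW_URL" | sed -E 's#//(127\.0\.0\.1|localhost)#//192.168.127.254#')
echo "$GUEST_URL"   # http://192.168.127.254:18081
```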
@@ -145,7 +145,15 @@ The gateway is auto-registered by `mise run gateway:vm`. In another terminal:
./scripts/bin/openshell sandbox connect demo
```

First sandbox takes 10–30 seconds to boot (image fetch/prepare/cache + libkrun + guest init). If `--from` is omitted, the VM driver uses the gateway's configured default sandbox image. Without either `--from` or `--sandbox-image`, VM sandbox creation fails. Subsequent creates reuse the prepared sandbox rootfs.
The first sandbox takes 10–30 seconds to boot (image fetch/prepare/cache + libkrun + guest init). If `--from` is omitted, the VM driver uses the gateway's configured default sandbox image. Without either `--from` or `--sandbox-image`, VM sandbox creation fails. Subsequent creates reuse the prepared image cache and copy its sparse root disk into the sandbox state directory before boot.
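That create-time copy can be sketched in shell. Everything here is a stand-in — throwaway state dir, made-up cache and sandbox IDs, and a placeholder sparse file instead of a real prepared image; the driver does this internally. `cp --sparse=always` is GNU coreutils:

```shell
# Illustrative sketch of the per-sandbox root disk copy.
STATE_DIR=$(mktemp -d)                      # stands in for --vm-driver-state-dir
CACHED="$STATE_DIR/images/demo-image/rootfs.ext4"
DISK="$STATE_DIR/sandboxes/demo/rootfs.ext4"

# Stand-in for the prepared cache: a 64 MiB sparse file.
mkdir -p "$(dirname "$CACHED")"
truncate -s 64M "$CACHED"

# Per-sandbox copy; --sparse=always preserves holes so the copy costs
# little real disk space even for large images.
mkdir -p "$(dirname "$DISK")"
cp --sparse=always "$CACHED" "$DISK"
```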

During rootfs preparation the VM driver exports or pulls the selected OCI image,
applies the OpenShell guest mutations, formats a sparse `rootfs.ext4`, and
caches it under `<state-dir>/images/<cache-id>/rootfs.ext4`. Each sandbox gets
its own copied `rootfs.ext4` under `<state-dir>/sandboxes/<id>/`. The host owns
only the image file; guest ownership such as `/sandbox` UID/GID metadata lives
inside the ext4 filesystem and is corrected by guest init before the supervisor
starts.
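The format-and-populate step can be approximated with plain e2fsprogs, assuming it is installed. The driver's actual implementation is Rust; the directory contents below are placeholders:

```shell
# Minimal sketch of formatting a sparse ext4 image from a staged rootfs.
ROOT_DIR=$(mktemp -d)                 # stands in for the exported, mutated image rootfs
mkdir -p "$ROOT_DIR/sandbox"
IMG="$(mktemp -d)/rootfs.ext4"
truncate -s 64M "$IMG"                # sparse backing file
if command -v mke2fs >/dev/null 2>&1; then
  # -d copies ROOT_DIR into the new filesystem, so ownership and modes
  # are recorded inside ext4 rather than on the host.
  mke2fs -q -t ext4 -d "$ROOT_DIR" "$IMG" && RESULT=formatted
else
  RESULT=skipped                      # e2fsprogs not installed
fi
```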

## Logs and debugging

@@ -162,6 +170,7 @@ The VM guest's serial console is appended to `<state-dir>/<sandbox-id>/console.l

- macOS on Apple Silicon, or Linux on aarch64/x86_64 with KVM
- Rust toolchain
- e2fsprogs (`mke2fs` or `mkfs.ext4`, plus `debugfs`) for root disk image creation and per-sandbox file injection
- Guest-supervisor cross-compile toolchain (needed on macOS, and on Linux when host arch ≠ guest arch):
- Matching rustup target: `rustup target add aarch64-unknown-linux-gnu` (or `x86_64-unknown-linux-gnu` for an amd64 guest)
- `cargo install --locked cargo-zigbuild` and `brew install zig` (or distro equivalent). `vm:supervisor` uses `cargo zigbuild` to cross-compile the in-VM `openshell-sandbox` supervisor binary.
7 changes: 7 additions & 0 deletions crates/openshell-driver-vm/runtime/kernel/openshell.kconfig
@@ -8,6 +8,13 @@
#
# See also: check-vm-capabilities.sh for runtime verification.

# ── Root disk transport and filesystem ─────────────────────────────────
CONFIG_BLOCK=y
CONFIG_BLK_DEV=y
CONFIG_VIRTIO_BLK=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y

# ── Network Namespaces (required for pod isolation) ─────────────────────
CONFIG_NET_NS=y
CONFIG_NAMESPACES=y
@@ -239,7 +239,8 @@ setup_gpu() {
return 1
fi

# Stage GSP firmware from virtiofs to tmpfs to avoid slow FUSE reads
# Stage GSP firmware to tmpfs so module loading reads it from a stable
# early-boot path.
if [ -d /lib/firmware/nvidia ]; then
ts "staging GPU firmware to tmpfs"
mkdir -p /run/firmware/nvidia
@@ -273,6 +274,15 @@
fi
}

setup_sandbox_workdir() {
mkdir -p /sandbox
# Prefer the named sandbox user; fall back to the numeric IDs when the
# name is not resolvable inside the guest.
if ! chown -R sandbox:sandbox /sandbox 2>/dev/null; then
chown -R 10001:10001 /sandbox
fi
chmod 0755 /sandbox
ts "prepared /sandbox ownership"
}

mount -t proc proc /proc 2>/dev/null &
mount -t sysfs sysfs /sys 2>/dev/null &
mount -t tmpfs tmpfs /tmp 2>/dev/null &
@@ -286,6 +296,8 @@ mount -t tmpfs tmpfs /dev/shm 2>/dev/null &
mount -t cgroup2 cgroup2 /sys/fs/cgroup 2>/dev/null &
wait

setup_sandbox_workdir

hostname openshell-sandbox-vm 2>/dev/null || true
ip link set lo up 2>/dev/null || true
