diff --git a/.zenflow/tasks/new-task-7cad/plan.md b/.zenflow/tasks/new-task-7cad/plan.md
new file mode 100644
index 0000000..47f0e3e
--- /dev/null
+++ b/.zenflow/tasks/new-task-7cad/plan.md
@@ -0,0 +1,95 @@
+# Spec and build
+
+## Configuration
+- **Artifacts Path**: {@artifacts_path} → `.zenflow/tasks/{task_id}`
+
+---
+
+## Agent Instructions
+
+Ask the user questions when anything is unclear or needs their input. This includes:
+- Ambiguous or incomplete requirements
+- Technical decisions that affect architecture or user experience
+- Trade-offs that require business context
+
+Do not make assumptions on important decisions — get clarification first.
+
+---
+
+## Workflow Steps
+
+### [x] Step: Technical Specification
+<!-- chat-id: 1b6043e3-cc74-4de4-bcfa-6d46671c8fcc -->
+
+Assessed as **hard** — multiple interacting subsystems, new dependencies, cross-cutting header concerns.
+Full spec saved to `.zenflow/tasks/new-task-7cad/spec.md`.
+
+---
+
+### [x] Step: Fix cleanResponse and create browser inference engine
+<!-- chat-id: 21de9e58-71d9-4162-803c-27c158597d60 -->
+
+Create `lib/browser-engine.ts` (Transformers.js pipeline for in-browser WebGPU inference) and update `lib/clean-response.ts` to strip `<think>` blocks from model output.
+
+- [x] Install `@huggingface/transformers`
+- [x] Create `lib/browser-engine.ts` with singleton pipeline, lazy loading, progress callbacks, and status events
+- [x] Update `lib/clean-response.ts` to strip `<think>...</think>` blocks before other cleaning
+- [x] Exclude `relay/` from tsconfig (pre-existing build error)
+- [x] Verify: `npx tsc --noEmit && npm run lint && npm run build`
+
+---
+
+### [x] Step: Build WebContainer sandbox
+<!-- chat-id: cc61929e-9c30-4982-988d-8b378c61693a -->
+
+Create the in-browser sandbox using WebContainers to replace the Docker relay for demo use.
+
+- [x] Install `@webcontainer/api`
+- [x] Create `lib/webcontainer-sandbox.ts` (boot, exec, teardown singleton)
+- [x] Create `hooks/use-webcontainer.ts` (React hook: boot-on-first-run, exec, history tracking)
+- [x] Update `types/sandbox.d.ts` to make `auditId` optional
+- [x] Verify: `npx tsc --noEmit && npm run lint && npm run build`
+
+---
+
+### [x] Step: Wire up UI — mode selector, browser translate, sandbox execution
+<!-- chat-id: 8572affb-4f18-4853-9070-1e593f4975cc -->
+
+Integrate browser inference and WebContainer sandbox into the main UI.
+
+- [x] Add inference mode selector (Cloud / Browser / Auto) to `components/shell-session.tsx`
+- [x] Update `hooks/use-translate.ts` to accept mode parameter and call browser engine directly in browser mode
+- [x] Replace `useSandbox()` with `useWebContainer()` in shell-session for WebContainer execution
+- [x] Update `components/execution-output.tsx` for optional auditId and command history display
+- [x] Verify: `npx tsc --noEmit && npm run lint && npm run build`
+
+---
+
+### [x] Step: Configure COOP/COEP headers and CSP updates
+<!-- chat-id: d33e4a06-2d42-4dbb-b5c8-3bc822e2fc52 -->
+
+Add required headers for WebContainers (SharedArrayBuffer) without breaking Vercel Analytics.
+
+- [x] Update `next.config.ts` with COOP/COEP headers (try `credentialless` first)
+- [x] Update CSP `connect-src` for WebContainer and Transformers.js origins
+- [x] Test that Vercel Analytics still loads; fall back to route-specific headers if broken
+- [x] Verify: `npx tsc --noEmit && npm run lint && npm run build`
+
+---
+
+### [ ] Step: Deploy to Vercel and configure nl2shell.com domain
+<!-- chat-id: 380234c2-b996-4057-b2bd-f5ea13cc78d3 -->
+
+Push to main, verify auto-deploy, configure domain and env vars.
+
+- Merge feature branch to main
+- Set Vercel env vars: `HF_TOKEN`, `NEXT_PUBLIC_SANDBOX_ENABLED`
+- Configure nl2shell.com CNAME → cname.vercel-dns.com in Cloudflare (DNS-only)
+- Verify: `curl -I https://nl2shell.com` returns 200 with the COOP/COEP and CSP headers present
+- Manual smoke test: Cloud mode, Browser mode, Sandbox execution
+
+---
+
+### [ ] Step: Write implementation report
+
+- Write `{@artifacts_path}/report.md` describing what was implemented, how it was tested, and challenges encountered
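The `<think>` stripping item in the second workflow step of plan.md can be illustrated with a minimal sketch. It assumes the model emits at most one leading `<think>...</think>` block. The helper name `stripThinkBlock` is hypothetical and is not taken from `lib/clean-response.ts`, whose actual contents are not shown in this diff.

```typescript
// Hypothetical helper; the real lib/clean-response.ts may be structured differently.
// Anchored, non-greedy pattern: removes one leading <think>...</think> block
// plus any whitespace that follows it.
const THINK_BLOCK = /^<think>[\s\S]*?<\/think>\s*/;

function stripThinkBlock(raw: string): string {
  // Trim first so leading whitespace cannot defeat the ^ anchor.
  return raw.trim().replace(THINK_BLOCK, "").trim();
}
```

An unclosed `<think>` tag is left untouched by this pattern, so a caller may prefer to treat such output as an error rather than display it as a command.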
diff --git a/.zenflow/tasks/new-task-7cad/spec.md b/.zenflow/tasks/new-task-7cad/spec.md
new file mode 100644
index 0000000..55fda81
--- /dev/null
+++ b/.zenflow/tasks/new-task-7cad/spec.md
@@ -0,0 +1,271 @@
+# Technical Specification: NL2Shell Web — Browser Inference, Sandbox, and Deployment
+
+**Difficulty:** Hard
+**Rationale:** Multiple interacting subsystems (browser inference pipeline, WebContainer sandbox, COOP/COEP headers affecting third-party scripts, Vercel deployment), significant new code, and cross-cutting concerns (CSP headers, Vercel Analytics compatibility).
+
+---
+
+## Technical Context
+
+- **Framework:** Next.js 16.1.6 (App Router), React 19.2, TypeScript (strict)
+- **Styling:** Tailwind CSS 4, shadcn/ui components
+- **Current inference:** Cloud-only via `@gradio/client` → HuggingFace Space `AryaYT/nl2shell-demo`
+- **Current sandbox:** Docker relay server (`relay/`) — not deployed, requires Railway
+- **Deployment target:** Vercel (project linked in `.vercel/project.json`)
+- **Domain:** nl2shell.com (Cloudflare DNS)
+- **No test framework configured** — no test files exist
+
+---
+
+## Current State Analysis
+
+### What exists
+| Component | Status | Location |
+|-----------|--------|----------|
+| Cloud translation API | Working | `app/api/translate/route.ts` |
+| `useTranslate` hook | Working | `hooks/use-translate.ts` |
+| Docker relay sandbox | Implemented, not deployed | `relay/`, `hooks/use-sandbox.ts`, `app/api/execute/route.ts` |
+| `cleanResponse()` | Working for cloud mode | `lib/clean-response.ts` |
+| Safety checks | Working (22 patterns) | `lib/safety.ts` |
+| MCP server | Working | `app/api/mcp/route.ts` |
+| Vercel Analytics | Configured in layout | `app/layout.tsx` |
+
+### What's missing
+| Component | Status | Notes |
+|-----------|--------|-------|
+| `lib/browser-engine.ts` | **Does not exist** | SPEC says "fix" but file is absent — must create from scratch |
+| `<think>` block stripping | Missing in `cleanResponse()` | Current regex handles markdown fences, not `<think>` tags |
+| WebContainer sandbox | Not started | Replaces Docker relay for demo use case |
+| COOP/COEP headers | Not configured | Required by WebContainers for SharedArrayBuffer |
+| Mode selector UI | Not implemented | No Cloud/Browser/Auto toggle in current UI |
+| Vercel deployment | Not done | Domain not configured |
+
+---
+
+## Implementation Approach
+
+### Task 1: Browser Inference Engine (Create `lib/browser-engine.ts`)
+
+**New dependency:** `@huggingface/transformers` (Transformers.js v3)
+
+**Architecture:**
+- Singleton pipeline pattern — load model once, reuse across calls
+- `"use client"` module (WebGPU is browser-only)
+- Chat-template messages format, as expected by Qwen instruction models (initially `Qwen2.5-0.5B-Instruct`; see the design decisions below)
+- System prompt matches the Gradio Space's prompt for consistency
+
+**Pipeline output shape** (Transformers.js `text-generation` with chat messages):
+```typescript
+// Returns: Array<{ generated_text: Array<{ role: string; content: string }> }>
+// The assistant's response is the last message in generated_text
+```
+
+**Key design decisions:**
+- Use `onnx-community/Qwen2.5-0.5B-Instruct` as initial model (smaller, faster for demo; SPEC's `Qwen3.5-0.8B-ONNX` can replace later when converted)
+- Model ID configurable via constant for easy swap
+- Progress callback for download/load status reporting to UI
+- Lazy loading — pipeline only created on first `generate()` call
+
+**Interface:**
+```typescript
+export interface BrowserEngineStatus {
+  stage: "idle" | "downloading" | "loading" | "ready" | "generating" | "error";
+  progress?: number; // 0-100 for download
+  error?: string;
+}
+
+export declare function generate(query: string): Promise<string>;
+export declare function getStatus(): BrowserEngineStatus;
+export declare function isReady(): boolean;
+export declare function onStatusChange(cb: (s: BrowserEngineStatus) => void): () => void;
+```
+
+### Task 1b: Fix `cleanResponse()` for `<think>` blocks
+
+Some Qwen models (notably the Qwen3 generation when running in thinking mode) wrap their reasoning in `<think>...</think>` tags before the actual answer. The current `cleanResponse()` only handles markdown fences, so `<think>` blocks pass through untouched.
+
+**Changes to `lib/clean-response.ts`:**
+- Add `<think>` block stripping as the FIRST operation (before markdown fence removal)
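+- Regex: `/^<think>[\s\S]*?<\/think>\s*/`. It is anchored at the start of the string and non-greedy, so it removes only the first `<think>...</think>` block (content may span newlines) plus trailing whitespace
+- Handle edge cases: empty think block, no think block, think block with nothing after it (result is an empty string), and leading whitespace before `<think>` (trim before matching, since the `^` anchor would otherwise fail)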
+
+### Task 2: WebContainer Sandbox
+
+**New dependency:** `@webcontainer/api`
+
+**Architecture:**
+- `lib/webcontainer-sandbox.ts` — singleton WebContainer boot, exec, teardown
+- `hooks/use-webcontainer.ts` — React state management, boot-on-first-run, command history
+- Replaces Docker relay for the web demo (relay code stays for self-hosted/MCP use)
+
+**Key design decisions:**
+- WebContainer boots lazily on first "Run" click, not on page load (saves resources)
+- Command history persists in hook state (not localStorage — session-scoped)
+- `ExecResult` interface mirrors existing `ExecutionResult` type but without `auditId` (no server-side audit for browser sandbox)
+- The `useSandbox` hook is replaced by `useWebContainer` in `shell-session.tsx` when `NEXT_PUBLIC_SANDBOX_ENABLED` is `"webcontainer"` or `"true"` (environment variables are always strings)
+
+**WebContainer ExecResult:**
+```typescript
+export interface ExecResult {
+  stdout: string;
+  stderr: string;
+  exitCode: number;
+  durationMs: number;
+}
+```
+
+### Task 3: COOP/COEP Headers
+
+**Problem:** WebContainers require `Cross-Origin-Embedder-Policy` and `Cross-Origin-Opener-Policy` headers for SharedArrayBuffer. These headers can break:
+- Vercel Analytics (`@vercel/analytics`) — loads external script
+- Gradio client connections to `huggingface.co`
+- Any third-party iframe/script
+
+**Strategy:**
+1. First try `credentialless` instead of `require-corp` for COEP (less restrictive, Chrome 96+)
+2. If WebContainers work with `credentialless`, use that globally
+3. If not, add strict COOP/COEP (`require-corp`) only to a `/sandbox` route segment and keep the main page without them
+4. Test Vercel Analytics compatibility in each configuration
+
+**Changes to `next.config.ts`:**
+- Add COOP/COEP headers (strategy TBD based on testing)
+- Update CSP `connect-src` to allow WebContainer origins if needed
+
+### Task 4: Wire Up UI
+
+**Changes to `components/shell-session.tsx`:**
+- Add mode selector (Cloud / Browser / Auto) — simple button group or dropdown
+- In Browser mode: use `generate()` from `lib/browser-engine.ts` instead of `POST /api/translate`
+- Auto mode: try browser first, fall back to cloud on error
+- Show model download progress bar when Browser mode first loads
+- Replace `useSandbox()` with `useWebContainer()` for execution
+
+**Changes to `components/execution-output.tsx`:**
+- Make `auditId` optional (WebContainer exec has no audit trail)
+- Support rendering command history (multiple exec results in sequence)
+
+**Changes to `hooks/use-translate.ts`:**
+- Accept a `mode` parameter (Option A under Interface Changes, chosen over a separate `useBrowserTranslate` hook)
+- Browser mode calls `generate()` directly (no fetch)
+
+### Task 5: Deployment
+
+- Merge to main, push, Vercel auto-deploys
+- Set env vars: `HF_TOKEN`, `NEXT_PUBLIC_SANDBOX_ENABLED=true`
+- Configure domain: nl2shell.com CNAME → cname.vercel-dns.com (Cloudflare, DNS-only)
+- Verify SSL, headers, all modes working
+
+### Task 6: Cloud Mode UX Improvements (if time permits)
+
+- Better loading states for 503 (Space sleeping)
+- Auto-retry on 503 with "Model is waking up..." message
+- Already partially handled by the 503 error responses in `app/api/translate/route.ts`
+
+---
+
+## Source Code Structure Changes
+
+### New files
+| File | Purpose |
+|------|---------|
+| `lib/browser-engine.ts` | Transformers.js pipeline, model loading, text generation |
+| `lib/webcontainer-sandbox.ts` | WebContainer boot, exec, teardown singleton |
+| `hooks/use-webcontainer.ts` | React hook for sandbox lifecycle + history |
+
+### Modified files
+| File | Changes |
+|------|---------|
+| `lib/clean-response.ts` | Add `<think>` block stripping |
+| `components/shell-session.tsx` | Mode selector, browser inference path, WebContainer sandbox |
+| `components/execution-output.tsx` | Optional `auditId`, history rendering |
+| `hooks/use-translate.ts` | Support browser mode via a `mode` parameter |
+| `next.config.ts` | COOP/COEP headers, CSP updates |
+| `package.json` | Add `@huggingface/transformers`, `@webcontainer/api` |
+| `types/sandbox.d.ts` | Make `auditId` optional on `ExecutionResult` |
+
+### Unchanged (no modifications needed)
+| File | Reason |
+|------|--------|
+| `app/api/translate/route.ts` | Cloud mode stays as-is |
+| `app/api/execute/route.ts` | Docker relay stays for future self-hosted use |
+| `relay/*` | Docker relay untouched |
+| `lib/safety.ts` | Already handles both modes' output |
+| `app/layout.tsx` | Vercel Analytics stays; COOP/COEP handled in next.config.ts |
+
+---
+
+## Interface Changes
+
+### `ExecutionResult` type update
+```typescript
+// types/sandbox.d.ts
+export interface ExecutionResult {
+  stdout: string;
+  stderr: string;
+  exitCode: number;
+  durationMs: number;
+  auditId?: string; // Optional — absent for WebContainer exec
+}
+```
+
+### New `InferenceMode` type
+```typescript
+type InferenceMode = "cloud" | "browser" | "auto";
+```
+
+### `useTranslate` hook extension
+```typescript
+// Option A: mode parameter
+export function useTranslate(mode?: InferenceMode)
+
+// Option B: separate hook for browser (cleaner separation)
+export function useBrowserTranslate()
+```
+
+Decision: **Option A** — single hook with mode parameter, keeps `shell-session.tsx` simpler.
+
+---
+
+## Verification Approach
+
+After each implementation step:
+```bash
+npx tsc --noEmit       # Zero type errors
+npm run lint           # Zero lint errors
+npm run build          # Clean production build
+```
+
+Manual testing (dev server):
+1. Cloud mode: query → shell command (existing flow, regression check)
+2. Browser mode: model downloads → query → shell command (no `<think>` in output)
+3. Sandbox: boot WebContainer → execute → see output → filesystem persists
+4. Mode selector: all three modes switch correctly
+5. Danger warning: `rm -rf /` shows red badge in all modes
+
+Production verification:
+```bash
+curl -I https://nl2shell.com  # 200 OK, correct headers
+```
+
+---
+
+## Risk Assessment
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| COOP/COEP breaks Vercel Analytics | Medium | Test `credentialless` first; fallback to route-specific headers |
+| WebContainer boot slow (>5s) | Low | Lazy boot on first "Run", show spinner |
+| Transformers.js model too large | Medium | Start with 0.5B model (~300MB ONNX); upgrade to fine-tuned later |
+| CSP blocks WebContainer origins | Medium | Add required origins to `connect-src` incrementally |
+| HF Space sleeping on first visit | Low | Already handled with 503 retry messages in translate API |
+
+---
+
+## Dependencies to Install
+
+```bash
+npm install @huggingface/transformers @webcontainer/api
+```
+
+- `@huggingface/transformers` — Transformers.js v3, ONNX Runtime Web, WebGPU inference
+- `@webcontainer/api` — StackBlitz WebContainers for in-browser command execution
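The COOP/COEP strategy in Task 3 of spec.md (try `credentialless` globally, fall back to strict headers on a `/sandbox` route) could look roughly like the sketch below. The names `CoepPolicy`, `isolationHeaders`, and `headerRules` are illustrative assumptions, and the `/sandbox/:path*` pattern assumes the fallback route is actually created. Only the header names and values come from the spec.

```typescript
// Names below (CoepPolicy, isolationHeaders, headerRules) are illustrative
// assumptions, not identifiers from the repository.
type CoepPolicy = "credentialless" | "require-corp";

interface HeaderEntry {
  key: string;
  value: string;
}

interface HeaderRule {
  source: string;
  headers: HeaderEntry[];
}

// The header pair that makes a page cross-origin isolated.
function isolationHeaders(coep: CoepPolicy): HeaderEntry[] {
  return [
    { key: "Cross-Origin-Opener-Policy", value: "same-origin" },
    { key: "Cross-Origin-Embedder-Policy", value: coep },
  ];
}

// "global" mirrors strategy steps 1-2; "sandbox-only" mirrors step 3.
function headerRules(scope: "global" | "sandbox-only"): HeaderRule[] {
  if (scope === "global") {
    return [{ source: "/(.*)", headers: isolationHeaders("credentialless") }];
  }
  return [{ source: "/sandbox/:path*", headers: isolationHeaders("require-corp") }];
}

// In next.config.ts this could be wired up roughly as:
//   const nextConfig = { async headers() { return headerRules("global"); } };
```

Keeping the policy choice in one function means switching between the two strategies after testing Vercel Analytics is a one-argument change.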
diff --git a/SPEC.md b/SPEC.md
deleted file mode 100644
index cc75f1f..0000000
--- a/SPEC.md
+++ /dev/null
@@ -1,494 +0,0 @@
-# SPEC: NL2Shell Web — Browser Inference Fix, Sandbox, and Deployment
-
-**Project:** nl2shell-web  
-**Branch:** `feat/webllm-browser-inference`  
-**Repository:** github.com/nl2shell/nl2shell-web  
-**Date:** 2026-04-04  
-**Goal:** Ship a working NL2Shell web app with browser-side inference, persistent sandbox execution, and production deployment at nl2shell.com
-
----
-
-## Context
-
-NL2Shell translates natural language to shell commands using a fine-tuned Qwen3.5-0.8B model. The web app has two inference modes:
-
-- **Cloud:** Calls a HuggingFace Gradio Space (`AryaYT/nl2shell-demo`) via `/api/translate`
-- **Browser:** Runs a Qwen3.5 ONNX model locally via WebGPU using `@huggingface/transformers`
-
-The Browser mode was recently added but has bugs. The sandbox (command execution) uses a Docker relay server that isn't deployed yet. The app needs to be deployed to nl2shell.com via Vercel.
-
----
-
-## Current Issues
-
-1. **Browser inference returns empty response** — The `lib/browser-engine.ts` Transformers.js pipeline output parsing is incorrect. The model generates text but `cleanResponse()` strips everything, leaving empty output. The `<think>` block stripping and the output format from `pipeline("text-generation")` need debugging.
-
-2. **No working sandbox** — The relay server requires Docker and a separate deployment (Railway). For the demo use case, we need a lightweight sandbox where users can execute generated commands and see results, with filesystem persistence between commands.
-
-3. **Not deployed** — The app runs locally but isn't deployed to nl2shell.com yet.
-
----
-
-## Task Breakdown
-
-### Task 1: Fix Browser Inference Output Parsing
-
-**Priority:** Critical  
-**Files:** `lib/browser-engine.ts`, `lib/clean-response.ts`  
-**Estimated effort:** Small
-
-**Problem:** The `generate()` function in `lib/browser-engine.ts` calls `pipelineInstance(messages, ...)` but the return format from Transformers.js `text-generation` pipeline varies. The current code tries:
-```typescript
-const raw = result[0]?.generated_text?.at(-1)?.content ?? result[0]?.generated_text ?? "";
-```
-This may not match the actual output shape. Additionally, `cleanResponse()` may be too aggressive — the `<think>` block regex could strip the entire output if the model puts the command inside or after a think block.
-
-**Steps:**
-1. Read `lib/browser-engine.ts` fully to understand the current `generate()` function
-2. Add `console.log(JSON.stringify(result, null, 2))` temporarily inside `generate()` to see the raw pipeline output shape
-3. Run the dev server (`npm run dev`) and test with Browser mode — observe the console output
-4. Fix the output extraction based on actual shape. The Transformers.js `text-generation` pipeline with chat messages typically returns:
-   ```javascript
-   [{ generated_text: [
-     { role: "system", content: "..." },
-     { role: "user", content: "..." },
-     { role: "assistant", content: "THE COMMAND HERE" }
-   ]}]
-   ```
-   So the extraction should be:
-   ```typescript
-   const messages = result[0]?.generated_text;
-   const lastMsg = Array.isArray(messages) ? messages.at(-1) : null;
-   const raw = typeof lastMsg === "object" && lastMsg?.content 
-     ? lastMsg.content 
-     : typeof messages === "string" 
-       ? messages 
-       : "";
-   ```
-5. Update `lib/clean-response.ts` — ensure the `<think>` regex handles edge cases:
-   - Empty think block: `<think>\n</think>` followed by the command
-   - Think block with content followed by command on next line
-   - No think block at all (just the command)
-6. Remove the `console.log` debug line
-7. Test with multiple queries: "list files", "create a branch called feature-auth", "find python files modified today"
-
-**Verification:**
-```bash
-npx tsc --noEmit && npm run lint && npm run build
-```
-Then manually test in Chrome with Browser mode — each query should return a clean shell command.
-
-**Acceptance criteria:**
-- Browser mode returns valid shell commands (not empty, not `<think>` blocks)
-- Cloud mode still works unchanged
-- All build checks pass
-
----
-
-### Task 2: Build In-Browser Sandbox with WebContainers
-
-**Priority:** High  
-**Files to create:**
-- `lib/webcontainer-sandbox.ts` — WebContainer boot, exec, file ops
-- `hooks/use-webcontainer.ts` — React hook for sandbox lifecycle
-- `components/sandbox-terminal.tsx` — Terminal-like output display
-
-**Files to modify:**
-- `components/shell-session.tsx` — Wire sandbox execution
-- `components/execution-output.tsx` — Update to show persistent session
-- `package.json` — Add `@webcontainer/api`
-
-**Context:** Instead of requiring a Docker relay server, use WebContainers (StackBlitz) to run commands entirely in the browser. This eliminates infrastructure costs and works offline. The sandbox persists state between commands — users can create files, then list them, then modify them.
-
-**Steps:**
-
-#### 2a. Install WebContainers
-```bash
-cd /Users/aryateja/Projects/nl2shell-org/nl2shell-web
-npm install @webcontainer/api
-```
-
-#### 2b. Create `lib/webcontainer-sandbox.ts`
-```typescript
-"use client";
-
-import { WebContainer } from "@webcontainer/api";
-
-let container: WebContainer | null = null;
-let bootPromise: Promise<WebContainer> | null = null;
-
-export async function bootSandbox(): Promise<void> {
-  if (container) return;
-  if (bootPromise) {
-    await bootPromise;
-    return;
-  }
-  bootPromise = WebContainer.boot();
-  try {
-    container = await bootPromise;
-    // Seed with a basic workspace
-    await container.mount({
-      workspace: {
-        directory: {},
-      },
-    });
-  } catch (err) {
-    bootPromise = null;
-    throw err;
-  }
-}
-
-export function isSandboxReady(): boolean {
-  return container !== null;
-}
-
-export interface ExecResult {
-  stdout: string;
-  stderr: string;
-  exitCode: number;
-  durationMs: number;
-}
-
-export async function execCommand(command: string): Promise<ExecResult> {
-  if (!container) throw new Error("Sandbox not booted");
-
-  const start = performance.now();
-  const process = await container.spawn("bash", ["-c", command], {
-    cwd: "/workspace",
-  });
-
-  let stdout = "";
-  let stderr = "";
-
-  process.output.pipeTo(
-    new WritableStream({
-      write(chunk) {
-        stdout += chunk;
-      },
-    })
-  );
-
-  // WebContainers merge stderr into output in some cases
-  const exitCode = await process.exit;
-  const durationMs = Math.round(performance.now() - start);
-
-  return { stdout: stdout.trim(), stderr: stderr.trim(), exitCode, durationMs };
-}
-
-export async function teardownSandbox(): Promise<void> {
-  if (container) {
-    container.teardown();
-    container = null;
-    bootPromise = null;
-  }
-}
-```
-
-#### 2c. Create `hooks/use-webcontainer.ts`
-```typescript
-"use client";
-
-import { useCallback, useRef, useState } from "react";
-
-interface SandboxState {
-  isReady: boolean;
-  isBooting: boolean;
-  isExecuting: boolean;
-  output: { stdout: string; stderr: string; exitCode: number } | null;
-  error: string | null;
-  history: Array<{
-    command: string;
-    stdout: string;
-    exitCode: number;
-    timestamp: number;
-  }>;
-}
-
-export function useWebContainer() {
-  const [state, setState] = useState<SandboxState>({
-    isReady: false,
-    isBooting: false,
-    isExecuting: false,
-    output: null,
-    error: null,
-    history: [],
-  });
-
-  const sandboxRef = useRef<typeof import("@/lib/webcontainer-sandbox") | null>(null);
-
-  const getSandbox = useCallback(async () => {
-    if (!sandboxRef.current) {
-      sandboxRef.current = await import("@/lib/webcontainer-sandbox");
-    }
-    return sandboxRef.current;
-  }, []);
-
-  const boot = useCallback(async () => {
-    setState((s) => ({ ...s, isBooting: true, error: null }));
-    try {
-      const sb = await getSandbox();
-      await sb.bootSandbox();
-      setState((s) => ({ ...s, isReady: true, isBooting: false }));
-    } catch (err) {
-      setState((s) => ({
-        ...s,
-        isBooting: false,
-        error: err instanceof Error ? err.message : "Failed to boot sandbox",
-      }));
-    }
-  }, [getSandbox]);
-
-  const execute = useCallback(
-    async (command: string) => {
-      setState((s) => ({ ...s, isExecuting: true, output: null, error: null }));
-      try {
-        const sb = await getSandbox();
-        if (!sb.isSandboxReady()) await sb.bootSandbox();
-        const result = await sb.execCommand(command);
-        setState((s) => ({
-          ...s,
-          isExecuting: false,
-          output: result,
-          history: [
-            ...s.history,
-            {
-              command,
-              stdout: result.stdout,
-              exitCode: result.exitCode,
-              timestamp: Date.now(),
-            },
-          ],
-        }));
-      } catch (err) {
-        setState((s) => ({
-          ...s,
-          isExecuting: false,
-          error: err instanceof Error ? err.message : "Execution failed",
-        }));
-      }
-    },
-    [getSandbox]
-  );
-
-  return { ...state, boot, execute };
-}
-```
-
-#### 2d. Update `components/shell-session.tsx`
-- Replace `useSandbox()` with `useWebContainer()` (or make it a fallback)
-- Auto-boot sandbox when user clicks "Run" for the first time
-- Show sandbox history (all commands + outputs in sequence)
-- The `NEXT_PUBLIC_SANDBOX_ENABLED` env var should default to `true` when using WebContainers (no relay needed)
-
-#### 2e. Update `next.config.ts`
-Add required headers for WebContainers (SharedArrayBuffer):
-```typescript
-{
-  key: "Cross-Origin-Embedder-Policy",
-  value: "require-corp",
-},
-{
-  key: "Cross-Origin-Opener-Policy", 
-  value: "same-origin",
-},
-```
-**IMPORTANT:** These headers may break Vercel Analytics and other third-party scripts. Test carefully. If they break, add them only to specific routes or make sandbox a separate page.
-
-#### 2f. Test the sandbox flow
-1. User types "create 5 python files named app.py, utils.py, config.py, test.py, main.py"
-2. Model generates: `touch app.py utils.py config.py test.py main.py`
-3. User clicks "Run" — sandbox boots, executes, shows empty output (success)
-4. User types "list all files sorted by size"
-5. Model generates: `ls -lS`
-6. User clicks "Run" — sandbox executes, shows the 5 files
-7. Files persist because WebContainer is still alive
-
-**Verification:**
-```bash
-npx tsc --noEmit && npm run lint && npm run build
-```
-
-**Acceptance criteria:**
-- Sandbox boots in <3 seconds
-- Commands execute and show stdout/stderr
-- Filesystem persists between commands
-- History shows all previous commands + outputs
-- All build checks pass
-
----
-
-### Task 3: Update COOP/COEP Headers Strategy
-
-**Priority:** Medium  
-**Files:** `next.config.ts`
-
-**Problem:** WebContainers require `Cross-Origin-Embedder-Policy: require-corp` and `Cross-Origin-Opener-Policy: same-origin`. These headers may break third-party scripts (Vercel Analytics, Supabase, Gradio client).
-
-**Steps:**
-1. Test if adding COOP/COEP headers to ALL routes breaks Vercel Analytics
-2. If it does, create a separate route `/sandbox` that has the headers, and keep the main page without them
-3. Alternatively, use `credentialless` instead of `require-corp` for COEP (less restrictive)
-4. If WebContainers work without COOP/COEP in Chrome (they sometimes do), skip the headers entirely
-
-**Verification:** Load the page, check that Vercel Analytics loads, and that WebContainers boot.
-
----
-
-### Task 4: Deploy to Vercel (nl2shell.com)
-
-**Priority:** High  
-**Files:** None (git + Vercel CLI operations)
-
-**Steps:**
-
-#### 4a. Merge feature branch to main
-```bash
-cd /Users/aryateja/Projects/nl2shell-org/nl2shell-web
-git add -A
-git commit -m "feat: add browser inference (Transformers.js) + WebContainer sandbox"
-git push origin feat/webllm-browser-inference
-# Create PR via GitHub CLI
-gh pr create --title "feat: browser inference + sandbox" --body "..."
-gh pr merge --squash
-```
-
-#### 4b. Verify Vercel auto-deployment
-The `.vercel/project.json` links to Vercel project `prj_0mLK6SAEeGdDgSk1zPMhCXguX3RP`. Pushing to main should trigger auto-deploy.
-
-Check deployment status:
-```bash
-npx vercel ls
-```
-
-#### 4c. Set environment variables in Vercel
-Go to Vercel dashboard or use CLI:
-```bash
-npx vercel env add HF_TOKEN production
-npx vercel env add NEXT_PUBLIC_SANDBOX_ENABLED production  # Set to "true"
-```
-
-#### 4d. Configure domain (nl2shell.com)
-In Vercel dashboard:
-1. Go to project settings > Domains
-2. Add `nl2shell.com` and `www.nl2shell.com`
-3. Vercel will show required DNS records
-
-In Cloudflare dashboard for nl2shell.com:
-1. Add CNAME record: `@` -> `cname.vercel-dns.com` (DNS only, NOT proxied)
-2. Add CNAME record: `www` -> `cname.vercel-dns.com` (DNS only, NOT proxied)
-3. Wait for DNS propagation (usually <5 minutes)
-
-#### 4e. Verify production
-```bash
-curl -I https://nl2shell.com
-# Should return 200 OK with correct headers
-```
-
-Open https://nl2shell.com in browser:
-1. Cloud mode works (generates commands)
-2. Browser mode loads model and generates commands
-3. Sandbox executes commands (if WebContainers work with COOP/COEP)
-
-**Acceptance criteria:**
-- https://nl2shell.com loads and shows the NL2Shell interface
-- Cloud mode generates shell commands
-- Browser mode loads the ONNX model and generates commands
-- SSL certificate is valid
-- All security headers present
-
----
-
-### Task 5: Fix Cloud Mode Performance
-
-**Priority:** Medium  
-**Files:** `app/api/translate/route.ts`
-
-**Problem:** The HuggingFace Gradio Space (`AryaYT/nl2shell-demo`) runs on free CPU tier and takes 30-40 seconds per request. It also sleeps after inactivity.
-
-**Steps:**
-1. Check if the Gradio space is awake: `curl https://huggingface.co/spaces/AryaYT/nl2shell-demo`
-2. If it's sleeping, consider upgrading to a paid GPU tier or using a different backend
-3. For now, improve the UX by showing better loading states:
-   - Show "Model is waking up..." when 503 is returned
-   - Show estimated wait time
-   - Auto-retry after 5 seconds on 503
-4. Consider adding a FastAPI backend as an alternative to Gradio (faster cold starts, deployable on Vercel)
-
-**Verification:** Cloud mode should respond in <15 seconds for warm requests.
-
----
-
-### Task 6: Convert Fine-Tuned Model to ONNX (Follow-up)
-
-**Priority:** Low (post-launch)  
-**Files:** New Python project or script
-
-**Context:** The current browser mode uses the BASE Qwen3.5-0.8B-ONNX model (from onnx-community), not the fine-tuned NL2Shell model. The fine-tuned model (`AryaYT/nl2shell-0.8b`) produces much better results for shell commands.
-
-**Blocker:** ONNX export requires `transformers >= 5.x` (git main) + `optimum` from git main. These have dependency conflicts. The Transformers.js converter script (`scripts/convert.py` in the transformers.js repo) may handle this better.
-
-**Steps:**
-1. Clone the Transformers.js repo: `git clone https://github.com/huggingface/transformers.js`
-2. Use their conversion script:
-   ```bash
-   python scripts/convert.py --model_id AryaYT/nl2shell-0.8b --quantize --task text-generation
-   ```
-3. If conversion succeeds, upload to HuggingFace as `AryaYT/nl2shell-0.8b-ONNX`
-4. Update `lib/browser-engine.ts` to point to the fine-tuned ONNX model
-5. Test quality: the fine-tuned model should output clean commands without `<think>` blocks
-
-**Alternative:** If ONNX conversion fails for Qwen3.5 architecture, wait for MLC-LLM to add official Qwen3.5 support (tracked at mlc-ai/web-llm#778).
-
----
-
-## Architecture Diagram
-
-```
-                    nl2shell.com (Vercel)
-                           |
-            +--------------+--------------+
-            |              |              |
-     [Cloud Mode]   [Browser Mode]  [Sandbox]
-            |              |              |
-   /api/translate    Transformers.js  WebContainers
-            |         (WebGPU)       (in-browser)
-            |              |              |
-    HuggingFace      ONNX Model     bash, node
-    Gradio Space     (IndexedDB      filesystem
-    (Qwen3.5)        cached)         persists
-```
-
-## File Change Summary
-
-| File | Action | Task |
-|------|--------|------|
-| `lib/browser-engine.ts` | Modify | Task 1 |
-| `lib/clean-response.ts` | Modify | Task 1 |
-| `lib/webcontainer-sandbox.ts` | Create | Task 2 |
-| `hooks/use-webcontainer.ts` | Create | Task 2 |
-| `components/shell-session.tsx` | Modify | Task 2 |
-| `components/execution-output.tsx` | Modify | Task 2 |
-| `next.config.ts` | Modify | Task 3 |
-| `package.json` | Modify | Task 2 |
-
-## Build Verification (Run After Every Task)
-
-```bash
-cd /Users/aryateja/Projects/nl2shell-org/nl2shell-web
-npx tsc --noEmit       # Zero type errors
-npm run lint           # Zero lint errors
-npm run build          # Clean production build
-```
-
-## Testing Checklist
-
-- [ ] Cloud mode: type query, get shell command, <15s response
-- [ ] Browser mode: load model (~400MB), type query, get shell command
-- [ ] Browser mode: no `<think>` blocks in output
-- [ ] Sandbox: boot WebContainer, execute command, see output
-- [ ] Sandbox: create files, then list them (persistence between commands)
-- [ ] Sandbox: history shows all previous commands
-- [ ] Mode selector: Cloud/Browser/Auto all work correctly
-- [ ] Danger warning: `rm -rf /` shows red warning badge
-- [ ] Mobile: Cloud mode works, Browser/Sandbox disabled gracefully
-- [ ] Production: nl2shell.com loads, SSL valid, all modes work
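The checklist item about `<think>` blocks can be checked in isolation with a small helper. The following is a minimal sketch, not the actual contents of `lib/clean-response.ts`, which this diff does not show. The name `stripThinkBlocks` and the handling of unterminated blocks are assumptions made for illustration.

```typescript
// Hypothetical helper: the real cleanResponse in lib/clean-response.ts may differ.
function stripThinkBlocks(raw: string): string {
  return raw
    // Remove complete <think>...</think> blocks, including multi-line ones.
    .replace(/<think>[\s\S]*?<\/think>/gi, "")
    // Assumption: truncated output may leave an unterminated <think> block,
    // so drop everything from a dangling opening tag to the end.
    .replace(/<think>[\s\S]*$/i, "")
    .trim();
}
```

Stripping reasoning blocks before any other cleaning keeps later steps (for example, code-fence removal) from operating on text that was never meant to be part of the command.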
diff --git a/components/execution-output.tsx b/components/execution-output.tsx
index d245776..80fcfe5 100644
--- a/components/execution-output.tsx
+++ b/components/execution-output.tsx
@@ -2,12 +2,24 @@
 
 import type { ExecutionResult } from "@/types/sandbox";
 
+interface HistoryEntry {
+  command: string;
+  stdout: string;
+  exitCode: number;
+  timestamp: number;
+}
+
 interface ExecutionOutputProps {
   result: ExecutionResult;
   command: string;
+  history?: HistoryEntry[];
 }
 
-export function ExecutionOutput({ result, command }: ExecutionOutputProps) {
+export function ExecutionOutput({
+  result,
+  command,
+  history,
+}: ExecutionOutputProps) {
   const hasStdout = result.stdout.trim().length > 0;
   const hasStderr = result.stderr.trim().length > 0;
   const isSuccess = result.exitCode === 0;
@@ -39,8 +51,26 @@ export function ExecutionOutput({ result, command }: ExecutionOutputProps) {
         </div>
 
         {/* Output */}
-        <div className="bg-[#0d1117] p-4 text-sm font-mono leading-relaxed overflow-x-auto">
-          {/* Command echo */}
+        <div className="bg-[#0d1117] p-4 text-sm font-mono leading-relaxed overflow-x-auto max-h-[400px] overflow-y-auto">
+          {/* Previous commands (history) */}
+          {history && history.length > 0 && (
+            <div className="opacity-50 mb-3 pb-3 border-b border-[var(--terminal-border)]/30">
+              {history.map((entry) => (
+                <div key={entry.timestamp} className="mb-2 last:mb-0">
+                  <div className="text-muted-foreground/40">
+                    <span className="text-[#2ea44f]/50">$</span> {entry.command}
+                  </div>
+                  {entry.stdout.trim() && (
+                    <pre className="text-[#e6edf3]/50 whitespace-pre-wrap break-all">
+                      {entry.stdout}
+                    </pre>
+                  )}
+                </div>
+              ))}
+            </div>
+          )}
+
+          {/* Current command echo */}
           <div className="text-muted-foreground/40 mb-2">
             <span className="text-[#2ea44f]">$</span> {command}
           </div>
@@ -69,9 +99,11 @@ export function ExecutionOutput({ result, command }: ExecutionOutputProps) {
       </div>
 
       {/* Audit trail reference */}
-      <p className="text-[10px] text-muted-foreground/30 font-mono mt-1.5 text-right">
-        audit: {result.auditId}
-      </p>
+      {result.auditId && (
+        <p className="text-[10px] text-muted-foreground/30 font-mono mt-1.5 text-right">
+          audit: {result.auditId}
+        </p>
+      )}
     </div>
   );
 }
diff --git a/components/inference-mode-selector.tsx b/components/inference-mode-selector.tsx
deleted file mode 100644
index 3c6cb76..0000000
--- a/components/inference-mode-selector.tsx
+++ /dev/null
@@ -1,72 +0,0 @@
-"use client";
-
-import { Cloud, Cpu, Zap } from "lucide-react";
-import { cn } from "@/lib/utils";
-import type { InferenceMode } from "@/hooks/use-inference";
-import type { ModelStatus } from "@/hooks/use-local-inference";
-
-interface InferenceModeSelectorProps {
-  mode: InferenceMode;
-  onModeChange: (mode: InferenceMode) => void;
-  isWebGPUAvailable: boolean;
-  modelStatus: ModelStatus;
-}
-
-const modes: {
-  value: InferenceMode;
-  label: string;
-  icon: typeof Cloud;
-}[] = [
-  { value: "cloud", label: "Cloud", icon: Cloud },
-  { value: "browser", label: "Browser", icon: Cpu },
-  { value: "auto", label: "Auto", icon: Zap },
-];
-
-export function InferenceModeSelector({
-  mode,
-  onModeChange,
-  isWebGPUAvailable,
-  modelStatus,
-}: InferenceModeSelectorProps) {
-  return (
-    <div className="inline-flex items-center rounded-lg border border-border/40 bg-background/50 p-0.5">
-      {modes.map(({ value, label, icon: Icon }) => {
-        const isActive = mode === value;
-        const isDisabled =
-          (value === "browser" || value === "auto") && !isWebGPUAvailable;
-
-        return (
-          <button
-            key={value}
-            onClick={() => !isDisabled && onModeChange(value)}
-            disabled={isDisabled}
-            title={
-              isDisabled
-                ? "WebGPU not available in this browser"
-                : value === "browser" && modelStatus === "ready"
-                  ? "Model loaded — inference runs locally"
-                  : value === "browser"
-                    ? "Run model in your browser via WebGPU"
-                    : value === "auto"
-                      ? "Use browser if model loaded, otherwise cloud"
-                      : "Use cloud API (HuggingFace)"
-            }
-            className={cn(
-              "flex items-center gap-1 rounded-md px-2 py-1 text-[11px] font-medium transition-all",
-              isActive
-                ? "bg-foreground/10 text-foreground shadow-sm"
-                : "text-muted-foreground/60 hover:text-muted-foreground",
-              isDisabled && "opacity-30 cursor-not-allowed",
-            )}
-          >
-            <Icon className="size-3" />
-            <span>{label}</span>
-            {value === "browser" && modelStatus === "ready" && (
-              <span className="size-1.5 rounded-full bg-[#28c840]" />
-            )}
-          </button>
-        );
-      })}
-    </div>
-  );
-}
diff --git a/components/model-loader.tsx b/components/model-loader.tsx
deleted file mode 100644
index ea25f94..0000000
--- a/components/model-loader.tsx
+++ /dev/null
@@ -1,47 +0,0 @@
-"use client";
-
-import { cn } from "@/lib/utils";
-import { motion } from "motion/react";
-
-interface ModelLoaderProps {
-  progress: number;
-  progressText: string;
-  className?: string;
-}
-
-export function ModelLoader({
-  progress,
-  progressText,
-  className,
-}: ModelLoaderProps) {
-  return (
-    <div
-      className={cn("space-y-2 py-4", className)}
-      role="status"
-      aria-live="polite"
-      aria-label="Loading model"
-    >
-      {/* Progress bar */}
-      <div className="h-1.5 w-full rounded-full bg-border/30 overflow-hidden">
-        <motion.div
-          className="h-full rounded-full bg-gradient-to-r from-[#2ea44f] to-[#28c840]"
-          initial={{ width: 0 }}
-          animate={{ width: `${progress}%` }}
-          transition={{ duration: 0.3, ease: "easeOut" }}
-        />
-      </div>
-
-      {/* Status text */}
-      <div className="flex items-center justify-between text-[11px] font-mono text-muted-foreground/60">
-        <span className="truncate max-w-[70%]">{progressText}</span>
-        <span>{progress}%</span>
-      </div>
-
-      {progress < 10 && (
-        <p className="text-[10px] text-muted-foreground/40">
-          ~450MB download, cached for future visits
-        </p>
-      )}
-    </div>
-  );
-}
diff --git a/components/shell-session.tsx b/components/shell-session.tsx
index 5337c37..7095990 100644
--- a/components/shell-session.tsx
+++ b/components/shell-session.tsx
@@ -1,8 +1,7 @@
 "use client";
 
 import { useCallback, useState } from "react";
-import dynamic from "next/dynamic";
-import { Loader2, Terminal } from "lucide-react";
+import { Cloud, Loader2, Monitor, Terminal, Zap } from "lucide-react";
 import { Card, CardContent } from "@/components/ui/card";
 import { Button } from "@/components/ui/button";
 import { Textarea } from "@/components/ui/textarea";
@@ -11,17 +10,16 @@ import { CommandOutput } from "@/components/command-output";
 import { ExecutionOutput } from "@/components/execution-output";
 import { ExamplePrompts } from "@/components/example-prompts";
 import { AILoader } from "@/components/ai-loader";
-import { ModelLoader } from "@/components/model-loader";
-import { useInference } from "@/hooks/use-inference";
-import { useSandbox } from "@/hooks/use-sandbox";
+import { useTranslate, type InferenceMode } from "@/hooks/use-translate";
+import { useWebContainer } from "@/hooks/use-webcontainer";
 
-// Dynamic import with ssr: false to avoid hydration mismatch from WebGPU detection
-const InferenceModeSelector = dynamic(
-  () => import("@/components/inference-mode-selector").then((m) => m.InferenceModeSelector),
-  { ssr: false },
-);
+const SANDBOX_ENABLED = process.env.NEXT_PUBLIC_SANDBOX_ENABLED !== "false";
 
-const SANDBOX_ENABLED = process.env.NEXT_PUBLIC_SANDBOX_ENABLED === "true";
+const MODES: { value: InferenceMode; label: string; icon: typeof Cloud }[] = [
+  { value: "cloud", label: "Cloud", icon: Cloud },
+  { value: "browser", label: "Browser", icon: Monitor },
+  { value: "auto", label: "Auto", icon: Zap },
+];
 
 interface HistoryEntry {
   query: string;
@@ -34,21 +32,10 @@ export function ShellSession() {
   const [input, setInput] = useState("");
   const [lastQuery, setLastQuery] = useState("");
   const [history, setHistory] = useState<HistoryEntry[]>([]);
-  const {
-    result,
-    isLoading,
-    error,
-    translate,
-    reset,
-    mode,
-    setMode,
-    isWebGPUAvailable,
-    modelStatus,
-    loadProgress,
-    loadProgressText,
-    inferenceSource,
-  } = useInference();
-  const sandbox = useSandbox();
+  const [mode, setMode] = useState<InferenceMode>("cloud");
+  const { result, isLoading, error, browserStatus, translate, reset } =
+    useTranslate(mode);
+  const sandbox = useWebContainer();
 
   const handleSubmit = useCallback(() => {
     const trimmed = input.trim();
@@ -64,7 +51,7 @@ export function ShellSession() {
       setLastQuery(text.trim());
       translate(text);
     },
-    [translate]
+    [translate],
   );
 
   const handleExampleSelect = useCallback(
@@ -73,15 +60,22 @@ export function ShellSession() {
       setLastQuery(example.trim());
       translate(example);
     },
-    [translate]
+    [translate],
   );
 
   const handleClear = useCallback(() => {
     if (result && lastQuery) {
-      setHistory((prev) => [
-        { query: lastQuery, command: result.command, meta: result.meta, timestamp: Date.now() },
-        ...prev,
-      ].slice(0, 20));
+      setHistory((prev) =>
+        [
+          {
+            query: lastQuery,
+            command: result.command,
+            meta: result.meta,
+            timestamp: Date.now(),
+          },
+          ...prev,
+        ].slice(0, 20),
+      );
     }
     setInput("");
     setLastQuery("");
@@ -95,18 +89,46 @@ export function ShellSession() {
     }
   };
 
+  const showBrowserProgress =
+    isLoading &&
+    mode !== "cloud" &&
+    (browserStatus.stage === "downloading" ||
+      browserStatus.stage === "loading");
+
   return (
     <div className="space-y-6">
       {/* Input card */}
       <Card className="glass-card border-border/40">
         <CardContent className="space-y-4">
           <div className="space-y-2">
-            <label
-              htmlFor="nl-input"
-              className="text-sm font-medium text-foreground/80"
-            >
-              Describe what you want to do
-            </label>
+            <div className="flex items-center justify-between">
+              <label
+                htmlFor="nl-input"
+                className="text-sm font-medium text-foreground/80"
+              >
+                Describe what you want to do
+              </label>
+
+              {/* Mode selector */}
+              <div className="flex items-center gap-0.5 p-0.5 rounded-lg bg-background/50 border border-border/30">
+                {MODES.map(({ value, label, icon: Icon }) => (
+                  <button
+                    key={value}
+                    onClick={() => setMode(value)}
+                    className={`flex items-center gap-1.5 px-2.5 py-1 text-[11px] font-mono rounded-md transition-colors ${
+                      mode === value
+                        ? "bg-[#2ea44f]/15 text-[#2ea44f]"
+                        : "text-muted-foreground/50 hover:text-muted-foreground/80"
+                    }`}
+                    aria-pressed={mode === value}
+                  >
+                    <Icon className="size-3" />
+                    {label}
+                  </button>
+                ))}
+              </div>
+            </div>
+
             <Textarea
               id="nl-input"
               value={input}
@@ -117,6 +139,32 @@ export function ShellSession() {
               className="resize-none bg-background/50 border-border/40 focus:border-primary/50 transition-colors"
               disabled={isLoading}
             />
+
+            {/* Browser engine status */}
+            {showBrowserProgress && (
+              <div className="flex items-center gap-2 text-[11px] font-mono text-muted-foreground/60">
+                <Loader2 className="size-3 animate-spin" />
+                {browserStatus.stage === "downloading" ? (
+                  <div className="flex items-center gap-2 flex-1">
+                    <span>Downloading model...</span>
+                    {typeof browserStatus.progress === "number" && (
+                      <>
+                        <div className="flex-1 max-w-[200px] h-1 bg-border/30 rounded-full overflow-hidden">
+                          <div
+                            className="h-full bg-[#2ea44f] rounded-full transition-all duration-300"
+                            style={{ width: `${browserStatus.progress}%` }}
+                          />
+                        </div>
+                        <span>{Math.round(browserStatus.progress)}%</span>
+                      </>
+                    )}
+                  </div>
+                ) : (
+                  <span>Initializing model...</span>
+                )}
+              </div>
+            )}
+
             <p className="text-[11px] text-muted-foreground/40">
               Press Enter to submit, Shift+Enter for newline
             </p>
@@ -153,15 +201,10 @@ export function ShellSession() {
               )}
             </div>
 
-            <div className="flex items-center gap-2">
-              <InferenceModeSelector
-                mode={mode}
-                onModeChange={setMode}
-                isWebGPUAvailable={isWebGPUAvailable}
-                modelStatus={modelStatus}
-              />
-              <VoiceInput onTranscript={handleVoiceTranscript} disabled={isLoading} />
-            </div>
+            <VoiceInput
+              onTranscript={handleVoiceTranscript}
+              disabled={isLoading}
+            />
           </div>
         </CardContent>
       </Card>
@@ -171,20 +214,17 @@ export function ShellSession() {
         <Card className="border-destructive/30 bg-destructive/5" role="alert">
           <CardContent>
             <div className="flex items-center gap-2">
-              <span className="text-destructive text-sm" aria-hidden="true">&#9888;</span>
+              <span className="text-destructive text-sm" aria-hidden="true">
+                &#9888;
+              </span>
               <p className="text-sm text-destructive/90">{error}</p>
             </div>
           </CardContent>
         </Card>
       )}
 
-      {/* Model loading progress */}
-      {modelStatus === "loading" && (
-        <ModelLoader progress={loadProgress} progressText={loadProgressText} />
-      )}
-
       {/* Loading state */}
-      {isLoading && <AILoader />}
+      {isLoading && !showBrowserProgress && <AILoader />}
 
       {/* Command output */}
       {result && (
@@ -195,13 +235,26 @@ export function ShellSession() {
           onExecute={sandbox.execute}
           isExecuting={sandbox.isExecuting}
           sandboxEnabled={SANDBOX_ENABLED}
-          inferenceSource={inferenceSource}
         />
       )}
 
+      {/* Sandbox execution status */}
+      {sandbox.isExecuting && !sandbox.output && (
+        <div className="flex items-center gap-2 px-4 py-2 text-[11px] font-mono text-muted-foreground/60">
+          <Loader2 className="size-3 animate-spin" />
+          <span>
+            {sandbox.isBooting ? "Booting sandbox..." : "Executing..."}
+          </span>
+        </div>
+      )}
+
       {/* Sandbox execution output */}
       {sandbox.output && result && (
-        <ExecutionOutput result={sandbox.output} command={result.command} />
+        <ExecutionOutput
+          result={sandbox.output}
+          command={result.command}
+          history={sandbox.history.slice(0, -1)}
+        />
       )}
 
       {/* Sandbox error */}
@@ -209,12 +262,18 @@ export function ShellSession() {
         <Card className="border-yellow-500/30 bg-yellow-500/5" role="alert">
           <CardContent>
             <div className="flex items-start gap-2">
-              <span className="text-yellow-500 text-sm mt-0.5" aria-hidden="true">&#9888;</span>
+              <span
+                className="text-yellow-500 text-sm mt-0.5"
+                aria-hidden="true"
+              >
+                &#9888;
+              </span>
               <div>
                 <p className="text-sm text-yellow-500/90">{sandbox.error}</p>
                 {result && (
                   <p className="text-xs text-muted-foreground/50 mt-1">
-                    You can copy the command above and run it in your own terminal.
+                    You can copy the command above and run it in your own
+                    terminal.
                   </p>
                 )}
               </div>
@@ -223,7 +282,7 @@ export function ShellSession() {
         </Card>
       )}
 
-      {/* History */}
+      {/* Translation history */}
       {history.length > 0 && (
         <div className="space-y-3">
           <div className="flex items-center gap-3">
@@ -255,4 +314,3 @@ export function ShellSession() {
     </div>
   );
 }
-
diff --git a/hooks/use-inference.ts b/hooks/use-inference.ts
deleted file mode 100644
index cf68f0c..0000000
--- a/hooks/use-inference.ts
+++ /dev/null
@@ -1,69 +0,0 @@
-"use client";
-
-import { useEffect, useState } from "react";
-import { useTranslate } from "@/hooks/use-translate";
-import { useLocalInference } from "@/hooks/use-local-inference";
-
-export type InferenceMode = "cloud" | "browser" | "auto";
-
-export function useInference() {
-  const remote = useTranslate();
-  const local = useLocalInference();
-  const [mode, setMode] = useState<InferenceMode>("cloud");
-
-  // When user switches to browser mode and model isn't ready, start loading
-  useEffect(() => {
-    if (
-      (mode === "browser" || mode === "auto") &&
-      local.isWebGPUAvailable &&
-      local.modelStatus === "idle"
-    ) {
-      local.loadModel();
-    }
-  }, [mode, local, local.isWebGPUAvailable, local.modelStatus, local.loadModel]);
-
-  const useBrowser =
-    mode === "browser" ||
-    (mode === "auto" && local.modelStatus === "ready");
-
-  const translate = async (query: string) => {
-    if (useBrowser && local.modelStatus === "ready") {
-      return local.translate(query);
-    }
-    return remote.translate(query);
-  };
-
-  const reset = () => {
-    remote.reset();
-    local.reset();
-  };
-
-  const result = useBrowser ? local.result : remote.result;
-  const isLoading = useBrowser ? local.isLoading : remote.isLoading;
-  const error = useBrowser ? local.error : remote.error;
-  const inferenceSource: "cloud" | "browser" = useBrowser ? "browser" : "cloud";
-
-  return {
-    // State
-    result,
-    isLoading,
-    error,
-    inferenceSource,
-
-    // Mode
-    mode,
-    setMode,
-
-    // Local model state
-    isWebGPUAvailable: local.isWebGPUAvailable,
-    modelStatus: local.modelStatus,
-    loadProgress: local.loadProgress,
-    loadProgressText: local.loadProgressText,
-    loadModel: local.loadModel,
-    unloadModel: local.unloadModel,
-
-    // Actions
-    translate,
-    reset,
-  };
-}
diff --git a/hooks/use-local-inference.ts b/hooks/use-local-inference.ts
deleted file mode 100644
index 146e63d..0000000
--- a/hooks/use-local-inference.ts
+++ /dev/null
@@ -1,177 +0,0 @@
-"use client";
-
-import { useCallback, useEffect, useRef, useState } from "react";
-
-type ProgressReport = { progress: number; text: string };
-
-interface TranslateResult {
-  command: string;
-  meta: string;
-}
-
-export type ModelStatus = "idle" | "loading" | "ready" | "error";
-
-interface LocalInferenceState {
-  isWebGPUAvailable: boolean;
-  modelStatus: ModelStatus;
-  loadProgress: number;
-  loadProgressText: string;
-  result: TranslateResult | null;
-  isLoading: boolean;
-  error: string | null;
-}
-
-export function useLocalInference() {
-  const [state, setState] = useState<LocalInferenceState>({
-    isWebGPUAvailable: false,
-    modelStatus: "idle",
-    loadProgress: 0,
-    loadProgressText: "",
-    result: null,
-    isLoading: false,
-    error: null,
-  });
-
-  // Detect WebGPU after mount to avoid SSR hydration mismatch
-  useEffect(() => {
-    const available = "gpu" in navigator;
-    setState((s) => ({ ...s, isWebGPUAvailable: available }));
-  }, []);
-
-  // Keep engine module ref to avoid re-importing
-  const engineModRef = useRef<typeof import("@/lib/browser-engine") | null>(
-    null,
-  );
-
-  const getEngine = useCallback(async () => {
-    if (!engineModRef.current) {
-      engineModRef.current = await import("@/lib/browser-engine");
-    }
-    return engineModRef.current;
-  }, []);
-
-  const loadModel = useCallback(async () => {
-    setState((s) => ({
-      ...s,
-      modelStatus: "loading",
-      loadProgress: 0,
-      loadProgressText: "Initializing WebGPU...",
-      error: null,
-    }));
-
-    try {
-      const eng = await getEngine();
-
-      if (eng.isEngineReady()) {
-        setState((s) => ({
-          ...s,
-          modelStatus: "ready",
-          loadProgress: 100,
-          loadProgressText: "Ready",
-        }));
-        return;
-      }
-
-      await eng.initEngine((report: ProgressReport) => {
-        setState((s) => ({
-          ...s,
-          loadProgress: Math.round(report.progress * 100),
-          loadProgressText: report.text,
-        }));
-      });
-
-      setState((s) => ({
-        ...s,
-        modelStatus: "ready",
-        loadProgress: 100,
-        loadProgressText: "Ready",
-      }));
-    } catch (err) {
-      const raw = err instanceof Error ? err.message : String(err);
-      let message = raw;
-      if (raw.includes("Failed to fetch") || raw.includes("NetworkError")) {
-        message =
-          "Failed to download model. Check your connection or try again.";
-      } else if (
-        raw.includes("WebGPU") ||
-        raw.includes("adapter") ||
-        raw.includes("gpu")
-      ) {
-        message =
-          "WebGPU is not supported on this device. Try Chrome or Edge on a device with a GPU.";
-      }
-      setState((s) => ({
-        ...s,
-        modelStatus: "error",
-        error: message,
-        loadProgress: 0,
-        loadProgressText: "",
-      }));
-    }
-  }, [getEngine]);
-
-  const translate = useCallback(
-    async (query: string) => {
-      if (!query.trim()) return;
-
-      setState((s) => ({ ...s, result: null, isLoading: true, error: null }));
-
-      try {
-        const eng = await getEngine();
-        const { command, meta } = await eng.generate(query.trim());
-
-        if (!command) {
-          setState((s) => ({
-            ...s,
-            isLoading: false,
-            error: "Model returned empty response",
-          }));
-          return;
-        }
-
-        setState((s) => ({
-          ...s,
-          result: { command, meta },
-          isLoading: false,
-        }));
-      } catch (err) {
-        const message =
-          err instanceof Error ? err.message : "Inference failed";
-        setState((s) => ({
-          ...s,
-          isLoading: false,
-          error: message,
-        }));
-      }
-    },
-    [getEngine],
-  );
-
-  const reset = useCallback(() => {
-    setState((s) => ({
-      ...s,
-      result: null,
-      isLoading: false,
-      error: null,
-    }));
-  }, []);
-
-  const unloadModel = useCallback(async () => {
-    const eng = await getEngine();
-    await eng.unloadEngine();
-    setState((s) => ({
-      ...s,
-      modelStatus: "idle",
-      loadProgress: 0,
-      loadProgressText: "",
-    }));
-  }, [getEngine]);
-
-  return {
-    ...state,
-    loadModel,
-    translate,
-    reset,
-    unloadModel,
-  };
-}
diff --git a/hooks/use-translate.ts b/hooks/use-translate.ts
index 08d0ebf..e9860bb 100644
--- a/hooks/use-translate.ts
+++ b/hooks/use-translate.ts
@@ -1,6 +1,10 @@
 "use client";
 
-import { useCallback, useState } from "react";
+import { useCallback, useEffect, useState } from "react";
+import type { BrowserEngineStatus } from "@/lib/browser-engine";
+
+export type InferenceMode = "cloud" | "browser" | "auto";
+export type { BrowserEngineStatus };
 
 interface TranslateResult {
   command: string;
@@ -11,54 +15,94 @@ interface TranslateState {
   result: TranslateResult | null;
   isLoading: boolean;
   error: string | null;
+  browserStatus: BrowserEngineStatus;
+}
+
+async function translateViaCloud(query: string): Promise<TranslateResult> {
+  const res = await fetch("/api/translate", {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({ query: query.trim() }),
+  });
+  const data = await res.json();
+  if (!res.ok) throw new Error(data.error || "Translation failed");
+  return { command: data.command, meta: data.meta };
+}
+
+async function translateViaBrowser(query: string): Promise<TranslateResult> {
+  const engine = await import("@/lib/browser-engine");
+  const command = await engine.generate(query.trim());
+  if (!command)
+    throw new Error(
+      "Browser model returned an empty response. Try Cloud mode.",
+    );
+  return { command, meta: "Generated locally in your browser" };
 }
 
-export function useTranslate() {
+export function useTranslate(mode: InferenceMode = "cloud") {
   const [state, setState] = useState<TranslateState>({
     result: null,
     isLoading: false,
     error: null,
+    browserStatus: { stage: "idle" },
   });
 
-  const translate = useCallback(async (query: string) => {
-    if (!query.trim()) return;
+  useEffect(() => {
+    if (mode === "cloud") return;
 
-    setState({ result: null, isLoading: true, error: null });
+    const subscription = import("@/lib/browser-engine")
+      .then((engine) => {
+        setState((s) => ({ ...s, browserStatus: engine.getStatus() }));
+        return engine.onStatusChange((status) => {
+          setState((s) => ({ ...s, browserStatus: status }));
+        });
+      })
+      .catch(() => null); // engine failed to load; status stays as-is
+    // Unsubscribe even when cleanup runs before the dynamic import resolves.
+    return () => void subscription.then((unsubscribe) => unsubscribe?.());
+  }, [mode]);
+
+  const translate = useCallback(
+    async (query: string) => {
+      if (!query.trim()) return;
+      setState((s) => ({ ...s, result: null, isLoading: true, error: null }));
 
-    try {
-      const res = await fetch("/api/translate", {
-        method: "POST",
-        headers: { "Content-Type": "application/json" },
-        body: JSON.stringify({ query: query.trim() }),
-      });
+      try {
+        let result: TranslateResult;
 
-      const data = await res.json();
+        if (mode === "browser") {
+          result = await translateViaBrowser(query);
+        } else if (mode === "auto") {
+          const engine = await import("@/lib/browser-engine").catch(
+            () => null,
+          );
+          result = engine?.isReady()
+            ? await translateViaBrowser(query)
+            : await translateViaCloud(query);
+        } else {
+          result = await translateViaCloud(query);
+        }
 
-      if (!res.ok) {
-        setState({
+        setState((s) => ({ ...s, result, isLoading: false, error: null }));
+      } catch (err) {
+        const message =
+          err instanceof Error ? err.message : "Translation failed";
+        setState((s) => ({
+          ...s,
           result: null,
           isLoading: false,
-          error: data.error || "Translation failed",
-        });
-        return;
+          error:
+            err instanceof TypeError
+              ? "Network error. Please check your connection."
+              : message,
+        }));
       }
-
-      setState({
-        result: { command: data.command, meta: data.meta },
-        isLoading: false,
-        error: null,
-      });
-    } catch {
-      setState({
-        result: null,
-        isLoading: false,
-        error: "Network error. Please check your connection.",
-      });
-    }
-  }, []);
+    },
+    [mode],
+  );
 
   const reset = useCallback(() => {
-    setState({ result: null, isLoading: false, error: null });
+    setState((s) => ({ ...s, result: null, isLoading: false, error: null }));
   }, []);
 
   return { ...state, translate, reset };
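The routing inside `translate` reduces to a small pure decision, which is easier to reason about and test on its own. The sketch below mirrors the branches in this hunk under the assumption that `engine.isReady()` is the only signal used in auto mode. `chooseBackend` and `Backend` are hypothetical names that do not appear in the patch.

```typescript
type InferenceMode = "cloud" | "browser" | "auto";
type Backend = "cloud" | "browser";

// Hypothetical helper mirroring the branches in translate():
// - "browser" always uses the local engine,
// - "auto" uses the local engine only when it is already loaded,
// - everything else goes to the cloud endpoint.
function chooseBackend(mode: InferenceMode, engineReady: boolean): Backend {
  if (mode === "browser") return "browser";
  if (mode === "auto" && engineReady) return "browser";
  return "cloud";
}
```

One consequence worth noting: as far as this hook shows, auto mode never starts a model download by itself. It only prefers the browser engine once something else has loaded it.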
diff --git a/hooks/use-webcontainer.ts b/hooks/use-webcontainer.ts
new file mode 100644
index 0000000..3783403
--- /dev/null
+++ b/hooks/use-webcontainer.ts
@@ -0,0 +1,107 @@
+"use client";
+
+import { useCallback, useRef, useState } from "react";
+import type { ExecutionResult } from "@/types/sandbox";
+
+interface WebContainerState {
+  isReady: boolean;
+  isBooting: boolean;
+  isExecuting: boolean;
+  output: ExecutionResult | null;
+  error: string | null;
+  history: Array<{
+    command: string;
+    stdout: string;
+    exitCode: number;
+    timestamp: number;
+  }>;
+}
+
+export function useWebContainer() {
+  const [state, setState] = useState<WebContainerState>({
+    isReady: false,
+    isBooting: false,
+    isExecuting: false,
+    output: null,
+    error: null,
+    history: [],
+  });
+
+  const sandboxRef = useRef<typeof import("@/lib/webcontainer-sandbox") | null>(
+    null,
+  );
+
+  const getSandbox = useCallback(async () => {
+    if (!sandboxRef.current) {
+      sandboxRef.current = await import("@/lib/webcontainer-sandbox");
+    }
+    return sandboxRef.current;
+  }, []);
+
+  const boot = useCallback(async () => {
+    setState((s) => ({ ...s, isBooting: true, error: null }));
+    try {
+      const sb = await getSandbox();
+      await sb.bootSandbox();
+      setState((s) => ({ ...s, isReady: true, isBooting: false }));
+    } catch (err) {
+      setState((s) => ({
+        ...s,
+        isBooting: false,
+        error:
+          err instanceof Error ? err.message : "Failed to boot sandbox",
+      }));
+    }
+  }, [getSandbox]);
+
+  const execute = useCallback(
+    async (command: string) => {
+      setState((s) => ({ ...s, isExecuting: true, output: null, error: null }));
+      try {
+        const sb = await getSandbox();
+        if (!sb.isSandboxReady()) {
+          setState((s) => ({ ...s, isBooting: true }));
+          await sb.bootSandbox();
+          setState((s) => ({ ...s, isReady: true, isBooting: false }));
+        }
+
+        const result = await sb.execCommand(command);
+
+        const executionResult: ExecutionResult = {
+          stdout: result.stdout,
+          stderr: result.stderr,
+          exitCode: result.exitCode,
+          durationMs: result.durationMs,
+        };
+
+        setState((s) => ({
+          ...s,
+          isExecuting: false,
+          output: executionResult,
+          history: [
+            ...s.history,
+            {
+              command,
+              stdout: result.stdout,
+              exitCode: result.exitCode,
+              timestamp: Date.now(),
+            },
+          ],
+        }));
+      } catch (err) {
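+        // The lazy boot above may be what failed, so clear isBooting as well;
+        // otherwise the hook would keep reporting "booting" after the error.
+        const message = err instanceof Error ? err.message : "Execution failed";
+        const reset = { isBooting: false, isExecuting: false };
+        setState((s) => ({ ...s, ...reset, error: message }));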
+      }
+    },
+    [getSandbox],
+  );
+
+  const clearOutput = useCallback(() => {
+    setState((s) => ({ ...s, output: null, error: null }));
+  }, []);
+
+  return { ...state, boot, execute, clearOutput };
+}
diff --git a/lib/browser-engine.ts b/lib/browser-engine.ts
index 112e836..0c47150 100644
--- a/lib/browser-engine.ts
+++ b/lib/browser-engine.ts
@@ -1,107 +1,115 @@
 "use client";
 
-import type { ProgressInfo, TextGenerationPipeline } from "@huggingface/transformers";
+import { pipeline, TextGenerationPipeline } from "@huggingface/transformers";
 import { cleanResponse } from "@/lib/clean-response";
 
-// Use dynamic import to avoid SSR issues
-let pipelineInstance: TextGenerationPipeline | null = null;
-let loadingPromise: Promise<TextGenerationPipeline> | null = null;
+const MODEL_ID = "onnx-community/Qwen2.5-0.5B-Instruct";
 
-const MODEL_ID = "onnx-community/Qwen3.5-0.8B-ONNX";
+const SYSTEM_PROMPT = `You are NL2Shell, a tool that converts natural language to shell commands.
+Rules:
+- Output ONLY the shell command, nothing else
+- No explanations, no markdown, no code fences
+- If the request is ambiguous, pick the most common interpretation
+- Use standard Unix/Linux commands`;
 
-const SYSTEM_PROMPT = `/no_think
-You are nl2shell, a specialist that converts natural language to shell commands. Output ONLY the exact shell command. No explanations, no markdown, no backticks, no reasoning, no thinking. Just the command.`;
+export interface BrowserEngineStatus {
+  stage: "idle" | "downloading" | "loading" | "ready" | "generating" | "error";
+  progress?: number;
+  error?: string;
+}
 
-export type ProgressCallback = (progress: {
-  progress: number;
-  text: string;
-}) => void;
+type StatusCallback = (status: BrowserEngineStatus) => void;
 
-export function isEngineReady(): boolean {
+let pipelineInstance: TextGenerationPipeline | null = null;
+let loadPromise: Promise<TextGenerationPipeline> | null = null;
+let currentStatus: BrowserEngineStatus = { stage: "idle" };
+const listeners = new Set<StatusCallback>();
+
+function setStatus(status: BrowserEngineStatus) {
+  currentStatus = status;
+  for (const cb of listeners) cb(status);
+}
+
+export function getStatus(): BrowserEngineStatus {
+  return currentStatus;
+}
+
+export function isReady(): boolean {
   return pipelineInstance !== null;
 }
 
-export async function initEngine(onProgress?: ProgressCallback): Promise<void> {
-  if (pipelineInstance) return;
-  if (loadingPromise) {
-    await loadingPromise;
-    return;
-  }
+export function onStatusChange(cb: StatusCallback): () => void {
+  listeners.add(cb);
+  return () => listeners.delete(cb);
+}
 
-  loadingPromise = (async () => {
-    const { pipeline, env } = await import("@huggingface/transformers");
+async function loadPipeline(): Promise<TextGenerationPipeline> {
+  if (pipelineInstance) return pipelineInstance;
+  if (loadPromise) return loadPromise;
 
-    // Disable local model check (always fetch from HF Hub)
-    env.allowLocalModels = false;
+  loadPromise = (async () => {
+    setStatus({ stage: "downloading", progress: 0 });
 
-    const generator = await pipeline("text-generation", MODEL_ID, {
-      dtype: "q4",
+    const pipe = await pipeline("text-generation", MODEL_ID, {
+      dtype: "q4f16",
       device: "webgpu",
-      progress_callback: onProgress
-        ? (data: ProgressInfo) => {
-            if (data.status === "progress") {
-              onProgress({
-                progress: data.progress ?? 0,
-                text: data.file ? `Loading ${data.file}...` : "Initializing...",
-              });
-            }
-          }
-        : undefined,
+      progress_callback: (progress: { progress?: number; status?: string }) => {
+        if (progress.status === "progress" && typeof progress.progress === "number") {
+          setStatus({ stage: "downloading", progress: Math.round(progress.progress) });
+        }
+      },
     });
 
-    return generator;
+    setStatus({ stage: "loading" });
+    pipelineInstance = pipe as TextGenerationPipeline;
+    setStatus({ stage: "ready" });
+    return pipelineInstance;
   })();
 
   try {
-    pipelineInstance = await loadingPromise;
+    return await loadPromise;
   } catch (err) {
-    loadingPromise = null;
+    loadPromise = null;
+    const message = err instanceof Error ? err.message : "Failed to load model";
+    setStatus({ stage: "error", error: message });
     throw err;
   }
 }
 
-export async function generate(query: string): Promise<{
-  command: string;
-  meta: string;
-  durationMs: number;
-}> {
-  if (!pipelineInstance) {
-    throw new Error("Engine not initialized. Call initEngine() first.");
-  }
-
-  const start = performance.now();
+export async function generate(query: string): Promise<string> {
+  const pipe = await loadPipeline();
+  setStatus({ stage: "generating" });
 
   const messages = [
-    { role: "system" as const, content: SYSTEM_PROMPT },
-    { role: "user" as const, content: query },
+    { role: "system", content: SYSTEM_PROMPT },
+    { role: "user", content: query },
   ];
 
-  const result = await pipelineInstance(messages, {
+  const result = await pipe(messages, {
     max_new_tokens: 128,
     temperature: 0.1,
     do_sample: true,
     return_full_text: false,
   });
 
-  const durationMs = Math.round(performance.now() - start);
-
-  const output = Array.isArray(result) ? result[0] : result;
-  const generatedText = (output as { generated_text?: unknown })?.generated_text;
-  const raw =
-    Array.isArray(generatedText)
-      ? ((generatedText.at(-1) as { content?: string })?.content ?? "")
-      : (typeof generatedText === "string" ? generatedText : "");
-
-  const command = cleanResponse(raw);
-  const meta = `Browser (WebGPU) | Qwen3.5-0.8B | ${durationMs}ms`;
-
-  return { command, meta, durationMs };
-}
-
-export async function unloadEngine(): Promise<void> {
-  if (pipelineInstance) {
-    await pipelineInstance.dispose();
-    pipelineInstance = null;
-    loadingPromise = null;
+  setStatus({ stage: "ready" });
+
+  // Transformers.js text-generation with chat messages returns:
+  // [{ generated_text: [{ role, content }, ...] }]  (chat mode)
+  // OR [{ generated_text: "string" }]                (plain mode)
+  const output = result[0]?.generated_text;
+
+  let raw: string;
+  if (Array.isArray(output)) {
+    // Chat mode: find the assistant's reply (last message)
+    const lastMsg = output.at(-1);
+    raw =
+      typeof lastMsg === "object" && lastMsg !== null && "content" in lastMsg
+        ? String(lastMsg.content)
+        : "";
+  } else {
+    raw = typeof output === "string" ? output : "";
   }
+
+  return cleanResponse(raw);
 }
diff --git a/lib/clean-response.ts b/lib/clean-response.ts
index 12be6c8..81bee4d 100644
--- a/lib/clean-response.ts
+++ b/lib/clean-response.ts
@@ -1,10 +1,12 @@
 export function cleanResponse(text: string): string {
   let cleaned = text.trim();
 
-  // Remove <think>...</think> reasoning blocks (Qwen3 thinking mode)
-  cleaned = cleaned.replace(/<think>[\s\S]*?<\/think>\s*/g, "");
+  // Strip <think>...</think> blocks (Qwen reasoning tags)
+  // Global and unanchored: a closed block left behind would be cut, with everything after it, by the unclosed-block pass below
+  cleaned = cleaned.replace(/<think>[\s\S]*?<\/think>\s*/g, "");
   // Remove unclosed <think> blocks (model truncated mid-thought)
   cleaned = cleaned.replace(/<think>[\s\S]*/g, "");
+  cleaned = cleaned.trim();
 
   // Remove markdown code fences (```bash ... ```)
   cleaned = cleaned.replace(/^```(?:bash|sh|shell|zsh)?\n?/, "");
diff --git a/lib/webcontainer-sandbox.ts b/lib/webcontainer-sandbox.ts
new file mode 100644
index 0000000..adfa0ce
--- /dev/null
+++ b/lib/webcontainer-sandbox.ts
@@ -0,0 +1,67 @@
+"use client";
+
+import { WebContainer } from "@webcontainer/api";
+
+let container: WebContainer | null = null;
+let bootPromise: Promise<WebContainer> | null = null;
+
+export async function bootSandbox(): Promise<void> {
+  if (container) return;
+  if (bootPromise) {
+    await bootPromise;
+    return;
+  }
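+  // Chain mount onto boot so concurrent callers also wait for the workspace.
+  bootPromise = WebContainer.boot().then(async (wc) => {
+    await wc.mount({ workspace: { directory: {} } });
+    return wc;
+  });
+  try {
+    container = await bootPromise;
+  } catch (err) {
+    bootPromise = null;
+    throw err;
+  }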
+}
+
+export function isSandboxReady(): boolean {
+  return container !== null;
+}
+
+export interface ExecResult {
+  stdout: string;
+  stderr: string;
+  exitCode: number;
+  durationMs: number;
+}
+
+export async function execCommand(command: string): Promise<ExecResult> {
+  if (!container) throw new Error("Sandbox not booted");
+
+  const start = performance.now();
+  const process = await container.spawn("sh", ["-c", command], {
+    cwd: "/workspace",
+  });
+
+  let stdout = "";
+  // Hold the pipe promise so trailing output is flushed before returning.
+  const drained = process.output.pipeTo(
+    new WritableStream({
+      write(chunk) {
+        stdout += chunk;
+      },
+    }),
+  );
+  const exitCode = await process.exit;
+  await drained;
+  const durationMs = Math.round(performance.now() - start);
+  return { stdout: stdout.trim(), stderr: "", exitCode, durationMs };
+}
+
+export async function teardownSandbox(): Promise<void> {
+  if (container) {
+    container.teardown();
+    container = null;
+    bootPromise = null;
+  }
+}
diff --git a/next.config.ts b/next.config.ts
index dd202b7..429dc3f 100644
--- a/next.config.ts
+++ b/next.config.ts
@@ -1,5 +1,17 @@
 import type { NextConfig } from "next";
 
+const csp = [
+  "default-src 'self'",
+  "script-src 'self' 'unsafe-inline' 'unsafe-eval'",
+  "style-src 'self' 'unsafe-inline'",
+  "img-src 'self' data: blob:",
+  "font-src 'self' data:",
+  "connect-src 'self' https://huggingface.co https://*.huggingface.co https://*.webcontainer-api.io https://*.stackblitz.io https://va.vercel-scripts.com https://vitals.vercel-insights.com",
+  "worker-src 'self' blob:",
+  "child-src 'self' blob:",
+  "frame-ancestors 'none'",
+].join("; ");
+
 const nextConfig: NextConfig = {
   serverExternalPackages: ["@huggingface/transformers"],
   async headers() {
@@ -15,9 +27,17 @@ const nextConfig: NextConfig = {
             key: "Permissions-Policy",
             value: "camera=(), geolocation=(), microphone=(self)",
           },
+          {
+            key: "Cross-Origin-Embedder-Policy",
+            value: "credentialless",
+          },
+          {
+            key: "Cross-Origin-Opener-Policy",
+            value: "same-origin",
+          },
           {
             key: "Content-Security-Policy",
-            value: "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' 'wasm-unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:; font-src 'self' data:; connect-src 'self' https://*.huggingface.co https://huggingface.co https://raw.githubusercontent.com; worker-src 'self' blob:; frame-ancestors 'none';",
+            value: csp,
           },
         ],
       },
diff --git a/package-lock.json b/package-lock.json
index 1d0d060..f4d26b4 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -14,6 +14,7 @@
         "@supabase/supabase-js": "^2.99.1",
         "@vercel/analytics": "^2.0.0",
         "@vercel/speed-insights": "^2.0.0",
+        "@webcontainer/api": "^1.6.1",
         "class-variance-authority": "^0.7.1",
         "clsx": "^2.1.1",
         "lucide-react": "^1.7.0",
@@ -6635,6 +6636,12 @@
       "license": "MIT",
       "peer": true
     },
+    "node_modules/@webcontainer/api": {
+      "version": "1.6.1",
+      "resolved": "https://registry.npmjs.org/@webcontainer/api/-/api-1.6.1.tgz",
+      "integrity": "sha512-2RS2KiIw32BY1Icf6M1DvqSmcon9XICZCDgS29QJb2NmF12ZY2V5Ia+949hMKB3Wno+P/Y8W+sPP59PZeXSELg==",
+      "license": "MIT"
+    },
     "node_modules/abbrev": {
       "version": "3.0.1",
       "license": "ISC",
diff --git a/package.json b/package.json
index d84fc23..cc85b5c 100644
--- a/package.json
+++ b/package.json
@@ -15,6 +15,7 @@
     "@supabase/supabase-js": "^2.99.1",
     "@vercel/analytics": "^2.0.0",
     "@vercel/speed-insights": "^2.0.0",
+    "@webcontainer/api": "^1.6.1",
     "class-variance-authority": "^0.7.1",
     "clsx": "^2.1.1",
     "lucide-react": "^1.7.0",
diff --git a/tsconfig.json b/tsconfig.json
index 3a13f90..fdda534 100644
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -30,5 +30,5 @@
     ".next/dev/types/**/*.ts",
     "**/*.mts"
   ],
-  "exclude": ["node_modules"]
+  "exclude": ["node_modules", "relay"]
 }
diff --git a/types/sandbox.d.ts b/types/sandbox.d.ts
index 8c4fd74..d02636a 100644
--- a/types/sandbox.d.ts
+++ b/types/sandbox.d.ts
@@ -10,7 +10,7 @@ export interface ExecutionResult {
   stderr: string;
   exitCode: number;
   durationMs: number;
-  auditId: string;
+  auditId?: string;
 }
 
 export interface ExecRequest {