Add Create, Control, Observe introduction guides#370

Open
ehfeng wants to merge 7 commits into main from hypeship/intro-create-control-observe

@ehfeng ehfeng commented May 20, 2026

Summary

Adds three high-level concept guides under the Introduction group to orient new users around Kernel's core surface area:

  • Create (introduction/create.mdx) — spinning up a browser, picking the right shape (headless, stealth, GPU, profiles), and lifecycle.
  • Control (introduction/control.mdx) — the two ways to drive a browser: computer controls (model-native, vision loops) vs Playwright execution (DOM, full-page screenshots). Includes a when-to-use-which table and the recommended pattern of using both together.
  • Observe (introduction/observe.mdx) — live view, replays, screenshots, and invocation logs, framed by what each is good for.

docs.json is updated to rename the first nav group from home to Introduction and add the three new pages alongside index.

Preview

Once Mintlify builds: https://tbd-6fc993ce-hypeship-intro-create-control-observe.mintlify.app/introduction/create (and /control, /observe).

Test plan

  • mintlify dev renders the three new pages
  • All cross-page links resolve (anchors verified against current H2/H3s)
  • Sidebar shows "Introduction" group with index + three new pages in order
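
The anchor check in the test plan could be scripted along these lines. This is only a sketch: `slugify`, `headingSlugs`, and `missingAnchors` are hypothetical helpers, not part of Mintlify or this PR, and the slug rule (lowercase, drop punctuation, spaces to hyphens) is an assumption about how heading anchors are generated.

```typescript
// Hypothetical helpers for illustration; not part of Mintlify or this PR.
// ASSUMPTION: anchors are the heading text lowercased, with punctuation
// dropped and whitespace runs replaced by single hyphens.
function slugify(heading: string): string {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "")
    .replace(/\s+/g, "-");
}

// Collect the slugs of every H2/H3 heading in an MDX source string.
function headingSlugs(mdx: string): Set<string> {
  const slugs = new Set<string>();
  for (const line of mdx.split("\n")) {
    const match = /^#{2,3}\s+(.+)$/.exec(line);
    if (match) slugs.add(slugify(match[1]));
  }
  return slugs;
}

// Return the anchors from `links` (e.g. "/browsers/live-view#kiosk-mode")
// that have no matching heading slug in the target page source.
// Links without a "#" fragment are ignored.
function missingAnchors(links: string[], targetMdx: string): string[] {
  const slugs = headingSlugs(targetMdx);
  return links
    .map((link) => link.split("#")[1])
    .filter((anchor): anchor is string => anchor !== undefined && !slugs.has(anchor));
}
```

Note that this simple line scan would also pick up `##` lines inside fenced code blocks, so a real check may need to skip those.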

Note

Low Risk
Low risk documentation-only change that reorganizes onboarding content and updates internal links/redirects; primary risk is broken links or navigation regressions if any references were missed.

Overview
Adds a new Introduction docs section with three concept guides: introduction/create, introduction/control, and introduction/observe, and updates the landing page to point users to these new starting points.

Reorganizes navigation in docs.json, removes the legacy browsers/create-a-browser.mdx page, and adds a redirect from /browsers/create-a-browser to /introduction/create. Updates cross-doc links (auth FAQ, integrations, scaling/pools, changelog) and clarifies browser lifecycle wording around standby/timeout semantics in browsers/standby and browsers/termination.

Reviewed by Cursor Bugbot for commit 81d0027. Bugbot is set up for automated code reviews on this repo. Configure here.

Introduce three high-level concept guides under the Introduction
group that orient new users around the core surface area: creating
browsers, driving them (computer controls vs playwright execution),
and observing them (live view, replays, screenshots, logs).

mintlify Bot commented May 20, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| Kernel | 🟢 Ready | View Preview | May 20, 2026, 7:36 PM |

cursor Bot approved these changes May 20, 2026

@cursor cursor Bot left a comment

Stale comment

Risk assessment: Very Low

The current diff is limited to documentation and navigation: new introduction/create, introduction/control, and introduction/observe MDX pages, a small docs.json sidebar update, and wording clarifications in browsers/standby.mdx and browsers/termination.mdx. I found no production codepaths, infrastructure, auth/security, data, prompt/instruction files, or shared runtime behavior changes.

No CODEOWNERS file was present in the repo. The PR is already approved at the current head commit, so I’m not submitting another approval.

Sent by Cursor Automation: Assign PR reviewers

@cursor cursor Bot left a comment

Stale comment

Risk assessment: Very Low

The current diff is limited to documentation and Mintlify navigation: three new introduction/* MDX guides, a redirect/sidebar update in docs.json, link retargeting from the removed browsers/create-a-browser.mdx page, and small wording clarifications for standby/termination behavior. The .docs/code-example-guide.md change only updates a reference to the moved guide path.

I found no production codepaths, shared runtime behavior, infrastructure, auth/security, data/schema changes, or non-trivial prompt/instruction changes. No CODEOWNERS file was present. GitHub already reports this PR as approved, so I’m not submitting another approval.

@masnwilliams masnwilliams marked this pull request as ready for review May 21, 2026 04:02
@firetiger-agent

Firetiger deploy monitoring skipped

This PR didn't match the auto-monitor filter configured on your GitHub connection:

Any PR that changes the kernel API. Monitor changes to API endpoints (packages/api/cmd/api/) and Temporal workflows (packages/api/lib/temporal) in the kernel repo

Reason: PR only adds documentation guides and updates docs.json; no changes to API endpoints or Temporal workflows.

To monitor this PR anyway, reply with @firetiger monitor this.

@cursor cursor Bot left a comment

Stale comment

Risk assessment: Very Low

The current diff is limited to Mintlify documentation changes: new introduction/create, introduction/control, and introduction/observe guides, removal of the old browsers/create-a-browser page with a redirect to introduction/create, navigation updates in docs.json, link retargeting, and small wording clarifications for standby/termination behavior.

I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The only .docs change updates a reference path in the code example guide, which is a trivial documentation-maintenance edit. No CODEOWNERS file was present.

GitHub already reports this PR as approved, so I’m leaving this assessment as a comment and not submitting another approval.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 4e62b81. Configure here.

Comment thread introduction/control.mdx Outdated

<Tabs>
<Tab title="Computer Controls">
OS-level mouse, keyboard, and screen — the primitives a computer-use model already speaks (screenshot, click, type, key, scroll, drag). No CDP or WebDriver connection required, so there's no protocol fingerprint to leak. Ideal for [Claude](/integrations/computer-use/anthropic), [OpenAI](/integrations/computer-use/openai), or [Gemini](/integrations/computer-use/gemini) computer-use loops.

Third-party AI product names used instead of generic terms

Low Severity

The new introduction/control.mdx page names specific third-party AI products — "Claude", "OpenAI", "Gemini", and "Anthropic" — directly in the documentation text. The project rule requires vendor-neutral, generic terms (e.g. "computer-use models" or "your preferred model provider") instead of specific product names. This appears on line 10 ("Ideal for [Claude]… [OpenAI]… or [Gemini] computer-use loops") and line 163 ("drop-in examples for Anthropic, Gemini, OpenAI, and more").

Triggered by learned rule: Use generic terms instead of specific third-party AI product names

@cursor cursor Bot left a comment

Stale comment

Risk assessment: Very Low

I re-evaluated the current head diff (c5fd931). The changes remain limited to Mintlify documentation and navigation: new introduction/create, introduction/control, and introduction/observe guides, removal of the old browsers/create-a-browser page with a redirect, link retargeting, and small wording clarifications around standby/termination and “computer use.”

I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The .docs/code-example-guide.md edit is only a reference-path update. I also found no CODEOWNERS file in the repo.

GitHub already reports this PR as approved, and this update does not increase the risk, so I’m leaving this as a comment and not submitting another approval.

Comment thread introduction/create.mdx Outdated
description: "Spin up a cloud browser for your agent"
---

Kernel browsers are sandboxed Chromium instances that boot in under 30ms. Your agent creates them on demand, drives them, and tears them down — no infra to provision, no warm pool to manage.

Suggested change
Kernel browsers are sandboxed Chromium instances that boot in under 30ms. Your agent creates them on demand, drives them, and tears them down — no infra to provision, no warm pool to manage.
Kernel browsers are sandboxed Chromium instances that boot in under 30ms. Your agent creates them on demand, drives them, and tears them down — no infra to provision, no servers to run.

Comment thread introduction/create.mdx Outdated

<Columns cols={2}>
<Card title="Headless vs headful" href="/browsers/headless">
Headful (the default) supports live view and replays. Headless is lighter (1 GB vs 8 GB) and faster — good for short-lived or highly concurrent jobs.

@dprevoznik dprevoznik May 21, 2026

Suggested change
Headful (the default) supports live view and replays. Headless is lighter (1 GB vs 8 GB) and faster — good for short-lived or highly concurrent jobs.
Headful (default) supports live view, replays, and better stealth, making it ideal for agent workflows on bot-detected sites. Headless is lighter (1 GB vs 8 GB) and good for simple scraping.

Comment thread introduction/create.mdx Outdated

## Lifecycle

A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. After the configurable timeout (60s by default) it's deleted.

Suggested change
A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. After the configurable timeout (60s by default) it's deleted.
A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. Once in standby, after the configurable timeout (60s by default) elapses it's deleted.
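
To make the timing in this wording concrete, here is a minimal sketch of the lifecycle as a pure function. It assumes the reading in the suggested text: the timeout is counted from the moment the browser enters standby, and nothing reconnects in the meantime. The names `BrowserState`, `STANDBY_AFTER_MS`, `DEFAULT_TIMEOUT_MS`, and `stateAfterIdle` are illustrative only and are not part of the Kernel SDK.

```typescript
// Illustrative model only; these names are not part of the Kernel SDK.
type BrowserState = "active" | "standby" | "deleted";

// Values taken from the paragraph above.
const STANDBY_AFTER_MS = 5000;    // idle time before the browser enters standby
const DEFAULT_TIMEOUT_MS = 60000; // configurable timeout (60s by default)

// ASSUMPTION: the timeout starts counting once the browser is in standby,
// and no client, Live View viewer, or computer controls request reconnects.
function stateAfterIdle(
  idleMs: number,
  timeoutMs: number = DEFAULT_TIMEOUT_MS
): BrowserState {
  if (idleMs < STANDBY_AFTER_MS) return "active";
  if (idleMs < STANDBY_AFTER_MS + timeoutMs) return "standby";
  return "deleted";
}
```

Under this reading, a browser with the default timeout is deleted 65 seconds after the last driver disconnects, not 60.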

Comment thread introduction/create.mdx Outdated

A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. After the configurable timeout (60s by default) it's deleted.

You can also delete a browser explicitly when you're done:

Suggested change
You can also delete a browser explicitly when you're done:
We recommend you delete a browser explicitly when you're done with it:

Comment thread introduction/create.mdx Outdated
Putting it together — create, connect over CDP, do work, tear down:

<Info>
Kernel browsers launch with a default context and page. Make sure to access the existing context and page (`contexts()[0]` and `pages()[0]`) rather than creating a new one.

I feel like one of the goals of this PR is also to promote the Playwright execution API over a direct Playwright connection. I think it's worth showing that example here instead of a direct Playwright connection. Thoughts?

Comment thread introduction/control.mdx Outdated
description: "Drive the browser with computer use, CDP, or WebDriver BiDi"
---

Kernel browsers expose three control primitives. For agents, we recommend [computer use](/browsers/computer-controls) — the primitives match how computer-use models were trained to drive a computer, and they sidestep the bot-detection surface that CDP introduces.

@dprevoznik dprevoznik May 21, 2026

What do we think of adding a fourth tab for the Playwright Execution API here as well, in second position?

Comment thread introduction/control.mdx Outdated

Kernel's computer controls are built to match how computer-use models were trained — the same primitives the model emits (screenshot, click at coords, type, key, scroll, drag) map 1:1 onto the API. There's no harness translating model output into framework calls.

- **Native fit.** Screenshot, click, type, key, scroll, drag — the primitives the model already speaks. Kernel uses these same controls in its own [managed auth](/auth/overview) agent.

Suggested change
- **Native fit.** Screenshot, click, type, key, scroll, drag — the primitives the model already speaks. Kernel uses these same controls in its own [managed auth](/auth/overview) agent.
- **Native fit.** Screenshot, click, type, key, scroll, drag — the primitives the model already speaks.

Comment thread introduction/control.mdx Outdated

The two things computer controls don't do natively: read the DOM, and take a full-page screenshot. The recommended pattern for agents is **computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool** when the agent needs structured data.

Playwright execution runs arbitrary Playwright code in a fresh context inside the browser's VM. Your agent can call it as a tool whenever it needs structured DOM data or a full-page capture, then go right back to driving with computer controls. It ships with [Patchright](/browsers/bot-detection/stealth) by default, so DOM-side calls are hardened against bot detection too.

@dprevoznik dprevoznik May 21, 2026

Suggested change
Playwright execution runs arbitrary Playwright code in a fresh context inside the browser's VM. Your agent can call it as a tool whenever it needs structured DOM data or a full-page capture, then go right back to driving with computer controls. It ships with [Patchright](/browsers/bot-detection/stealth) by default, so DOM-side calls are hardened against bot detection too.
Playwright execution runs arbitrary Playwright code in a fresh context inside the browser's VM. Your agent can call it as a tool whenever it needs structured DOM data, then go right back to driving with computer controls.

Comment thread introduction/control.mdx Outdated

## Computer use + Playwright execution

The two things computer controls don't do natively: read the DOM, and take a full-page screenshot. The recommended pattern for agents is **computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool** when the agent needs structured data.

Suggested change
The two things computer controls don't do natively: read the DOM, and take a full-page screenshot. The recommended pattern for agents is **computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool** when the agent needs structured data.
The one thing computer controls don't do natively: read the DOM. The recommended pattern for agents is computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool when the agent needs structured data.

Comment thread introduction/control.mdx Outdated
@@ -0,0 +1,163 @@
---
title: "Control"
description: "Drive the browser with computer use, CDP, or WebDriver BiDi"

Do we feel that the two sections below flow well enough with the three options presented? They are only relevant to the computer use choice; if you flip to the CDP or BiDi tab up top, they no longer apply.

Comment thread introduction/observe.mdx Outdated
```
</CodeGroup>

Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browsers/live-view#kiosk-mode) at creation for a fullscreen, chromeless experience. Full reference: [Live View](/browsers/live-view).

Suggested change
Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browsers/live-view#kiosk-mode) at creation for a fullscreen, chromeless experience. Full reference: [Live View](/browsers/live-view).
Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browsers/live-view#kiosk-mode) at creation for a fullscreen, cinematic experience. Full reference: [Live View](/browsers/live-view).

Comment thread introduction/observe.mdx Outdated

## Replays

Replays are MP4 recordings you start and stop on demand. They're the right tool for post-hoc debugging: a failed run produces a video you can scrub through, share, or attach to a bug report.

Suggested change
Replays are MP4 recordings you start and stop on demand. They're the right tool for post-hoc debugging: a failed run produces a video you can scrub through, share, or attach to a bug report.
Replays are MP4 recordings you start and stop on demand — capture as many clips per session as you need. They're the right tool for post-hoc debugging: a failed run gives you one or more videos to scrub through, share, or attach to a bug report.

Comment thread introduction/observe.mdx
## Replays

Replays are MP4 recordings you start and stop on demand. They're the right tool for post-hoc debugging: a failed run produces a video you can scrub through, share, or attach to a bug report.

@dprevoznik dprevoznik May 21, 2026

Suggested change
Replays can also be enabled on managed auth sessions, so you can [debug failed logins](https://www.kernel.sh/docs/auth/configuration#record-sessions-for-debugging) the same way.

@dprevoznik dprevoznik left a comment

Left some suggestions and a few open-ended questions

@cursor cursor Bot left a comment

Risk assessment: Very Low

I re-evaluated the current diff at head 296de55. The changes are limited to Mintlify documentation and navigation: new introduction/create, introduction/control, and introduction/observe guides, removal of browsers/create-a-browser with a redirect to introduction/create, sidebar/card/link retargeting, and small wording updates around standby and termination behavior.

I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The .docs/code-example-guide.md edit is a trivial reference-path update, not a substantive prompt/instruction change. I also found no CODEOWNERS file in the repo.

GitHub already reports this PR as approved, and the new diff does not increase risk, so I’m leaving this as a comment and not submitting another approval.

Add a Playwright Execution tab on Control (position 2, between
Computer Use and CDP) and a parallel "Why playwright execution over a
direct CDP connection" section. Swap the Create full example to use
kernel.browsers.playwright.execute so the recommended path is the one
readers see first.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@cursor cursor Bot left a comment

Risk assessment: Very Low

I re-evaluated the current diff at head 81d0027. The changes remain limited to Mintlify documentation and navigation: three new introduction/* guides, removal of the old browsers/create-a-browser page with a redirect, sidebar/card/link retargeting, and small wording updates around standby, termination, and browser control options.

I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The .docs/code-example-guide.md edit is only a reference-path update, not a substantive prompt/instruction change. I also found no CODEOWNERS file in the repo.

GitHub already reports this PR as approved, and the latest update does not increase risk, so I’m leaving this as a comment and not submitting another approval.

@masnwilliams masnwilliams requested a review from dprevoznik May 21, 2026 18:42