Add Create, Control, Observe introduction guides #370
Introduce three high-level concept guides under the Introduction group that orient new users around the core surface area: creating browsers, driving them (computer controls vs Playwright execution), and observing them (live view, replays, screenshots, logs).
Stale comment
Risk assessment: Very Low
The current diff is limited to documentation and navigation: new introduction/create, introduction/control, and introduction/observe MDX pages, a small docs.json sidebar update, and wording clarifications in browsers/standby.mdx and browsers/termination.mdx. I found no production codepaths, infrastructure, auth/security, data, prompt/instruction files, or shared runtime behavior changes.
No CODEOWNERS file was present in the repo. The PR is already approved at the current head commit, so I’m not submitting another approval.
Sent by Cursor Automation: Assign PR reviewers
Stale comment
Risk assessment: Very Low
The current diff is limited to documentation and Mintlify navigation: three new introduction/* MDX guides, a redirect/sidebar update in docs.json, link retargeting from the removed browsers/create-a-browser.mdx page, and small wording clarifications for standby/termination behavior. The .docs/code-example-guide.md change only updates a reference to the moved guide path.
I found no production codepaths, shared runtime behavior, infrastructure, auth/security, data/schema changes, or non-trivial prompt/instruction changes. No CODEOWNERS file was present. GitHub already reports this PR as approved, so I’m not submitting another approval.
Sent by Cursor Automation: Assign PR reviewers
Firetiger deploy monitoring skipped: this PR didn't match the auto-monitor filter configured on your GitHub connection.
Reason: PR only adds documentation guides and updates docs.json; no changes to API endpoints or Temporal workflows.
Stale comment
Risk assessment: Very Low
The current diff is limited to Mintlify documentation changes: new introduction/create, introduction/control, and introduction/observe guides, removal of the old browsers/create-a-browser page with a redirect to introduction/create, navigation updates in docs.json, link retargeting, and small wording clarifications for standby/termination behavior.
I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The only .docs change updates a reference path in the code example guide, which is a trivial documentation-maintenance edit. No CODEOWNERS file was present.
GitHub already reports this PR as approved, so I’m leaving this assessment as a comment and not submitting another approval.
Sent by Cursor Automation: Assign PR reviewers
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 4e62b81.
| <Tabs> | ||
| <Tab title="Computer Controls"> | ||
| OS-level mouse, keyboard, and screen — the primitives a computer-use model already speaks (screenshot, click, type, key, scroll, drag). No CDP or WebDriver connection required, so there's no protocol fingerprint to leak. Ideal for [Claude](/integrations/computer-use/anthropic), [OpenAI](/integrations/computer-use/openai), or [Gemini](/integrations/computer-use/gemini) computer-use loops. |
Third-party AI product names used instead of generic terms
Low Severity
The new introduction/control.mdx page names specific third-party AI products — "Claude", "OpenAI", "Gemini", and "Anthropic" — directly in the documentation text. The project rule requires vendor-neutral, generic terms (e.g. "computer-use models" or "your preferred model provider") instead of specific product names. This appears on line 10 ("Ideal for [Claude]… [OpenAI]… or [Gemini] computer-use loops") and line 163 ("drop-in examples for Anthropic, Gemini, OpenAI, and more").
Triggered by learned rule: Use generic terms instead of specific third-party AI product names
Stale comment
Risk assessment: Very Low
I re-evaluated the current head diff (c5fd931). The changes remain limited to Mintlify documentation and navigation: new introduction/create, introduction/control, and introduction/observe guides, removal of the old browsers/create-a-browser page with a redirect, link retargeting, and small wording clarifications around standby/termination and “computer use.”
I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The .docs/code-example-guide.md edit is only a reference-path update. I also found no CODEOWNERS file in the repo.
GitHub already reports this PR as approved, and this update does not increase the risk, so I’m leaving this as a comment and not submitting another approval.
Sent by Cursor Automation: Assign PR reviewers
| description: "Spin up a cloud browser for your agent" | ||
| --- | ||
| Kernel browsers are sandboxed Chromium instances that boot in under 30ms. Your agent creates them on demand, drives them, and tears them down — no infra to provision, no warm pool to manage. |
| Kernel browsers are sandboxed Chromium instances that boot in under 30ms. Your agent creates them on demand, drives them, and tears them down — no infra to provision, no warm pool to manage. | |
| Kernel browsers are sandboxed Chromium instances that boot in under 30ms. Your agent creates them on demand, drives them, and tears them down — no infra to provision, no servers to run. |
| <Columns cols={2}> | ||
| <Card title="Headless vs headful" href="/browsers/headless"> | ||
| Headful (the default) supports live view and replays. Headless is lighter (1 GB vs 8 GB) and faster — good for short-lived or highly concurrent jobs. |
| Headful (the default) supports live view and replays. Headless is lighter (1 GB vs 8 GB) and faster — good for short-lived or highly concurrent jobs. | |
| Headful (default) supports live view, replays, and better stealth, making it ideal for agent workflows on bot-detected sites. Headless is lighter (1 GB vs 8 GB), good for simple scraping. |
| ## Lifecycle | ||
| A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. After the configurable timeout (60s by default) it's deleted. |
| A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. After the configurable timeout (60s by default) it's deleted. | |
| A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. Once in standby, after the configurable timeout (60s by default) elapses it's deleted. |
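To make the lifecycle wording concrete, here is a minimal TypeScript sketch of the states it describes. It is a toy model, not the Kernel API: `browserState`, `BrowserState`, and `STANDBY_AFTER_IDLE_MS` are hypothetical names, and it assumes the timeout clock starts when the browser enters standby, as the suggested wording implies.

```typescript
// Hypothetical model of the lifecycle described above; not part of any SDK.
type BrowserState = "active" | "standby" | "deleted";

// "After five seconds with none of those active, it enters standby."
const STANDBY_AFTER_IDLE_MS = 5_000;

function browserState(
  idleMs: number, // time since the last CDP/WebDriver client, Live View viewer, or computer controls request
  timeoutSeconds: number = 60, // the configurable timeout (60s by default)
): BrowserState {
  if (idleMs < STANDBY_AFTER_IDLE_MS) return "active";
  // Assumption: the timeout is counted from the moment standby begins.
  const msInStandby = idleMs - STANDBY_AFTER_IDLE_MS;
  return msInStandby < timeoutSeconds * 1000 ? "standby" : "deleted";
}
```

Under these assumptions, a browser that has been idle for 30 seconds with the default timeout would be in standby, and one idle for 65 seconds or more would be deleted.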
| A browser stays alive as long as something is driving it — a CDP or WebDriver client, a [Live View](/browsers/live-view) viewer, or an in-flight [computer controls](/browsers/computer-controls) request. After five seconds with none of those active, it enters [standby](/browsers/standby) — state is preserved, billing stops. After the configurable timeout (60s by default) it's deleted. | ||
| You can also delete a browser explicitly when you're done: |
| You can also delete a browser explicitly when you're done: | |
| We recommend you delete a browser explicitly when you're done with it: |
| Putting it together — create, connect over CDP, do work, tear down: | ||
| <Info> | ||
| Kernel browsers launch with a default context and page. Make sure to access the existing context and page (`contexts()[0]` and `pages()[0]`) rather than creating a new one. |
I feel like one of the goals of this PR is also to promote the Playwright execution API over a direct Playwright connection. I think it's worth showing that example here instead of a direct Playwright connection. Thoughts?
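For context on the `<Info>` note quoted above, this is roughly what reusing the default context and page looks like. The sketch uses small structural interfaces (`BrowserLike`, `ContextLike`, `PageLike`, all hypothetical names) that mirror the `contexts()` and `pages()` methods on Playwright's browser and context objects, so it runs without the Playwright package; `defaultPage` is an illustrative helper, not something from the docs under review.

```typescript
// Hypothetical structural stand-ins for Playwright's Browser, BrowserContext, and Page.
interface PageLike {
  url(): string;
}
interface ContextLike {
  pages(): PageLike[];
}
interface BrowserLike {
  contexts(): ContextLike[];
}

// Return the page the browser launched with instead of creating a new one.
function defaultPage(browser: BrowserLike): PageLike {
  const context = browser.contexts()[0];
  if (!context) throw new Error("browser has no default context");
  const page = context.pages()[0];
  if (!page) throw new Error("default context has no pages");
  return page;
}
```

Because the interfaces are structural, an object returned by a real Playwright CDP connection should satisfy `BrowserLike`, but that is an assumption this sketch does not verify.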
| description: "Drive the browser with computer use, CDP, or WebDriver BiDi" | ||
| --- | ||
| Kernel browsers expose three control primitives. For agents, we recommend [computer use](/browsers/computer-controls) — the primitives match how computer-use models were trained to drive a computer, and they sidestep the bot-detection surface that CDP introduces. |
What do we think of having a fourth tab for the Playwright Execution API here as well, in 2nd position?
| Kernel's computer controls are built to match how computer-use models were trained — the same primitives the model emits (screenshot, click at coords, type, key, scroll, drag) map 1:1 onto the API. There's no harness translating model output into framework calls. | ||
| - **Native fit.** Screenshot, click, type, key, scroll, drag — the primitives the model already speaks. Kernel uses these same controls in its own [managed auth](/auth/overview) agent. |
| - **Native fit.** Screenshot, click, type, key, scroll, drag — the primitives the model already speaks. Kernel uses these same controls in its own [managed auth](/auth/overview) agent. | |
| - **Native fit.** Screenshot, click, type, key, scroll, drag — the primitives the model already speaks. |
| The two things computer controls don't do natively: read the DOM, and take a full-page screenshot. The recommended pattern for agents is **computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool** when the agent needs structured data. | ||
| Playwright execution runs arbitrary Playwright code in a fresh context inside the browser's VM. Your agent can call it as a tool whenever it needs structured DOM data or a full-page capture, then go right back to driving with computer controls. It ships with [Patchright](/browsers/bot-detection/stealth) by default, so DOM-side calls are hardened against bot detection too. |
| Playwright execution runs arbitrary Playwright code in a fresh context inside the browser's VM. Your agent can call it as a tool whenever it needs structured DOM data or a full-page capture, then go right back to driving with computer controls. It ships with [Patchright](/browsers/bot-detection/stealth) by default, so DOM-side calls are hardened against bot detection too. | |
| Playwright execution runs arbitrary Playwright code in a fresh context inside the browser's VM. Your agent can call it as a tool whenever it needs structured DOM data, then go right back to driving with computer controls. |
| ## Computer use + Playwright execution | ||
| The two things computer controls don't do natively: read the DOM, and take a full-page screenshot. The recommended pattern for agents is **computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool** when the agent needs structured data. |
| The two things computer controls don't do natively: read the DOM, and take a full-page screenshot. The recommended pattern for agents is **computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool** when the agent needs structured data. | |
| The one thing computer controls don't do natively: read the DOM. The recommended pattern for agents is computer controls for interaction, [Playwright execution](/browsers/playwright-execution) as a DOM-reading tool when the agent needs structured data. |
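One way to picture the recommended split is as a routing decision in the agent's tool layer. The sketch below is purely illustrative: the tool names (`click`, `type`, `read_dom`), the `ToolCall` shape, and `backendFor` are hypothetical and do not come from the Kernel SDK.

```typescript
// Hypothetical tool calls an agent might emit; the names are illustrative only.
type ToolCall =
  | { tool: "click"; x: number; y: number }
  | { tool: "type"; text: string }
  | { tool: "read_dom"; code: string };

type Backend = "computer-controls" | "playwright-execution";

// Interaction goes through computer controls; DOM reads go through Playwright execution.
function backendFor(call: ToolCall): Backend {
  switch (call.tool) {
    case "click":
    case "type":
      return "computer-controls";
    case "read_dom":
      return "playwright-execution";
  }
}
```

Keeping the decision in one function means the agent loop itself stays unaware of which backend serves a given tool.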
| @@ -0,0 +1,163 @@ | |||
| --- | |||
| title: "Control" | |||
| description: "Drive the browser with computer use, CDP, or WebDriver BiDi" | |||
Do we feel the two sections below flow well enough with the three options presented? They're only relevant to the computer use choice; if you flip to CDP or BiDi up top, they no longer apply.
| ``` | ||
| </CodeGroup> | ||
| Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browsers/live-view#kiosk-mode) at creation for a fullscreen, chromeless experience. Full reference: [Live View](/browsers/live-view). |
| Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browsers/live-view#kiosk-mode) at creation for a fullscreen, chromeless experience. Full reference: [Live View](/browsers/live-view). | |
| Add `?readOnly=true` for a non-interactive view, or enable [kiosk mode](/browsers/live-view#kiosk-mode) at creation for a fullscreen, cinematic experience. Full reference: [Live View](/browsers/live-view). |
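As a small illustration of the `?readOnly=true` parameter, the helper below appends it to a live view URL with the standard `URL` API so that any existing query string is preserved. `readOnlyLiveViewUrl` is a hypothetical name, and the URL in the usage note is a placeholder rather than a real live view address.

```typescript
// Hypothetical helper: add readOnly=true to a live view URL without clobbering existing params.
function readOnlyLiveViewUrl(liveViewUrl: string): string {
  const url = new URL(liveViewUrl);
  url.searchParams.set("readOnly", "true");
  return url.toString();
}
```

For example, `readOnlyLiveViewUrl("https://example.com/live?token=abc")` returns `https://example.com/live?token=abc&readOnly=true`.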
| ## Replays | ||
| Replays are MP4 recordings you start and stop on demand. They're the right tool for post-hoc debugging: a failed run produces a video you can scrub through, share, or attach to a bug report. |
| Replays are MP4 recordings you start and stop on demand. They're the right tool for post-hoc debugging: a failed run produces a video you can scrub through, share, or attach to a bug report. | |
| Replays are MP4 recordings you start and stop on demand — capture as many clips per session as you need. They're the right tool for post-hoc debugging: a failed run gives you one or more videos to scrub through, share, or attach to a bug report. |
| ## Replays | ||
| Replays are MP4 recordings you start and stop on demand. They're the right tool for post-hoc debugging: a failed run produces a video you can scrub through, share, or attach to a bug report. | ||
| Replays can also be enabled on managed auth sessions, so you can [debug failed logins](https://www.kernel.sh/docs/auth/configuration#record-sessions-for-debugging) the same way. | |
dprevoznik left a comment
Left some suggestions and a few open-ended questions
Risk assessment: Very Low
I re-evaluated the current diff at head 296de55. The changes are limited to Mintlify documentation and navigation: new introduction/create, introduction/control, and introduction/observe guides, removal of browsers/create-a-browser with a redirect to introduction/create, sidebar/card/link retargeting, and small wording updates around standby and termination behavior.
I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The .docs/code-example-guide.md edit is a trivial reference-path update, not a substantive prompt/instruction change. I also found no CODEOWNERS file in the repo.
GitHub already reports this PR as approved, and the new diff does not increase risk, so I’m leaving this as a comment and not submitting another approval.
Sent by Cursor Automation: Assign PR reviewers
Add a Playwright Execution tab on Control (position 2, between Computer Use and CDP) and a parallel "Why playwright execution over a direct CDP connection" section. Swap the Create full example to use kernel.browsers.playwright.execute so the recommended path is the one readers see first.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Risk assessment: Very Low
I re-evaluated the current diff at head 81d0027. The changes remain limited to Mintlify documentation and navigation: three new introduction/* guides, removal of the old browsers/create-a-browser page with a redirect, sidebar/card/link retargeting, and small wording updates around standby, termination, and browser control options.
I found no production codepaths, infrastructure, auth/security, data/schema changes, or shared runtime behavior changes. The .docs/code-example-guide.md edit is only a reference-path update, not a substantive prompt/instruction change. I also found no CODEOWNERS file in the repo.
GitHub already reports this PR as approved, and the latest update does not increase risk, so I’m leaving this as a comment and not submitting another approval.
Sent by Cursor Automation: Assign PR reviewers




Summary
Adds three high-level concept guides under the Introduction group to orient new users around Kernel's core surface area:
- Create (introduction/create.mdx): spinning up a browser, picking the right shape (headless, stealth, GPU, profiles), and lifecycle.
- Control (introduction/control.mdx): the two ways to drive a browser, computer controls (model-native, vision loops) vs Playwright execution (DOM, full-page screenshots). Includes a when-to-use-which table and the recommended pattern of using both together.
- Observe (introduction/observe.mdx): live view, replays, screenshots, and invocation logs, framed by what each is good for.
docs.json is updated to rename the first nav group from home to Introduction and add the three new pages alongside index.
Preview
Once Mintlify builds: https://tbd-6fc993ce-hypeship-intro-create-control-observe.mintlify.app/introduction/create (and /control, /observe).
Test plan
- mintlify dev renders the three new pages
Note
Low Risk
Low-risk, documentation-only change that reorganizes onboarding content and updates internal links/redirects; primary risk is broken links or navigation regressions if any references were missed.
Overview
Adds a new Introduction docs section with three concept guides: introduction/create, introduction/control, and introduction/observe, and updates the landing page to point users to these new starting points.
Reorganizes navigation in docs.json, removes the legacy browsers/create-a-browser.mdx page, and adds a redirect from /browsers/create-a-browser to /introduction/create. Updates cross-doc links (auth FAQ, integrations, scaling/pools, changelog) and clarifies browser lifecycle wording around standby/timeout semantics in browsers/standby and browsers/termination.
Reviewed by Cursor Bugbot for commit 81d0027. Bugbot is set up for automated code reviews on this repo.