WebMCP explainer

WebMCP lets a web page expose JavaScript tools to agents from within the page itself.

It is the page-level capability model for shared-context, human-in-the-loop workflows — not just another browser automation trick.

In the current proposal, a loaded page can register tools with structured schemas and JavaScript execute handlers, giving agents a page-owned capability surface that augments page content and actuation instead of blindly guessing from the DOM.

This page is grounded in the current official WebMCP README and proposal.md from the webmachinelearning/webmcp repository; ecosystem demos are included only as supporting examples, not as the normative definition.

Client-side (where tools execute): Tool logic runs in the web page, not only on a backend server.
Shared UI (what user and agent share): The same visible UI, page state, auth context, and product logic.
Human loop (workflow posture): Designed for collaborative, human-in-the-loop web experiences.
What WebMCP is

Think of WebMCP as websites acting like tool providers from inside the browser context.

The proposal lets a page expose capability-shaped JavaScript functions to agents, browser assistants, or assistive tools. In API terms, the page registers tools through navigator.modelContext and runs them against live page state instead of a detached backend.

Tool model

Pages expose tools, not just clickable controls.

A page can register tools with a name, natural-language description, and structured input schema so the agent sees a stable capability contract instead of reverse-engineering the DOM.

Execution model

The execution happens in the page context.

Execution happens through page-owned JavaScript, so the tool can reuse front-end logic, current page state, and the user’s active session without forcing everything into a separate server integration.

Workflow model

Users and agents stay in the same visible workspace.

The human web interface remains primary, while tools augment it with faster, more structured actions.

Important boundary: WebMCP works alongside backend MCP and other protocols; it is not a replacement for backend integrations, and it is not trying to replace the human web interface.
What the proposal actually defines

The current proposal is concrete about runtime shape, API surface, and its present-day limits.

WebMCP is not only “pages can expose tools.” The proposal names the browsing context, registration API, tool contract, user-interaction path, and several explicit non-goals.

Browsing context

Tools exist only while a top-level page is loaded.

A single top-level browsing context, such as a browser tab, is the model context provider. Agents do not get these tools from a detached background server; the page has to be open.

  • Current proposal text does not promise a headless execution mode.
  • Tools are generally discoverable only after navigation and persist only for the lifetime of the loaded page.
Tool shape

Tool registration has a concrete browser API.

The page uses navigator.modelContext.registerTool(...) to define the capability and its schema in page JavaScript.

  • The imperative shape is registerTool with name, description, inputSchema, and execute.
  • unregisterTool(name) makes tool availability page-scoped and revocable.
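
As a minimal sketch of what "page-scoped and revocable" can mean in practice, the snippet below registers a tool when a view mounts and revokes it when the view goes away. It assumes a hypothetical in-memory stand-in, `createModelContextStub()`, so it can run outside a browser; in a real page the same `mountFilterView` function would receive `navigator.modelContext`. The tool name `apply-filter`, the helper names, and the duplicate-name check are illustrative assumptions, not taken from the proposal.

```javascript
// Hypothetical in-memory stand-in for navigator.modelContext (illustration only).
function createModelContextStub() {
  const tools = new Map();
  return {
    registerTool(tool) {
      // Assumption: registering a duplicate name is treated as an error.
      if (tools.has(tool.name)) throw new Error(`Tool already registered: ${tool.name}`);
      tools.set(tool.name, tool);
    },
    unregisterTool(name) {
      tools.delete(name);
    },
    // Not part of the page API; exists so this sketch can inspect the registry.
    listToolNames() {
      return [...tools.keys()];
    },
  };
}

// Register a tool when a view mounts; return a cleanup function that revokes it.
function mountFilterView(modelContext, state) {
  modelContext.registerTool({
    name: "apply-filter",
    description: "Filter the visible product list by category",
    inputSchema: {
      type: "object",
      properties: { category: { type: "string" } },
      required: ["category"],
    },
    async execute({ category }) {
      state.category = category;
      return { ok: true, category };
    },
  });
  return () => modelContext.unregisterTool("apply-filter");
}

const modelContext = createModelContextStub();
const viewState = { category: null };
const unmount = mountFilterView(modelContext, viewState);
const whileMounted = modelContext.listToolNames(); // ["apply-filter"]
unmount();
const afterUnmount = modelContext.listToolNames(); // []
```

Returning the cleanup function from the mount step keeps registration and revocation next to each other, which makes it harder to leave a stale tool behind when the view changes.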
Execution semantics

Execution stays in page script, with UI responsibility on the app.

The execute callback runs against live page state. Simple pages can stay in main-thread script; heavier work can delegate to workers.

  • Tool calls are handled sequentially in page context by design.
  • Page content and actuation still exist, so tools must keep the visible UI in sync with state changes.
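
To make "handled sequentially" concrete, here is a small sketch of a serial queue in which each call starts only after the previous one has settled. It illustrates the observable ordering, not the browser's actual implementation; `createSerialInvoker` and the `add-to-cart` tool are hypothetical names.

```javascript
// Illustrative serial queue: each call starts only after the previous one settles.
function createSerialInvoker() {
  let tail = Promise.resolve();
  return function invoke(tool, args) {
    const run = tail.then(() => tool.execute(args));
    // Keep the chain usable even if one call rejects.
    tail = run.catch(() => {});
    return run;
  };
}

const cart = { items: [] };
const log = [];

// Hypothetical tool whose handler awaits before mutating page state.
const addToCart = {
  name: "add-to-cart",
  async execute({ sku, delayMs }) {
    log.push(`start ${sku}`);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    cart.items.push(sku);
    log.push(`end ${sku}`);
    return { count: cart.items.length };
  },
};

const invoke = createSerialInvoker();
// Call B is requested immediately but does not start until call A has finished.
const done = Promise.all([
  invoke(addToCart, { sku: "A", delayMs: 20 }),
  invoke(addToCart, { sku: "B", delayMs: 0 }),
]);
```

Because the second call waits for the first, a handler never sees page state half-updated by another tool call, which is what makes it reasonable to reuse existing page logic inside `execute`.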
Consent and collaboration

Consent is part of the API surface, not an afterthought.

During execute(..., agent), the page can ask for user input through agent.requestUserInteraction(...), while the browser arbitrates access to the tool.

  • This is why the proposal is aimed at supervised, human-in-the-loop workflows.
  • It is not pitched as a replacement for backend MCP or for fully autonomous agents.
Current limitations and unresolved edges
  • No built-in headless mode: a visible browsing context is still required today.
  • Tool discoverability is unresolved: clients generally need to visit the page first, though manifest-like futures are discussed.
  • Developers still own UI synchronization and may need refactoring to expose clean, reusable tool boundaries.
API shape

What WebMCP code actually looks like in a real page.

The proposal is intentionally lightweight: the page registers tools, the browser arbitrates access, and the execute handler runs against live page state. The examples below are pseudo-code distilled from the official proposal so teams can see the real engineering boundary.

Page-side registration

A page uses navigator.modelContext.registerTool(...) to publish a callable capability with name, description, input schema, and execute handler.

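// Feature-detect first: navigator.modelContext exists only where the proposal is implemented.
if ('modelContext' in window.navigator) {
  window.navigator.modelContext.registerTool({
    name: "buy-product",
    description: "Purchase the current product",
    // productPurchaseSchema and buyProduct are assumed to be defined elsewhere in the page.
    inputSchema: productPurchaseSchema,
    // Runs in page context; `agent` is the handle used to request user interaction.
    async execute(args, agent) {
      return await buyProduct(args, agent)
    }
  })
}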
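
The registration snippet above references `productPurchaseSchema` without defining it. As a rough sketch, and assuming the input schema is a JSON Schema style object as in MCP tool definitions, it might look like the following. The field names (`productId`, `quantity`) and the `checkArgs` helper are hypothetical; `checkArgs` covers only the keywords used here and is not a full JSON Schema validator.

```javascript
// Hypothetical schema for the "buy-product" tool; field names are illustrative.
const productPurchaseSchema = {
  type: "object",
  properties: {
    productId: { type: "string", description: "ID of the product shown on the page" },
    quantity: { type: "integer", minimum: 1, description: "Number of units to buy" },
  },
  required: ["productId"],
  additionalProperties: false,
};

// Minimal defensive check covering only the keywords used above.
function checkArgs(schema, args) {
  const errors = [];
  for (const key of schema.required ?? []) {
    if (!(key in args)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const rule = schema.properties[key];
    if (!rule) {
      if (schema.additionalProperties === false) errors.push(`unexpected field: ${key}`);
      continue;
    }
    if (rule.type === "string" && typeof value !== "string") errors.push(`${key} must be a string`);
    if (rule.type === "integer" && !Number.isInteger(value)) errors.push(`${key} must be an integer`);
    if (typeof rule.minimum === "number" && typeof value === "number" && value < rule.minimum) {
      errors.push(`${key} must be >= ${rule.minimum}`);
    }
  }
  return errors;
}

const okErrors = checkArgs(productPurchaseSchema, { productId: "sku-42", quantity: 2 }); // []
const badErrors = checkArgs(productPurchaseSchema, { quantity: 0 });
// ["missing required field: productId", "quantity must be >= 1"]
```

Descriptions on each property matter as much as the types: they are part of what an agent reads when deciding whether and how to call the tool.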
Execution-time user interaction

During execute(..., agent), the page may call agent.requestUserInteraction(...) when confirmation or user input is required before continuing.

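async function buyProduct({ productId }, agent) {
  // Pause for explicit user confirmation before doing anything irreversible.
  const confirmed = await agent.requestUserInteraction(
    () => openConfirmDialog(productId)
  )

  // Reject so the agent learns that the purchase did not happen.
  if (!confirmed) throw new Error('Purchase cancelled')

  // Reuse existing page logic, then bring the visible UI in line with the new state.
  await executePurchase(productId)
  syncCheckoutUI()
  return { ok: true }
}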
Why this matters in practice
  • The tool is page-scoped: no loaded page, no tool.
  • The tool contract must be explicit enough for an agent to choose it without DOM guesswork.
  • The execute path should reuse page logic and then update visible UI so the human can review the result.
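
One way to check the `buyProduct` flow above without a browser is to run it against a fake agent. The sketch below is self-contained, so it redefines the handler with hypothetical stand-ins for `openConfirmDialog`, `executePurchase`, and `syncCheckoutUI`. It also assumes that `requestUserInteraction` runs the callback and resolves with its result, which matches how the handler uses it but is not confirmed here as the final API behavior.

```javascript
// Hypothetical stand-ins for page helpers; a real page would use its own.
const page = { purchased: [], uiSynced: 0, nextDialogAnswer: true };
const openConfirmDialog = async () => page.nextDialogAnswer;
const executePurchase = async (productId) => { page.purchased.push(productId); };
const syncCheckoutUI = () => { page.uiSynced += 1; };

// Same flow as the handler shown earlier.
async function buyProduct({ productId }, agent) {
  const confirmed = await agent.requestUserInteraction(() => openConfirmDialog(productId));
  if (!confirmed) throw new Error("Purchase cancelled");
  await executePurchase(productId);
  syncCheckoutUI();
  return { ok: true };
}

// Fake agent: assumes requestUserInteraction resolves with the callback's result.
const fakeAgent = {
  interactions: 0,
  async requestUserInteraction(callback) {
    this.interactions += 1;
    return callback();
  },
};

async function runScenarios() {
  // Scenario 1: the user confirms.
  page.nextDialogAnswer = true;
  const accepted = await buyProduct({ productId: "sku-1" }, fakeAgent);

  // Scenario 2: the user cancels, so nothing is purchased and the UI is untouched.
  page.nextDialogAnswer = false;
  let cancelMessage = null;
  try {
    await buyProduct({ productId: "sku-2" }, fakeAgent);
  } catch (error) {
    cancelMessage = error.message;
  }
  return { accepted, cancelMessage, purchased: [...page.purchased], uiSynced: page.uiSynced };
}
```

Testing the cancel path explicitly is worthwhile because it is the path where a bug would let an agent act without consent.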
What it is not

You only really understand WebMCP once the boundaries are clear.

The useful distinction is not “AI can use a website.” The real question is whether the agent acts through raw UI actuation, a backend integration, or page-owned tools.

Matrix view: each row shows where WebMCP sits against another integration model.
WebMCP side: page-defined tools in shared UI context.
Comparison side: alternative integration posture for the same job.
01 Against DOM clicking

WebMCP vs browser automation

Both operate around a live page, but WebMCP raises the abstraction from selectors and gestures to page-defined tools — while still allowing automation fallback when the page exposes too little.

Use WebMCP when you want the page to define safe, explicit capabilities. Keep browser automation as fallback when the task is not covered by the page’s tools.
WebMCP
  • Calls capability-shaped tools instead of replaying low-level UI steps.
  • Can reuse existing page logic and structured arguments.
  • Does not conflict with automation fallback; it adds a higher-level path when the page intentionally exposes one.
Browser automation
  • Simulates clicks, typing, scrolling, and selector targeting.
  • Breaks more easily when UI layout or interaction timing changes.
  • Often has to infer business intent from visible controls alone.
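
The "tool first, automation as fallback" posture can be sketched from the agent's side as a small dispatcher. Everything here is hypothetical and sits outside the proposal itself: `createDispatcher`, the `search-orders` tool, and the `automate` routine are stand-ins chosen to show the routing decision only.

```javascript
// Hypothetical agent-side dispatcher: prefer a page tool, fall back to UI automation.
function createDispatcher(pageTools, automate) {
  return async function perform(intent, args) {
    const tool = pageTools.get(intent);
    if (tool) {
      return { path: "tool", result: await tool.execute(args) };
    }
    return { path: "automation", result: await automate(intent, args) };
  };
}

// Stand-in for the tools the loaded page has registered.
const pageTools = new Map([
  ["search-orders", {
    async execute({ query }) {
      return { matches: [`order matching "${query}"`] };
    },
  }],
]);

// Stand-in for a selector-driven automation routine.
const automate = async (intent) => ({ steps: [`locate control for ${intent}`, "click"] });

const perform = createDispatcher(pageTools, automate);

async function runDispatchDemo() {
  const covered = await perform("search-orders", { query: "invoice 17" });
  const uncovered = await perform("export-report", {});
  return { covered: covered.path, uncovered: uncovered.path };
}
```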
02 Against server-side integration

WebMCP vs backend MCP

Both can expose tools to agents, but WebMCP is page-scoped and browser-mediated, while backend MCP is service-side and always-on.

Use WebMCP when the page itself is the working surface; use backend MCP when the capability should exist without a loaded page or browser UI.
WebMCP
  • Runs in page context with shared state, visible progress, and user oversight.
  • Fits collaborative UI workflows and client-heavy products.
  • Keeps the browser experience primary instead of treating UI as an afterthought.
Backend MCP
  • Runs service-to-service without needing a live page UI.
  • Fits headless, fully delegated, or server-owned operations well.
  • Does not inherently share the page’s live interface and interaction context.
03 Concept vs delivery layer

WebMCP vs FastWebMCP

WebMCP explains the capability model. FastWebMCP addresses the engineering problem of making a real product deliver that model safely and repeatedly.

WebMCP tells you what kind of interface the page should expose. FastWebMCP helps you actually ship that interface inside a real product.
WebMCP itself
  • Defines the page-level idea: tools exposed from web apps.
  • Clarifies where page context is valuable and why human-in-the-loop matters.
  • Is a protocol / platform direction, not a full product rollout system.
FastWebMCP
  • Adds runtime, bridges, adapters, route mapping, telemetry, and release packaging.
  • Helps teams retrofit existing products without rebuilding them from scratch.
  • Turns the concept into a CI-safe, multi-route delivery path.
Structural view

Three models, three different control surfaces.

The fastest way to understand WebMCP is to compare where the agent sits, where logic runs, and who owns the visible workflow.

WebMCP


Agent works with the live page through page-owned tools.

[Diagram: User (human goal) → Agent (reason + choose best execution path) → Live page (runs in visible UI). Edge labels: intent, tool call, shared UI state.]
  • Shared UI context
  • Client-side tool execution
  • Visible collaborative workflow

Browser automation


Agent drives the page by imitating a human at the DOM layer.

[Diagram: User (human goal) → Agent (reason + choose best execution path) → Live page (runs in visible UI). Edge labels: intent, DOM actuation, UI observed.]
  • Selector / timing dependence
  • Low-level UI actuation
  • Business intent inferred indirectly

Backend MCP


Agent talks directly to the service backend without needing the live page as the execution surface.

[Diagram: User (human goal) → Agent (reason + choose best execution path) → Backend service (runs on service side). Edge labels: intent, server integration, API response.]
  • Server-to-server integration
  • Good for headless workflows
  • No inherent shared page context
How it works

A WebMCP-style flow is conceptually simple: the page exposes a capability, the agent calls it in context.

The current proposal already sketches the interaction model clearly: page-scoped registration, structured invocation, page-context execution, and browser-mediated user interaction when needed.

Swimlane view
Read each step top to bottom: the user stays in the visible web app, the agent chooses among page tools, and the page executes in live state with browser-mediated access.
Legend: page-owned tool boundary; shared UI + state; human-in-the-loop visibility.

  1. Expose a tool

    The page calls navigator.modelContext.registerTool(...) with a description, input schema, and execute handler.

    User: Starts with a goal inside the live product UI.
    Agent: Sees the registered tool as a first-class action.
    Page: Declares a tool boundary with description and schema.
  2. Agent discovers it

    The agent sees the tool after the page is loaded; page content and actuation remain available alongside the tool.

    User: Stays in the same visible workspace and can supervise.
    Agent: Chooses the tool instead of guessing from raw controls.
    Page: Provides current state, auth context, and route-local logic.
  3. Invoke in page context

    The execute callback runs against live page state and can call agent.requestUserInteraction(...) when human confirmation or input is required.

    User: Can observe progress and adjust intent if needed.
    Agent: Invokes the capability with structured arguments.
    Page: Executes using existing page logic and current session state.
  4. Share results in UI

    The page updates visible UI so the human interface stays primary and the user can review, steer, or take back over.

    User: Reviews the resulting interface state and next options.
    Agent: Continues with updated context instead of re-parsing everything.
    Page: Reflects the result directly in the live UI.
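
The four steps can be traced end to end in a short simulation. This is a sketch under stated assumptions: the `tools` map stands in for the browser's registry, `describeTools` and `callTool` are hypothetical agent-side helpers, and the "UI" is reduced to a rendered string so the example can run outside a browser.

```javascript
// Step 1: the page exposes a tool. `tools` stands in for the browser's registry (hypothetical).
const tools = new Map();
const pageState = { todos: [], rendered: "" };

// Step 4 target: the visible UI, reduced here to a string.
function render() {
  pageState.rendered = pageState.todos.map((text, i) => `${i + 1}. ${text}`).join("\n");
}

tools.set("add-todo", {
  name: "add-todo",
  description: "Add a new item to the to-do list shown on this page",
  inputSchema: { type: "object", properties: { text: { type: "string" } }, required: ["text"] },
  async execute({ text }) {
    pageState.todos.push(text);
    render(); // keep the visible UI in sync with the state change
    return { added: text, total: pageState.todos.length };
  },
});

// Step 2: the agent discovers the contract, not the handler.
function describeTools() {
  return [...tools.values()].map(({ name, description, inputSchema }) => ({
    name,
    description,
    inputSchema,
  }));
}

// Step 3: the agent invokes the tool with structured arguments.
async function callTool(name, args) {
  const tool = tools.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.execute(args);
}

async function runFlow() {
  const discovered = describeTools().map((tool) => tool.name);
  const result = await callTool("add-todo", { text: "Review invoice" });
  return { discovered, result, rendered: pageState.rendered };
}
```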
Where it fits

WebMCP is strongest when the web page is part of the job, not just a UI shell around an API.

The deciding factor is whether shared page context and page-owned logic materially improve the workflow.

Best fit

  • Rich web apps where the user and agent should collaborate inside the same visible interface.
  • Client-heavy products that already contain important front-end business logic or interaction state.
  • Flows where page-level capability boundaries are safer and more stable than raw DOM actuation.

Not the first choice

  • Purely headless service-to-service operations with no meaningful page context.
  • Fully autonomous background workflows where no browser UI or human oversight is needed.
  • Cases that are already cleanly solved by direct backend MCP or conventional API integrations.
Developer obligations

WebMCP does not remove implementation work; it changes where that work should be invested.

The proposal lowers the cost of exposing page capabilities, but teams still have to do product and frontend engineering work. If these obligations are ignored, the result becomes a demo instead of a reliable product surface.

Refactor page logic into reusable capability boundaries

If the page only works through click handlers and local component state, developers usually need to extract stable functions before a tool can be exposed cleanly.

  • Good WebMCP tools usually map to existing product actions, not random UI fragments.
  • FastWebMCP becomes useful here because it gives those extracted capabilities a repeatable bridge and rollout layer.
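
A small sketch of that extraction, assuming a hypothetical cart feature: the product action moves out of the click handler into one stable function, and both the button and the tool delegate to it. All names (`addLineToCart`, `cartState`, `add-to-cart`) are illustrative.

```javascript
// Before (illustrative): logic buried in an event handler and hard to reuse, e.g.
//   button.onclick = () => { cart.push(currentSku); total += price; redraw(); };

// After: one stable function owns the product action.
const cartState = { lines: [], total: 0 };

function addLineToCart(state, { sku, unitPrice, quantity = 1 }) {
  if (!Number.isInteger(quantity) || quantity < 1) {
    throw new RangeError("quantity must be a positive integer");
  }
  state.lines.push({ sku, unitPrice, quantity });
  state.total += unitPrice * quantity;
  return { lineCount: state.lines.length, total: state.total };
}

// The click handler and the tool both delegate to the same function.
const onAddButtonClick = () => addLineToCart(cartState, { sku: "sku-7", unitPrice: 5 });

const addToCartTool = {
  name: "add-to-cart",
  description: "Add a product to the cart on this page",
  inputSchema: {
    type: "object",
    properties: {
      sku: { type: "string" },
      unitPrice: { type: "number" },
      quantity: { type: "integer", minimum: 1 },
    },
    required: ["sku", "unitPrice"],
  },
  async execute(args) {
    return addLineToCart(cartState, args);
  },
};
```

With this shape, validation and totals live in one place, so a human click and an agent call cannot drift apart in behavior.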

Keep visible UI synchronized with tool-driven state changes

The proposal is explicit that tool calls run in page JavaScript and the page should reflect resulting state changes back into the UI.

  • If the agent changes app state but the UI does not reflect it, trust collapses immediately.
  • This is especially important for review, undo, approval, and multi-step flows.
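
One common way to keep the UI honest is to route every state change, whether from a click or a tool call, through a store that notifies the renderer. The sketch below assumes a hypothetical minimal store (`createStore`) and stands in for DOM rendering with an array of rendered strings; a real page would use its own state layer.

```javascript
// Tiny observable store: every state change notifies subscribers.
function createStore(initialState) {
  let state = initialState;
  const listeners = new Set();
  return {
    getState: () => state,
    setState(patch) {
      state = { ...state, ...patch };
      listeners.forEach((listener) => listener(state));
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

const store = createStore({ status: "draft" });

// Stand-in for DOM rendering: record what the user would see.
const renders = [];
store.subscribe((state) => renders.push(`Status: ${state.status}`));

// Hypothetical tool that changes state only through the store.
const submitOrderTool = {
  name: "submit-order",
  description: "Submit the order currently open on this page",
  inputSchema: { type: "object", properties: {} },
  async execute() {
    store.setState({ status: "submitted" });
    return { status: store.getState().status };
  },
};
```

Because the tool never touches rendering directly, any change it makes is visible to the user through the same path as a manual action.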

Design explicit consent and interruption paths

Browser-mediated access and requestUserInteraction(...) give the page a structured way to pause for user confirmation, auth, or missing input.

  • Treat confirmation UX as product design, not just a technical callback.
  • High-risk actions should stay inspectable and reversible from the visible interface.
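
A reusable confirmation wrapper is one possible way to make consent a consistent pattern rather than a per-tool afterthought. This sketch assumes that `requestUserInteraction` resolves with the callback's result; `withConfirmation`, `confirm`, and the `deleteProject` handler are hypothetical names.

```javascript
// Wrap a handler so it always asks the user first.
// `confirm` is a hypothetical page function that shows UI and resolves to true or false.
function withConfirmation(confirm, describe, execute) {
  return async function confirmedExecute(args, agent) {
    const approved = await agent.requestUserInteraction(() => confirm(describe(args)));
    if (!approved) {
      // Structured refusal so the agent can tell "declined" apart from a failure.
      return { ok: false, reason: "declined-by-user" };
    }
    return execute(args, agent);
  };
}

const deleted = [];
const prompts = [];
let userAnswer = false;

const deleteProject = withConfirmation(
  async (message) => { prompts.push(message); return userAnswer; },
  ({ projectId }) => `Delete project ${projectId}?`,
  async ({ projectId }) => { deleted.push(projectId); return { ok: true }; }
);

// Fake agent, assuming requestUserInteraction resolves with the callback's result.
const agent = { requestUserInteraction: async (callback) => callback() };

async function runConsentDemo() {
  userAnswer = false;
  const declined = await deleteProject({ projectId: "p-1" }, agent);
  userAnswer = true;
  const approved = await deleteProject({ projectId: "p-2" }, agent);
  return { declined, approved, deleted: [...deleted], promptCount: prompts.length };
}
```

Returning a structured refusal instead of throwing is a design choice; throwing, as in the earlier purchase example, is equally valid as long as the page is consistent about it.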

Choose WebMCP only where page context truly adds value

Not every capability belongs in a page-scoped tool. Some work is still better served by backend MCP, direct APIs, or ordinary browser automation fallback.

  • Use WebMCP when shared UI, client-side logic, or human oversight materially improves the workflow.
  • Avoid forcing page tools onto jobs that should exist without a loaded browser surface.
Implementation takeaway

A good beta-quality WebMCP product is not just “page has tools.” It has extracted capability boundaries, synchronized UI, consent patterns, and a clear line between page tools, backend integrations, and automation fallback.

Where FastWebMCP fits

FastWebMCP is the engineering layer that helps teams turn the WebMCP idea into a shippable product surface.

If WebMCP explains what the page-agent interface should look like, FastWebMCP focuses on how real teams retrofit routes, model execution modes, package adapters, and ship through CI/CD.

Route-aware runtime

Map capability rollout to real pages and product surfaces instead of scattering custom scripts.

Bridge and adapter system

Choose markup, callback, or endpoint-style execution depending on how the product actually behaves.

Delivery and release posture

Package, test, publish, and consume the integration through repeatable deployment workflows.

FastWebMCP
From concept to rollout surface

FastWebMCP takes the WebMCP-style idea of page-level tools and turns it into an engineering delivery path: route-aware runtime, bridge modes, reusable adapters, release artifacts, and CI-safe package consumption.

References

Reference documents behind this page

These links are the normative or supporting materials this explainer is anchored to. The first two are the most important because they define the proposal’s current language, API shape, goals, and non-goals.

webmachinelearning/webmcp README (August 13, 2025)
Authors: Brandon Walderman, Leo Lee, Andrew Nolan, David Bokan, Khushal Sagar, Hannah Van Opstal
Summary: High-level explainer covering motivation, goals, non-goals, and why WebMCP is about human-in-the-loop, page-context workflows.
Why it is cited here: Use this as the high-level source for goals, non-goals, human-in-the-loop positioning, and the distinction between page tools, actuation, and backend integrations.

WebMCP API proposal (proposal.md) (August 13, 2025)
Authors: Brandon Walderman, Andrew Nolan, David Bokan, Khushal Sagar, Hannah Van Opstal
Summary: Primary API reference for modelContext, registerTool/unregisterTool, sequential page-side execution, requestUserInteraction, and current limitations.
Why it is cited here: Use this as the primary source for API shape: modelContext, registerTool/unregisterTool, requestUserInteraction, sequential page-side execution, and explicit limitations.

Model Context Protocol introduction (living documentation)
Authors: Model Context Protocol project
Summary: Useful background for understanding what WebMCP is not replacing: backend MCP remains the right fit for server-side and headless integrations.
Why it is cited here: Use this as supporting context for what backend MCP is good at, so readers understand that WebMCP complements rather than replaces service-side MCP.

If the proposal evolves, this page should be updated against those upstream documents rather than drift into marketing copy.
