WebMCP explainer

WebMCP lets web pages expose JavaScript tools to agents inside the page itself.

It is a page-level capability model for shared-context, human-in-the-loop workflows, not just another browser automation trick.

In the current proposal, a loaded page can register tools with structured schemas and JavaScript execute handlers, giving agents a page-owned capability surface that augments page content and actuation instead of blindly guessing from the DOM.

This page is grounded in the current official WebMCP README and proposal.md from the webmachinelearning/webmcp repository; ecosystem demos are included only as supporting examples, not as the normative definition.

Client-side

Where tools execute

Tool logic runs in the web page, not only on a backend server.

Shared UI

What user and agent share

The same visible UI, page state, auth context, and product logic.

Human loop

Workflow posture

Designed for collaborative, human-in-the-loop web experiences.

What WebMCP is

Think of WebMCP as websites acting like tool providers from inside the browser context.

The proposal lets a page expose capability-shaped JavaScript functions to agents, browser assistants, or assistive tools. In API terms, the page registers tools through navigator.modelContext and runs them against live page state instead of a detached backend.

Tool model

Pages expose tools, not just clickable controls.

A page can register tools with a name, natural-language description, and structured input schema so the agent sees a stable capability contract instead of reverse-engineering the DOM.
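As a rough illustration of the contract described above, a tool descriptor might look like the following. The tool name, the field values, and the checkArgs helper are hypothetical. The helper only checks required fields and primitive types, so a real page would likely use a full JSON Schema validator instead.

```javascript
// Hypothetical tool contract: the name, description, and schema values are illustrative.
const addToCartTool = {
  name: "add-to-cart",
  description: "Add a product to the shopping cart by product ID and quantity.",
  inputSchema: {
    type: "object",
    properties: {
      productId: { type: "string", description: "Catalog ID of the product" },
      quantity: { type: "integer", description: "Units to add" },
    },
    required: ["productId", "quantity"],
  },
};

// Deliberately tiny check covering only `required` and primitive `type`.
// This is not a JSON Schema validator.
function checkArgs(schema, args) {
  const errors = [];
  for (const key of schema.required ?? []) {
    if (!(key in args)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (!spec) {
      errors.push(`unknown field: ${key}`);
      continue;
    }
    const ok =
      spec.type === "integer" ? Number.isInteger(value) : typeof value === spec.type;
    if (!ok) errors.push(`field ${key} should be ${spec.type}`);
  }
  return errors;
}

const goodArgsErrors = checkArgs(addToCartTool.inputSchema, { productId: "sku-1", quantity: 2 });
const badArgsErrors = checkArgs(addToCartTool.inputSchema, { quantity: "two" });
console.log(goodArgsErrors, badArgsErrors);
```

The point of the sketch is that an agent can decide whether and how to call the tool from the descriptor alone, without inspecting the page's markup.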

Execution model

Execution happens in the page context.

Execution happens through page-owned JavaScript, so the tool can reuse front-end logic, current page state, and the user’s active session without forcing everything into a separate server integration.

Workflow model

Users and agents stay in the same visible workspace.

The human web interface remains primary, while tools augment it with faster, more structured actions.

Important boundary: WebMCP works alongside backend MCP and other protocols; it is not a replacement for backend integrations, and it is not trying to replace the human web interface.

What the proposal actually defines

The current proposal is concrete about runtime shape, API surface, and its present-day limits.

WebMCP is not only “pages can expose tools.” The proposal names the browsing context, registration API, tool contract, user-interaction path, and several explicit non-goals.

Browsing context

Tools exist only while a top-level page is loaded.

A single top-level browsing context, such as a browser tab, is the model context provider. Agents do not get these tools from a detached background server; the page has to be open.

Current proposal text does not promise a headless execution mode.
Tools are generally discoverable only after navigation and persist only for the lifetime of the loaded page.

Tool shape

Tool registration has a concrete browser API.

The page uses navigator.modelContext.registerTool(...) to define the capability and its schema in page JavaScript.

The imperative shape is registerTool with name, description, inputSchema, and execute.
unregisterTool(name) removes a registered tool, so tool availability is revocable as well as page-scoped.
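The register and unregister lifecycle can be sketched outside a browser with an in-memory stand-in. Only registerTool and unregisterTool come from the proposal as described here. The createFakeModelContext factory, the listToolNames helper, the duplicate-name error, and the tool itself are assumptions made for illustration.

```javascript
// In-memory stand-in for navigator.modelContext so the sketch runs outside a browser.
// listToolNames and the duplicate-name check are hypothetical additions.
function createFakeModelContext() {
  const tools = new Map();
  return {
    registerTool(tool) {
      if (tools.has(tool.name)) throw new Error(`Tool already registered: ${tool.name}`);
      tools.set(tool.name, tool);
    },
    unregisterTool(name) {
      tools.delete(name);
    },
    listToolNames() {
      return [...tools.keys()];
    },
  };
}

const modelContext = createFakeModelContext();

// Register a tool when the relevant view mounts...
modelContext.registerTool({
  name: "apply-filter",
  description: "Filter the visible order list by status.",
  inputSchema: {
    type: "object",
    properties: { status: { type: "string" } },
    required: ["status"],
  },
  async execute({ status }) {
    return { applied: status };
  },
});
const namesWhileMounted = modelContext.listToolNames();

// ...and revoke it when the view goes away.
modelContext.unregisterTool("apply-filter");
const namesAfterUnmount = modelContext.listToolNames();
console.log(namesWhileMounted, namesAfterUnmount);
```

Tying registration to a view's lifetime is one plausible design: the agent only sees tools that match what the user can currently see.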

Execution semantics

Execution stays in page script, with UI responsibility on the app.

The execute callback runs against live page state. Simple pages can stay in main-thread script; heavier work can delegate to workers.

Tool calls are handled sequentially in page context by design.
Page content and actuation still exist, so tools must keep the visible UI in sync with state changes.
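To make the sequential behaviour concrete, the sketch below serializes calls with a promise chain. In the proposal the ordering is handled by the browser, not by page code, so createSequentialDispatcher, slowTool, and the timings are hypothetical and only mimic the effect.

```javascript
// Hypothetical dispatcher that serializes tool calls with a promise chain.
// It only imitates the one-at-a-time ordering; it is not part of the proposal.
function createSequentialDispatcher() {
  let tail = Promise.resolve();
  return function dispatch(execute, args) {
    const run = tail.then(() => execute(args));
    // Keep the chain usable even if one call rejects.
    tail = run.catch(() => {});
    return run;
  };
}

const callLog = [];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function slowTool({ label, ms }) {
  callLog.push(`start ${label}`);
  await sleep(ms);
  callLog.push(`end ${label}`);
  return label;
}

const dispatch = createSequentialDispatcher();
// "b" is faster, but it still waits for "a" to finish.
const allDone = Promise.all([
  dispatch(slowTool, { label: "a", ms: 20 }),
  dispatch(slowTool, { label: "b", ms: 5 }),
]).then(() => callLog);

allDone.then((entries) => console.log(entries.join(", ")));
// prints "start a, end a, start b, end b"
```

Because calls do not interleave, a handler can read and write page state without guarding against another tool call running halfway through it.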

Consent and collaboration

Consent is part of the API surface, not an afterthought.

During execute(..., agent), the page can ask for user input through agent.requestUserInteraction(...), while the browser arbitrates access to the tool.

This is why the proposal is aimed at supervised, human-in-the-loop workflows.
It is not pitched as a replacement for backend MCP or for fully autonomous agents.

Current limitations and unresolved edges

No built-in headless mode: a visible browsing context is still required today.

Tool discoverability is unresolved: clients generally need to visit the page first, though manifest-like futures are discussed.

Developers still own UI synchronization and may need refactoring to expose clean, reusable tool boundaries.

API shape

What WebMCP code actually looks like in a real page.

The proposal is intentionally lightweight: the page registers tools, the browser arbitrates access, and the execute handler runs against live page state. The example below is pseudo-code distilled from the official proposal so teams can see the real engineering boundary.

Page-side registration

A page uses navigator.modelContext.registerTool(...) to publish a callable capability with name, description, input schema, and execute handler.

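// Feature-detect first: navigator.modelContext is a proposal and may be absent.
if ('modelContext' in window.navigator) {
  window.navigator.modelContext.registerTool({
    name: "buy-product",
    description: "Purchase the current product",
    // productPurchaseSchema is a schema object defined elsewhere on the page.
    inputSchema: productPurchaseSchema,
    // Delegate to the page's own purchase logic.
    async execute(args, agent) {
      return buyProduct(args, agent)
    }
  })
}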

Execution-time user interaction

During execute(..., agent), the page may call agent.requestUserInteraction(...) when confirmation or user input is required before continuing.

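async function buyProduct({ productId }, agent) {
  // Pause for the user; openConfirmDialog is assumed to resolve to true or false.
  const confirmed = await agent.requestUserInteraction(
    () => openConfirmDialog(productId)
  )

  // Reject so the agent learns the call did not complete.
  if (!confirmed) throw new Error('Purchase cancelled')

  // Reuse existing page logic, then bring the visible UI up to date.
  await executePurchase(productId)
  syncCheckoutUI()
  return { ok: true }
}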
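A handler like this can be exercised without a browser by passing in fakes. The sketch below is one way to do that, assuming requestUserInteraction(callback) resolves with whatever the callback returns. createStubAgent, createPage, and the extra page parameter on buyProduct are hypothetical test scaffolding, not part of the proposal.

```javascript
// Stub agent: assumes requestUserInteraction(callback) resolves with the callback's result.
function createStubAgent() {
  return {
    async requestUserInteraction(callback) {
      return await callback();
    },
  };
}

// Hypothetical page helpers, replaced with in-memory fakes.
function createPage({ userConfirms }) {
  const page = {
    purchased: [],
    uiSynced: false,
    openConfirmDialog: async () => userConfirms,
    executePurchase: async (productId) => {
      page.purchased.push(productId);
    },
    syncCheckoutUI: () => {
      page.uiSynced = true;
    },
  };
  return page;
}

// Same flow as the handler above, with page helpers injected for testing.
async function buyProduct({ productId }, agent, page) {
  const confirmed = await agent.requestUserInteraction(() => page.openConfirmDialog(productId));
  if (!confirmed) throw new Error("Purchase cancelled");
  await page.executePurchase(productId);
  page.syncCheckoutUI();
  return { ok: true };
}

async function runDemo() {
  const agent = createStubAgent();
  const accepted = createPage({ userConfirms: true });
  const result = await buyProduct({ productId: "sku-42" }, agent, accepted);

  const declined = createPage({ userConfirms: false });
  let declineMessage = null;
  try {
    await buyProduct({ productId: "sku-42" }, agent, declined);
  } catch (err) {
    declineMessage = err.message;
  }
  return { result, accepted, declined, declineMessage };
}

const demo = runDemo();
demo.then(({ result, declineMessage }) => console.log(result, declineMessage));
```

Injecting the page helpers keeps the confirm, purchase, and sync steps testable in isolation, which matters once the same function is reachable by both the UI and an agent.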

Why this matters in practice

The tool is page-scoped: no loaded page, no tool.
The tool contract must be explicit enough for an agent to choose it without DOM guesswork.
The execute path should reuse page logic and then update visible UI so the human can review the result.

What it is not

You only really understand WebMCP once the boundaries are clear.

The useful distinction is not “AI can use a website.” The real question is whether the agent acts through raw UI actuation, a backend integration, or page-owned tools.

Matrix view

Each row shows where WebMCP sits against another integration model.

WebMCP side

Page-defined tools in shared UI context

Comparison side

Alternative integration posture for the same job

01 Against DOM clicking

WebMCP vs browser automation

Both operate around a live page, but WebMCP raises the abstraction from selectors and gestures to page-defined tools — while still allowing automation fallback when the page exposes too little.

Use WebMCP when you want the page to define safe, explicit capabilities. Keep browser automation as fallback when the task is not covered by the page’s tools.

WebMCP

Calls capability-shaped tools instead of replaying low-level UI steps.
Can reuse existing page logic and structured arguments.
Does not conflict with automation fallback; it adds a higher-level path when the page intentionally exposes one.

Browser automation

Simulates clicks, typing, scrolling, and selector targeting.
Breaks more easily when UI layout or interaction timing changes.
Often has to infer business intent from visible controls alone.

02 Against server-side integration

WebMCP vs backend MCP

Both can expose tools to agents, but WebMCP is page-scoped and browser-mediated, while backend MCP is service-side and always-on.

Use WebMCP when the page itself is the working surface; use backend MCP when the capability should exist without a loaded page or browser UI.

WebMCP

Runs in page context with shared state, visible progress, and user oversight.
Fits collaborative UI workflows and client-heavy products.
Keeps the browser experience primary instead of treating UI as an afterthought.

Backend MCP

Runs service-to-service without needing a live page UI.
Fits headless, fully delegated, or server-owned operations well.
Does not inherently share the page’s live interface and interaction context.

03 Concept vs delivery layer

WebMCP vs FastWebMCP

WebMCP explains the capability model. FastWebMCP addresses the engineering problem of making a real product deliver that model safely and repeatedly.

WebMCP tells you what kind of interface the page should expose. FastWebMCP helps you actually ship that interface inside a real product.

WebMCP itself

Defines the page-level idea: tools exposed from web apps.
Clarifies where page context is valuable and why human-in-the-loop matters.
Is a protocol / platform direction, not a full product rollout system.

FastWebMCP

Adds runtime, bridges, adapters, route mapping, telemetry, and release packaging.
Helps teams retrofit existing products without rebuilding them from scratch.
Turns the concept into a CI-safe, multi-route delivery path.

Structural view

Three models, three different control surfaces.

The fastest way to understand WebMCP is to compare where the agent sits, where logic runs, and who owns the visible workflow.

WebMCP

Agent works with the live page through page-owned tools.

Shared UI context
Client-side tool execution
Visible collaborative workflow

Browser automation

Agent drives the page by imitating a human at the DOM layer.

Selector / timing dependence
Low-level UI actuation
Business intent inferred indirectly

Backend MCP

Agent talks directly to the service backend without needing the live page as the execution surface.

Server-to-server integration
Good for headless workflows
No inherent shared page context

How it works

A WebMCP-style flow is conceptually simple: the page exposes a capability, the agent calls it in context.

The current proposal already sketches the interaction model clearly: page-scoped registration, structured invocation, page-context execution, and browser-mediated user interaction when needed.

Swimlane diagram

Read each lane top to bottom: the user stays in the visible web app, the agent chooses among page tools, and the page executes in live state with browser-mediated access.

Page-owned tool boundary

Shared UI + state

Human-in-the-loop visibility

01 Expose a tool
The page calls navigator.modelContext.registerTool(...) with a description, input schema, and execute handler.
User: Starts with a goal inside the live product UI.
Agent: Sees the registered tool as a first-class action.
Page: Declares a tool boundary with description and schema.

02 Agent discovers it
The agent sees the tool after the page is loaded; page content and actuation remain available alongside the tool.
User: Stays in the same visible workspace and can supervise.
Agent: Chooses the tool instead of guessing from raw controls.
Page: Provides current state, auth context, and route-local logic.

03 Invoke in page context
The execute callback runs against live page state and can call agent.requestUserInteraction(...) when human confirmation or input is required.
User: Can observe progress and adjust intent if needed.
Agent: Invokes the capability with structured arguments.
Page: Executes using existing page logic and current session state.

04 Share results in UI
The page updates visible UI so the human interface stays primary and the user can review, steer, or take back over.
User: Reviews the resulting interface state and next options.
Agent: Continues with updated context instead of re-parsing everything.
Page: Reflects the result directly in the live UI.
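The four steps can be walked through end to end with in-memory stand-ins. In a browser the registry would be navigator.modelContext and the browser would mediate discovery and invocation. Here fakeModelContext, fakeAgent, pageState, and render are all hypothetical, and the fake user always confirms.

```javascript
// In-memory stand-ins; a real page would use navigator.modelContext and the
// browser would sit between the agent and the page.
const registry = new Map();
const fakeModelContext = {
  registerTool(tool) {
    registry.set(tool.name, tool);
  },
};
const fakeAgent = {
  // Assumes requestUserInteraction(callback) resolves with the callback's result.
  requestUserInteraction: async (callback) => callback(),
};

// Page state plus a "rendered" banner standing in for the visible UI.
const pageState = { cart: [], banner: "Cart is empty" };
function render() {
  pageState.banner = `Cart: ${pageState.cart.length} item(s)`;
}

// Step 01: the page exposes a tool.
fakeModelContext.registerTool({
  name: "add-to-cart",
  description: "Add a product to the cart.",
  inputSchema: {
    type: "object",
    properties: { productId: { type: "string" } },
    required: ["productId"],
  },
  async execute({ productId }, agent) {
    // Step 03 (page side): run against live state, pausing for the user if needed.
    const ok = await agent.requestUserInteraction(async () => true); // fake: user confirms
    if (!ok) throw new Error("Declined");
    pageState.cart.push(productId);
    render(); // Step 04: reflect the result in the visible UI.
    return { cartSize: pageState.cart.length };
  },
});

// Step 02: the agent discovers tools through their contracts.
const discovered = [...registry.values()].map(({ name, description }) => ({ name, description }));

// Step 03 (agent side): invoke with structured arguments.
const flowResult = registry.get("add-to-cart").execute({ productId: "sku-7" }, fakeAgent);
flowResult.then((result) => console.log(discovered, result, pageState.banner));
```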

Where it fits

WebMCP is strongest when the web page is part of the job, not just a UI shell around an API.

The deciding factor is whether shared page context and page-owned logic materially improve the workflow.

Best fit

Rich web apps where the user and agent should collaborate inside the same visible interface.
Client-heavy products that already contain important front-end business logic or interaction state.
Flows where page-level capability boundaries are safer and more stable than raw DOM actuation.

Not the first choice

Purely headless service-to-service operations with no meaningful page context.
Fully autonomous background workflows where no browser UI or human oversight is needed.
Cases that are already cleanly solved by direct backend MCP or conventional API integrations.

Developer obligations

WebMCP does not remove implementation work; it changes where that work should be invested.

The proposal lowers the cost of exposing page capabilities, but teams still have to do product and frontend engineering work. If these obligations are ignored, the result becomes a demo instead of a reliable product surface.

Refactor page logic into reusable capability boundaries

If the page only works through click handlers and local component state, developers usually need to extract stable functions before a tool can be exposed cleanly.

Good WebMCP tools usually map to existing product actions, not random UI fragments.
FastWebMCP becomes useful here because it gives those extracted capabilities a repeatable bridge and rollout layer.
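The extraction step might look something like the sketch below: logic that used to live inside a click handler moves into a plain function, and both the button and a tool call it. The appState shape, archiveItem, and the tool definition are hypothetical, and the click event is simulated so the sketch runs without a DOM.

```javascript
// Hypothetical app state and product action; names are illustrative.
const appState = {
  items: [
    { id: "a", archived: false },
    { id: "b", archived: false },
  ],
};

// Extracted capability: no DOM access, no component-local state.
function archiveItem(state, itemId) {
  const item = state.items.find((entry) => entry.id === itemId);
  if (!item) throw new Error(`Unknown item: ${itemId}`);
  item.archived = true;
  return { id: item.id, archived: item.archived };
}

// UI path: the click handler becomes a thin wrapper.
function onArchiveClick(event) {
  return archiveItem(appState, event.target.dataset.itemId);
}

// Tool path: the same function behind a tool definition.
const archiveTool = {
  name: "archive-item",
  description: "Archive an item in the current list by ID.",
  inputSchema: {
    type: "object",
    properties: { itemId: { type: "string" } },
    required: ["itemId"],
  },
  async execute({ itemId }) {
    return archiveItem(appState, itemId);
  },
};

// Simulate both paths without a DOM.
const clickResult = onArchiveClick({ target: { dataset: { itemId: "a" } } });
const toolResult = archiveTool.execute({ itemId: "b" });
toolResult.then((result) => console.log(clickResult, result));
```

Keeping one implementation behind both entry points means the human path and the agent path cannot drift apart.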

Keep visible UI synchronized with tool-driven state changes

The proposal is explicit that tool calls run in page JavaScript and the page should reflect resulting state changes back into the UI.

If the agent changes app state but the UI does not reflect it, trust collapses immediately.
This is especially important for review, undo, approval, and multi-step flows.
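One way to keep the UI in step is to route every state change, whether from a click or a tool call, through a single store that the view subscribes to. The sketch below assumes a minimal hand-rolled store. createStore, the view object, and the submit-draft tool are hypothetical stand-ins for whatever state management and rendering the page already has.

```javascript
// Minimal observable store, standing in for the page's existing state management.
function createStore(initialState) {
  let state = initialState;
  const listeners = new Set();
  return {
    getState: () => state,
    setState(patch) {
      state = { ...state, ...patch };
      listeners.forEach((listener) => listener(state));
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

const store = createStore({ status: "draft" });

// Stand-in for DOM rendering: in a page this would update real elements.
const view = { statusLabel: "Status: draft" };
store.subscribe((state) => {
  view.statusLabel = `Status: ${state.status}`;
});

// The tool handler changes state only through the store, so the view follows.
const submitTool = {
  name: "submit-draft",
  description: "Submit the current draft for review.",
  inputSchema: { type: "object", properties: {} },
  async execute() {
    store.setState({ status: "submitted" });
    return { status: store.getState().status };
  },
};

const submitted = submitTool.execute({});
submitted.then((result) => console.log(result, view.statusLabel));
```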

Design explicit consent and interruption paths

Browser-mediated access and requestUserInteraction(...) give the page a structured way to pause for user confirmation, auth, or missing input.

Treat confirmation UX as product design, not just a technical callback.
High-risk actions should stay inspectable and reversible from the visible interface.
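A reusable consent pattern could be a small wrapper that forces a confirmation step in front of any high-risk handler. The sketch assumes requestUserInteraction(callback) resolves with the callback's result. withConfirmation, the scripted user answers, and the account example are hypothetical.

```javascript
// Hypothetical higher-order helper: wraps an execute handler so it pauses
// for user confirmation before doing anything.
function withConfirmation(execute, askUser) {
  return async function confirmedExecute(args, agent) {
    const approved = await agent.requestUserInteraction(() => askUser(args));
    if (!approved) {
      return { ok: false, reason: "User declined" };
    }
    return execute(args, agent);
  };
}

// Fake account state and a destructive action.
const account = { deleted: false };
async function deleteAccount() {
  account.deleted = true;
  return { ok: true };
}

// Stub agent plus a fake dialog that answers "no" first, then "yes".
const stubAgent = { requestUserInteraction: async (callback) => callback() };
const scriptedAnswers = [false, true];
const askToDelete = async () => scriptedAnswers.shift();

const guardedDelete = withConfirmation(deleteAccount, askToDelete);

const outcomes = (async () => {
  const declined = await guardedDelete({}, stubAgent);
  const deletedAfterDecline = account.deleted;
  const approved = await guardedDelete({}, stubAgent);
  return { declined, deletedAfterDecline, approved, deletedAfterApprove: account.deleted };
})();
outcomes.then((result) => console.log(result));
```

Whether a declined confirmation should return a structured result, as here, or throw is a design choice for the page; either way the destructive action never runs without approval.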

Choose WebMCP only where page context truly adds value

Not every capability belongs in a page-scoped tool. Some work is still better served by backend MCP, direct APIs, or ordinary browser automation fallback.

Use WebMCP when shared UI, client-side logic, or human oversight materially improves the workflow.
Avoid forcing page tools onto jobs that should exist without a loaded browser surface.

Implementation takeaway

A good beta-quality WebMCP product is not just “page has tools.” It has extracted capability boundaries, synchronized UI, consent patterns, and a clear line between page tools, backend integrations, and automation fallback.

Where FastWebMCP fits

FastWebMCP is the engineering layer that helps teams turn the WebMCP idea into a shippable product surface.

If WebMCP explains what the page-agent interface should look like, FastWebMCP focuses on how real teams retrofit routes, model execution modes, package adapters, and ship through CI/CD.

Route-aware runtime

Map capability rollout to real pages and product surfaces instead of scattering custom scripts.

Bridge and adapter system

Choose markup, callback, or endpoint-style execution depending on how the product actually behaves.

Delivery and release posture

Package, test, publish, and consume the integration through repeatable deployment workflows.

FastWebMCP

From concept to rollout surface

FastWebMCP takes the WebMCP-style idea of page-level tools and turns it into an engineering delivery path: route-aware runtime, bridge modes, reusable adapters, release artifacts, and CI-safe package consumption.

References

Reference documents behind this page

These links are the normative or supporting materials this explainer is anchored to. The first two are the most important because they define the proposal’s current language, API shape, goals, and non-goals.

README

August 13, 2025

webmachinelearning/webmcp README

High-level explainer covering motivation, goals, non-goals, and why WebMCP is about human-in-the-loop, page-context workflows.

Authors: Brandon Walderman, Leo Lee, Andrew Nolan, David Bokan, Khushal Sagar, Hannah Van Opstal
Date: August 13, 2025
Why it is cited here: Use this as the high-level source for goals, non-goals, human-in-the-loop positioning, and the distinction between page tools, actuation, and backend integrations.

proposal.md

August 13, 2025

WebMCP API proposal (proposal.md)

Primary API reference for modelContext, registerTool/unregisterTool, sequential page-side execution, requestUserInteraction, and current limitations.

Authors: Brandon Walderman, Andrew Nolan, David Bokan, Khushal Sagar, Hannah Van Opstal
Date: August 13, 2025
Why it is cited here: Use this as the primary source for API shape: modelContext, registerTool/unregisterTool, requestUserInteraction, sequential page-side execution, and explicit limitations.

MCP

Living documentation

Model Context Protocol introduction

Useful background for understanding what WebMCP is not replacing: backend MCP remains the right fit for server-side and headless integrations.

Authors: Model Context Protocol project
Date: Living documentation
Why it is cited here: Use this as supporting context for what backend MCP is good at, so readers understand that WebMCP complements rather than replaces service-side MCP.

If the proposal evolves, this page should be updated against those upstream documents rather than drift into marketing copy.