How MCP Apps Add UI to MCP
MCP standardizes how LLMs interact with tools and data, but it doesn't cover interactive UI. MCP Apps fills that gap.

Introduction to MCP

MCP, the Model Context Protocol, gives an LLM a standard way to work with external APIs. An MCP server can expose three types of primitives:

  • tools: actions the LLM can perform
  • resources: read-only content that gives the LLM context
  • prompts: reusable prompt templates

All of this focuses on providing the LLM with context: which tools it can use, what data it can read, even its behavior (persona or skills via prompts).

What's missing: a standardized way for a server to deliver an interactive UI to the user.

Why MCP Apps exists

Because MCP does not define how to deliver an interactive UI to the client, the only option is a custom implementation. That is fine when you build your own chatbot for your own website, but it does not work for users who reach your service through a public LLM chatbot.

MCP Apps is the standard that solves this. It lets an MCP server associate a tool with a UI resource, and it defines how the host should render that UI and communicate with it.

The important point: that UI is part of the MCP system and not just an unrelated webpage.

The basic model

MCP Apps introduces three main pieces:

  • UI resources: MCP resources declared with the ui:// URI scheme
  • metadata: fields you add to a tool to link it to a UI resource
  • bidirectional communication between the host and the embedded UI using JSON-RPC

A UI resource is still an MCP resource, but with specific rules. In the current specification, the main content type is HTML with the MIME type text/html;profile=mcp-app.

At a high level, the flow is:

  1. An MCP server exposes a tool with _meta.ui.resourceUri pointing to a UI resource.
  2. The host discovers the tool via tools/list and reads the UI metadata. It may prefetch the UI resource at this point.
  3. The model calls the tool.
  4. The host executes the tool call and fetches the UI resource (if not already cached).
  5. The host tells its frontend to render the UI.
  6. The frontend renders the view in a sandboxed iframe.
  7. The iframe and the host exchange MCP-style JSON-RPC messages via postMessage. The host proxies tool calls and resource reads to the MCP server.

MCP Apps therefore does not replace tools but extends them through additional conventions.
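
Steps 2 and 4 of the flow can be sketched from the host's point of view. The TypeScript below is a minimal illustration, not the specification's API: the ToolDescriptor type and the uiResourceFor function are hypothetical names, and only the _meta.ui.resourceUri field and the ui:// scheme come from the description above.

```typescript
// Hypothetical, trimmed-down shape of an entry returned by tools/list.
interface ToolDescriptor {
  name: string;
  _meta?: { ui?: { resourceUri?: string } };
}

// Returns the UI resource linked to a tool, or undefined when the tool
// is unknown, has no UI, or declares a URI outside the ui:// scheme.
function uiResourceFor(tools: ToolDescriptor[], toolName: string): string | undefined {
  for (const tool of tools) {
    if (tool.name !== toolName) continue;
    const uri = tool._meta?.ui?.resourceUri;
    return uri !== undefined && uri.indexOf("ui://") === 0 ? uri : undefined;
  }
  return undefined;
}
```

A host could call such a function right after tools/list to decide which resources to prefetch, and again when the model calls a tool to decide whether a view needs to be rendered.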

Linking tools to UI and controlling visibility

As mentioned earlier, the connection between a tool and its UI lives in the tool's metadata, more precisely in _meta.ui: its resourceUri field identifies the UI resource to render.

But not all tools are meant for the LLM: some exist purely to serve the iframe. The field _meta.ui.visibility controls who can use a tool:

  • model means the tool is visible to and callable by the model
  • app means the tool is callable by the embedded app

If visibility is omitted, the default is both model and app.

Here is what that looks like in practice. The initiate_payment tool is visible to the model and points to a UI resource. The get_auth_challenge_options tool is app-only, so the model never sees it but the embedded view can call it directly:

// Model-visible tool — triggers the UI
{
  "name": "initiate_payment",
  "description": "Initiate a payment with the specified payment method.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "paymentMethodId": { "type": "integer" }
    },
    "required": ["paymentMethodId"]
  },
  "_meta": {
    "ui": {
      "resourceUri": "ui://payments/payment-view",
      "visibility": ["model"]
    }
  }
}

// App-only tool — callable by the iframe, hidden from the model
{
  "name": "get_auth_challenge_options",
  "description": "Called by the payment view to retrieve authentication options.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "secureToken": { "type": "string" },
      "browserData": {}
    },
    "required": ["secureToken", "browserData"]
  },
  "_meta": {
    "ui": {
      "visibility": ["app"]
    }
  }
}
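
Given tool definitions like the two above, a host could apply the visibility rules with a small filter. This is a sketch under stated assumptions: ToolEntry, isVisibleTo, toolsForModel, and toolsForApp are hypothetical names, and the only behavior taken from the text is the meaning of model and app and the default of both when visibility is omitted.

```typescript
type Audience = "model" | "app";

// Hypothetical minimal shape of a tool entry; only the fields used here.
interface ToolEntry {
  name: string;
  _meta?: { ui?: { resourceUri?: string; visibility?: Audience[] } };
}

// When visibility is omitted, the tool is available to both model and app.
const DEFAULT_VISIBILITY: Audience[] = ["model", "app"];

function isVisibleTo(tool: ToolEntry, audience: Audience): boolean {
  const visibility = tool._meta?.ui?.visibility ?? DEFAULT_VISIBILITY;
  return visibility.some((entry) => entry === audience);
}

// Tools the host would advertise to the model.
function toolsForModel(tools: ToolEntry[]): ToolEntry[] {
  return tools.filter((tool) => isVisibleTo(tool, "model"));
}

// Tools the host would accept in a tools/call coming from the embedded view.
function toolsForApp(tools: ToolEntry[]): ToolEntry[] {
  return tools.filter((tool) => isVisibleTo(tool, "app"));
}
```

With the two tools from the example, toolsForModel would keep only initiate_payment and toolsForApp would keep only get_auth_challenge_options, while a tool without any visibility field would appear in both lists.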

More about UI resources

The UI itself is delivered through resources/read, just like any other MCP resource. A UI resource:

  • uses the ui:// URI scheme
  • is returned through resources/read
  • typically uses the MIME type text/html;profile=mcp-app
  • may include _meta.ui data such as CSP, permissions, domain hints, and display preferences

This creates a clean separation of command and UI:

  • the tool provides the action and the structured result
  • the resource provides the view and its rendering constraints

resources/list tells the host which UI resources exist:

{
  "resources": [
    {
      "name": "Payment View",
      "uri": "ui://payments/payment-view",
      "description": "UI resource for the payment view.",
      "mimeType": "text/html;profile=mcp-app"
    },
    {
      "name": "Card Form View",
      "uri": "ui://setup/card-form-view",
      "description": "UI resource for the card entry form.",
      "mimeType": "text/html;profile=mcp-app"
    }
  ]
}

And resources/read returns the actual HTML along with the CSP constraints the host must enforce:

{
  "contents": [{
    "uri": "ui://payments/payment-view",
    "mimeType": "text/html;profile=mcp-app",
    "_meta": {
      "ui": {
        "csp": {
          "frameDomains":    ["https://api.example.com", "https://auth.partner.com"],
          "resourceDomains": ["https://api.example.com"],
          "connectDomains":  ["https://api.example.com"]
        },
        "prefersBorder": false
      }
    },
    "text": "..."
  }]
}
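
To make the enforcement step concrete, here is one way a host might turn the csp metadata into a Content-Security-Policy string. Treat it as a sketch: the mapping of frameDomains, resourceDomains, and connectDomains to specific directives, the 'unsafe-inline' and data: allowances, and the names UiCsp, sourceList, and buildCsp are all assumptions made for illustration, so check the MCP Apps specification for the authoritative rules.

```typescript
// Shape of the _meta.ui.csp object shown above; every field is optional here.
interface UiCsp {
  connectDomains?: string[];
  resourceDomains?: string[];
  frameDomains?: string[];
}

// Joins fixed sources and declared domains, falling back to 'none'
// when nothing at all is allowed for a directive.
function sourceList(domains: string[] | undefined, extra: string[] = []): string {
  const sources = extra.concat(domains ?? []);
  return sources.length > 0 ? sources.join(" ") : "'none'";
}

// Assumed mapping: deny everything by default, then open up only the
// declared domains. Inline scripts and styles are allowed on the
// assumption that the view is a single self-contained HTML document.
function buildCsp(csp: UiCsp = {}): string {
  const resources = csp.resourceDomains;
  return [
    "default-src 'none'",
    "script-src " + sourceList(resources, ["'unsafe-inline'"]),
    "style-src " + sourceList(resources, ["'unsafe-inline'"]),
    "img-src " + sourceList(resources, ["data:"]),
    "font-src " + sourceList(resources),
    "connect-src " + sourceList(csp.connectDomains),
    "frame-src " + sourceList(csp.frameDomains),
  ].join("; ");
}
```

The deny-by-default shape is the point of the design: a view that declares no domains gets no network access at all, and each declared domain widens exactly one category of access.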

Why an iframe?

The specification is deliberately strict here: the host should render the view through a sandboxed iframe and communicate with it through an intermediate sandbox proxy.

The host page and the sandbox must have different origins. The sandbox must run with allow-scripts and allow-same-origin. The raw HTML resource is then loaded into that controlled environment, not directly into the host page.

That creates a clear boundary between the host application, the sandbox proxy, and the embedded MCP App view.

Once the iframe is running, the view and the host talk through postMessage using JSON-RPC 2.0.

The lifecycle starts with a UI-specific handshake:

  1. The view sends ui/initialize.
  2. The host replies with the protocol version, host capabilities, host info, and host context. The host context includes things like the current color theme, display mode, container dimensions, and locale, so the view can render consistently with the host's environment.
  3. The view sends ui/notifications/initialized.

After that, the host can provide the tool input and tool result to the view, and the view can start making requests.
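
The three-step handshake can be modeled on the host side as a small state machine. The sketch below is illustrative only: the HostHandshake class, the result field names (protocolVersion, hostCapabilities, hostInfo, hostContext), and the placeholder values are assumptions. Only the two method names and the ordering come from the steps above.

```typescript
interface UiMessage {
  jsonrpc: "2.0";
  method: string;
  id?: number;
  params?: unknown;
}

interface UiResponse {
  jsonrpc: "2.0";
  id: number;
  result: Record<string, unknown>;
}

type HandshakeState = "waiting" | "initializing" | "ready";

class HostHandshake {
  private state: HandshakeState = "waiting";
  private hostContext: Record<string, unknown>;

  constructor(hostContext: Record<string, unknown>) {
    this.hostContext = hostContext;
  }

  getState(): HandshakeState {
    return this.state;
  }

  // Returns a response for ui/initialize, and undefined for anything else.
  handle(message: UiMessage): UiResponse | undefined {
    if (message.method === "ui/initialize" && message.id !== undefined && this.state === "waiting") {
      this.state = "initializing";
      return {
        jsonrpc: "2.0",
        id: message.id, // JSON-RPC responses echo the request id
        result: {
          protocolVersion: "example-version", // placeholder, not a real version
          hostCapabilities: {},
          hostInfo: { name: "example-host", version: "0.0.0" }, // placeholder
          hostContext: this.hostContext,
        },
      };
    }
    if (message.method === "ui/notifications/initialized" && this.state === "initializing") {
      this.state = "ready"; // from here on, tool input and result may be sent
    }
    return undefined;
  }
}
```

Tracking the state explicitly lets the host hold back the tool input and tool result until the view has confirmed that it is initialized.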

The message set includes lifecycle messages, ping, notifications/message, resources/read, and tools/call, plus UI-specific requests such as display-mode and resize handling.

The sandbox proxy

The sandbox proxy is the isolation boundary that sits between the host and the raw HTML view.

The expected flow is:

  1. The sandbox signals readiness with ui/notifications/sandbox-proxy-ready.
  2. The host sends ui/notifications/sandbox-resource-ready with the raw HTML resource and its metadata.
  3. The sandbox loads the HTML with the declared restrictions.
  4. The sandbox forwards messages in both directions.

The sandbox proxy does not contain any business logic: it is purely an isolation boundary and a forwarding layer.
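
The four steps of the proxy flow fit in a few lines once the browser plumbing is abstracted away. In this sketch, the SandboxProxy class and the ProxyPorts interface are hypothetical, the html parameter name is an assumption about the notification payload, and buffering or origin checks are left out. In a real page the three ports would wrap postMessage calls and the creation of the inner iframe.

```typescript
interface RpcMessage {
  jsonrpc: "2.0";
  method?: string;
  id?: number;
  params?: unknown;
  result?: unknown;
}

// Hypothetical abstraction over postMessage and iframe creation.
interface ProxyPorts {
  sendToHost(message: RpcMessage): void;
  sendToView(message: RpcMessage): void;
  loadView(html: string): void;
}

class SandboxProxy {
  private ports: ProxyPorts;
  private viewLoaded = false;

  constructor(ports: ProxyPorts) {
    this.ports = ports;
  }

  // Step 1: tell the host that the proxy is ready.
  start(): void {
    this.ports.sendToHost({ jsonrpc: "2.0", method: "ui/notifications/sandbox-proxy-ready" });
  }

  onHostMessage(message: RpcMessage): void {
    // Steps 2 and 3: receive the raw HTML and load it into the view.
    if (message.method === "ui/notifications/sandbox-resource-ready") {
      const html = (message.params as { html?: unknown } | undefined)?.html;
      if (typeof html === "string") {
        this.ports.loadView(html);
        this.viewLoaded = true;
      }
      return;
    }
    // Step 4 (host to view): forward once a view exists; earlier
    // messages are dropped in this simplified sketch.
    if (this.viewLoaded) this.ports.sendToView(message);
  }

  // Step 4 (view to host): forward unchanged.
  onViewMessage(message: RpcMessage): void {
    this.ports.sendToHost(message);
  }
}
```

Note that the proxy never inspects or rewrites the forwarded messages, which is what keeps it free of business logic.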

Conclusion

MCP Apps lets a conversational workflow stay conversational until a real UI is needed. At that point, the protocol provides a standard and secure way to move from text into an embedded interface.