Introduction to MCP
MCP, the Model Context Protocol, gives an LLM a standard way to work with external APIs. An MCP server can expose three types of primitives:
- tools: to allow the LLM to perform actions
- resources: read-only content to provide context to the LLM
- prompts: to expose reusable prompt templates
All of this focuses on providing the LLM with context: which tools it can use, what data it can read, even its behavior (persona or skills via prompts).
What's missing: a standardized way for a server to deliver an interactive UI to the user.
Why MCP Apps exists
Because MCP does not define how to deliver an interactive UI to the client, the only option is a custom implementation. That is fine when you build your own chatbot on your own website, but it does not work for users of a public LLM chatbot that connects to your server.
MCP Apps is the standard to solve that. It lets an MCP server associate a tool with a UI resource, and it defines how the host should render that UI and communicate with it.
The important point: that UI is part of the MCP system and not just an unrelated webpage.
The basic model
MCP Apps introduces three main pieces:
- UI resources: MCP resources declared with the ui:// URI scheme
- metadata that you add to a tool to link it to a UI resource
- bidirectional communication between the host and the embedded UI using JSON-RPC
A UI resource is still an MCP resource, but with specific rules. In the current specification, the main content type is HTML with the MIME type text/html;profile=mcp-app.
At a high level, the flow is:
- An MCP server exposes a tool with _meta.ui.resourceUri pointing to a UI resource.
- The host discovers the tool via tools/list and reads the UI metadata. It may prefetch the UI resource at this point.
- The model calls the tool.
- The host executes the tool call and fetches the UI resource (if not already cached).
- The host tells its frontend to render the UI.
- The frontend renders the view in a sandboxed iframe.
- The iframe and the host exchange MCP-style JSON-RPC messages via postMessage. The host proxies tool calls and resource reads to the MCP server.
MCP Apps therefore does not replace tools but extends them through additional conventions.
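As a rough illustration of the discovery step, here is how a host might scan a tools/list result and collect the UI resources worth prefetching. This is a sketch, not the reference implementation: the helper name and the sample tools are hypothetical, and only the _meta.ui.resourceUri field and the ui:// scheme come from the description above.

```python
from typing import Any


def ui_resource_uris(tools: list[dict[str, Any]]) -> dict[str, str]:
    """Map tool name -> UI resource URI for every tool that declares one.

    Hypothetical host-side helper: it only reads _meta.ui.resourceUri and
    ignores values that do not use the ui:// scheme.
    """
    linked: dict[str, str] = {}
    for tool in tools:
        ui_meta = (tool.get("_meta") or {}).get("ui") or {}
        uri = ui_meta.get("resourceUri")
        if isinstance(uri, str) and uri.startswith("ui://"):
            linked[tool["name"]] = uri
    return linked


# Sample tools/list payload (illustrative values only).
sample_tools = [
    {"name": "initiate_payment",
     "_meta": {"ui": {"resourceUri": "ui://payments/payment-view"}}},
    {"name": "get_balance"},  # hypothetical plain tool, no UI attached
]

print(ui_resource_uris(sample_tools))
# {'initiate_payment': 'ui://payments/payment-view'}
```

Tools without UI metadata simply fall through, which matches the idea that MCP Apps extends ordinary tools without changing them.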
Linking tools to UI and controlling visibility
As mentioned earlier, the connection between a tool and its UI lives in the tool metadata, more precisely inside _meta.ui: the resourceUri field identifies the UI resource to render.
But not all tools are meant for the LLM: some exist purely to serve the iframe. The field _meta.ui.visibility controls who can use a tool:
- model means the tool is visible to and callable by the model
- app means the tool is callable by the embedded app
If visibility is omitted, the default is both model and app.
Here is what that looks like in practice. The initiate_payment tool is visible to the model and points to a UI resource. The get_auth_challenge_options tool is app-only — the model never sees it, but the embedded view can call it directly:
// Model-visible tool — triggers the UI
{
"name": "initiate_payment",
"description": "Initiate a payment with the specified payment method.",
"inputSchema": {
"type": "object",
"properties": {
"paymentMethodId": { "type": "integer" }
},
"required": ["paymentMethodId"]
},
"_meta": {
"ui": {
"resourceUri": "ui://payments/payment-view",
"visibility": ["model"]
}
}
}
// App-only tool — callable by the iframe, hidden from the model
{
"name": "get_auth_challenge_options",
"description": "Called by the payment view to retrieve authentication options.",
"inputSchema": {
"type": "object",
"properties": {
"secureToken": { "type": "string" },
"browserData": {}
},
"required": ["secureToken", "browserData"]
},
"_meta": {
"ui": {
"visibility": ["app"]
}
}
}
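The visibility rule can be sketched as a small host-side filter. Treat this as an illustration under assumptions: the helper names are hypothetical, the first two tool definitions are trimmed copies of the examples above, and list_payment_methods is an invented tool that shows the default.

```python
from typing import Any

DEFAULT_VISIBILITY = ("model", "app")  # applied when visibility is omitted


def visibility_of(tool: dict[str, Any]) -> tuple[str, ...]:
    """Read _meta.ui.visibility, falling back to the default."""
    ui_meta = (tool.get("_meta") or {}).get("ui") or {}
    return tuple(ui_meta.get("visibility", DEFAULT_VISIBILITY))


def tools_for(audience: str, tools: list[dict[str, Any]]) -> list[str]:
    """Names of the tools that the given audience ("model" or "app") may call."""
    return [t["name"] for t in tools if audience in visibility_of(t)]


tools = [
    {"name": "initiate_payment",
     "_meta": {"ui": {"resourceUri": "ui://payments/payment-view",
                      "visibility": ["model"]}}},
    {"name": "get_auth_challenge_options",
     "_meta": {"ui": {"visibility": ["app"]}}},
    {"name": "list_payment_methods"},  # hypothetical tool, visibility omitted
]

print(tools_for("model", tools))  # ['initiate_payment', 'list_payment_methods']
print(tools_for("app", tools))    # ['get_auth_challenge_options', 'list_payment_methods']
```

A host would presumably apply the "model" filter before handing the tool list to the LLM, and the "app" filter before proxying a tools/call that comes from the iframe.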
More about UI resources
The UI itself is delivered through resources/read, just like any other MCP resource. Here's its format and metadata:
- uses the ui:// URI scheme
- is returned through resources/read
- typically uses the MIME type text/html;profile=mcp-app
- may include _meta.ui data such as CSP, permissions, domain hints, and display preferences
This creates a clean separation of command and UI:
- the tool provides the action and the structured result
- the resource provides the view and its rendering constraints.
resources/list tells the host which UI resources exist:
{
"resources": [
{
"name": "Payment View",
"uri": "ui://payments/payment-view",
"description": "UI resource for the payment view.",
"mimeType": "text/html;profile=mcp-app"
},
{
"name": "Card Form View",
"uri": "ui://setup/card-form-view",
"description": "UI resource for the card entry form.",
"mimeType": "text/html;profile=mcp-app"
}
]
}
And resources/read returns the actual HTML along with the CSP constraints the host must enforce:
{
"contents": [{
"uri": "ui://payments/payment-view",
"mimeType": "text/html;profile=mcp-app",
"_meta": {
"ui": {
"csp": {
"frameDomains": ["https://api.example.com", "https://auth.partner.com"],
"resourceDomains": ["https://api.example.com"],
"connectDomains": ["https://api.example.com"]
},
"prefersBorder": false
}
},
"text": "..."
}]
}
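To make the CSP metadata more concrete, the sketch below turns the csp object from the response above into a Content-Security-Policy header value. The mapping from frameDomains, resourceDomains, and connectDomains to CSP directives is an assumption made for illustration, as is the build_csp name. A real host should follow the mapping in the specification and will likely need extra sources, for example to allow the view's inline scripts.

```python
def build_csp(csp_meta: dict) -> str:
    """Build a Content-Security-Policy value from _meta.ui.csp.

    Assumed mapping (illustrative, not taken from the specification):
      connectDomains  -> connect-src
      resourceDomains -> img-src, script-src, style-src, font-src
      frameDomains    -> frame-src
    """
    def sources(key: str) -> str:
        # An absent or empty list means nothing is allowed for that directive.
        return " ".join(csp_meta.get(key, [])) or "'none'"

    directives = {
        "default-src": "'none'",
        "connect-src": sources("connectDomains"),
        "img-src": sources("resourceDomains"),
        "script-src": sources("resourceDomains"),
        "style-src": sources("resourceDomains"),
        "font-src": sources("resourceDomains"),
        "frame-src": sources("frameDomains"),
    }
    return "; ".join(f"{name} {value}" for name, value in directives.items())


csp = {
    "frameDomains": ["https://api.example.com", "https://auth.partner.com"],
    "resourceDomains": ["https://api.example.com"],
    "connectDomains": ["https://api.example.com"],
}

print(build_csp(csp))
```

Starting from default-src 'none' keeps the policy deny-by-default, so the view can only reach the domains its server declared.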
Why an iframe?
The specification is deliberately strict here: the host should render the view through a sandboxed iframe and communicate with it through an intermediate sandbox proxy.
The host page and the sandbox must have different origins. The sandbox must run with allow-scripts and allow-same-origin. The raw HTML resource is then loaded into that controlled environment, not directly into the host page.
That creates a clear boundary between the host application, the sandbox proxy, and the embedded MCP App view.
Once the iframe is running, the view and the host talk through postMessage using JSON-RPC 2.0.
The lifecycle starts with a UI-specific handshake:
- The view sends ui/initialize.
- The host replies with protocol version, host capabilities, host info, and host context, which includes things like the current color theme, display mode, container dimensions, and locale, so the view can render consistently with the host's environment.
- The view sends ui/notifications/initialized.
After that, the host can provide the tool input and tool result to the view, and the view can start making requests.
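The handshake can be modelled with plain JSON-RPC 2.0 messages. The sketch below only builds the dictionaries that would travel over postMessage. The result field names (protocolVersion, hostCapabilities, hostInfo, hostContext), every value inside them, and the host_handle function are assumptions for illustration; only the two method names come from the steps above.

```python
import itertools
from typing import Optional

_next_id = itertools.count(1)


def request(method: str, params: Optional[dict] = None) -> dict:
    """JSON-RPC 2.0 request: carries an id and expects a response."""
    return {"jsonrpc": "2.0", "id": next(_next_id),
            "method": method, "params": params or {}}


def notification(method: str, params: Optional[dict] = None) -> dict:
    """JSON-RPC 2.0 notification: no id, so no response is expected."""
    return {"jsonrpc": "2.0", "method": method, "params": params or {}}


def host_handle(message: dict) -> Optional[dict]:
    """Hypothetical host handler that covers only the handshake."""
    if message["method"] == "ui/initialize":
        result = {  # field names and values are illustrative assumptions
            "protocolVersion": "example-version",
            "hostCapabilities": {},
            "hostInfo": {"name": "example-host", "version": "0.0.0"},
            "hostContext": {"theme": "dark", "displayMode": "inline",
                            "locale": "en-US"},
        }
        return {"jsonrpc": "2.0", "id": message["id"], "result": result}
    return None  # notifications get no reply


# View side: the three handshake steps in order.
init = request("ui/initialize")
reply = host_handle(init)
ack = notification("ui/notifications/initialized")

print(reply["id"] == init["id"])  # True: the response echoes the request id
print(host_handle(ack))           # None
```

The request/notification split matters here: ui/initialize needs an answer, so it carries an id, while ui/notifications/initialized is fire-and-forget.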
The message set includes lifecycle messages, ping, notifications/message, resources/read, and tools/call, plus UI-specific requests such as display-mode and resize handling.
The sandbox proxy
The sandbox proxy is the isolation boundary that sits between the host and the raw HTML view.
The expected flow is:
- The sandbox signals readiness with ui/notifications/sandbox-proxy-ready.
- The host sends ui/notifications/sandbox-resource-ready with the raw HTML resource and its metadata.
- The sandbox loads the HTML with the declared restrictions.
- The sandbox forwards messages in both directions.
The proxy contains no business logic: it is only an isolation boundary and a forwarding layer.
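The flow above can be modelled as a toy in-memory simulation. There is no real iframe or postMessage here, so this only shows the ordering and the forwarding role. The SandboxProxy class, its method names, and the "html" params key are assumptions for illustration; the two notification names come from the steps above.

```python
from typing import Callable, Optional

Message = dict
Send = Callable[[Message], None]


class SandboxProxy:
    """Toy model of the sandbox proxy: it stores the HTML and forwards messages."""

    def __init__(self, to_host: Send, to_view: Send) -> None:
        self.to_host = to_host
        self.to_view = to_view
        self.html: Optional[str] = None
        # Step 1: tell the host the sandbox is ready.
        self.to_host({"jsonrpc": "2.0",
                      "method": "ui/notifications/sandbox-proxy-ready"})

    def from_host(self, message: Message) -> None:
        if message.get("method") == "ui/notifications/sandbox-resource-ready":
            # Steps 2-3: keep the HTML. A real proxy would load it into an
            # inner iframe with the declared restrictions.
            self.html = message["params"]["html"]  # "html" key is an assumption
        elif self.html is not None:
            self.to_view(message)  # step 4: forward host -> view

    def from_view(self, message: Message) -> None:
        if self.html is not None:
            self.to_host(message)  # step 4: forward view -> host


host_inbox: list = []
view_inbox: list = []
proxy = SandboxProxy(host_inbox.append, view_inbox.append)
proxy.from_host({"jsonrpc": "2.0",
                 "method": "ui/notifications/sandbox-resource-ready",
                 "params": {"html": "<p>payment view</p>"}})
proxy.from_view({"jsonrpc": "2.0", "id": 1, "method": "ui/initialize", "params": {}})
proxy.from_host({"jsonrpc": "2.0", "id": 1, "result": {}})

print([m.get("method") for m in host_inbox])
# ['ui/notifications/sandbox-proxy-ready', 'ui/initialize']
print(view_inbox)  # [{'jsonrpc': '2.0', 'id': 1, 'result': {}}]
```

Apart from the two sandbox notifications it handles itself, the proxy never inspects or rewrites a message.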
Conclusion
MCP Apps lets a conversational workflow stay conversational until a real UI is needed. At that point, the protocol provides a standard and secure way to move from text into an embedded interface.