
Introduction to MCP: What You Need to Know

MCPBundles · 7 min read

For the first month or two of building agents on Claude, I spent most of my debugging time staring at hallucinated API endpoints. The model would confidently POST to URLs that didn't exist, invent function names, and produce JSON that wouldn't parse. Switching the integration over to the Model Context Protocol (MCP) didn't fix everything, but it removed the entire class of "the model made up an endpoint" failures, which were the bulk of what we were chasing.

The rest of this post is the explanation I wish I'd had on day one.

A practical introduction to the Model Context Protocol (MCP) with real examples, common pitfalls, and why it matters for building AI agents that actually work.

The Problem MCP Solves

Before MCP, the options for connecting an LLM to a real system were all bad in their own way. You could paste API docs into the prompt and hope the model generated valid calls — which worked often enough to be tempting and rarely enough to ship a feature. You could write a custom integration per host (we ended up with one for Claude Desktop, one for an internal tool, and a half-finished one for an early Cursor build). Or you could let the agent drive a browser against the website, which is slow, fragile, and an absolute nightmare to debug.

MCP is mostly an attempt to make option one actually work: instead of putting the API surface in the prompt, the host asks the server what's available at runtime, and the server returns a typed schema for each callable thing. The model picks one, fills in the arguments, and the server executes the call.
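
For concreteness, here's roughly what that discovery step looks like on the wire, sketched as Python dicts (abbreviated; the real messages carry more fields, and the exact shape is defined by the MCP spec):

# The client asks what's available...
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# ...and the server answers with a typed schema per tool.
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [{
            "name": "search_bundles",
            "description": "Search bundles by keyword.",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }]
    },
}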

For us, the practical effect was being able to delete two of those three integrations and keep one MCP server that the existing hosts can all talk to.

How MCP Works

The wire protocol has a host (Claude Desktop, Cursor, ChatGPT — wherever the model lives), a client inside the host that speaks JSON-RPC, and a server you write that exposes whatever the agent should be able to do. The host kicks off discovery, the client carries the messages, and your server is where the actual work happens.

Most of what used to live in our prompts as "here's how to call this API and please don't get it wrong" now lives as schema and validation in the server. Worth noting because it's the part that surprised me: most of the reliability win comes from the schema, not from anything the model is doing differently.

What a server actually exposes

The thing the agent calls is a tool. A tool has a name, a description, an input schema, and a function on your end that runs when the model invokes it. Names like search_bundles, create_project, send_email — verbs, because that's how the model thinks about them. The schema is what stops the model from inventing parameters; if it tries, the host rejects the call before it gets to your code.
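
In FastMCP terms, that whole package (name, description, schema, function) falls out of one typed function. A sketch, with a hypothetical send_email body:

from fastmcp import FastMCP
from pydantic import Field
from typing import Annotated

mcp = FastMCP("demo")

@mcp.tool(description="Send an email to a customer.")
async def send_email(
    to: Annotated[str, Field(description="Recipient address")],
    subject: Annotated[str, Field(description="Subject line")],
    body: Annotated[str, Field(description="Plain-text body")],
) -> dict:
    # Hypothetical delivery logic. The typed signature above is what
    # the host advertises to the model, and what invented parameters
    # get rejected against.
    return {"status": "queued", "to": to}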

Servers can also expose resources, which are read-only blobs the model can pull on demand. Documentation, a setup guide, the contents of a file. We use these heavily for "how do I configure this" questions because the alternative is the model improvising setup steps that look plausible and aren't.

The third primitive is prompts — reusable prompt templates the host can surface to the user. We've barely touched these in production; tools and resources are where almost all the value is so far, and I won't pretend otherwise.
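
Both primitives are a few lines in FastMCP. A sketch, assuming the @mcp.resource and @mcp.prompt decorators from current FastMCP (the URI and file path are made up):

from fastmcp import FastMCP

mcp = FastMCP("demo")

@mcp.resource("docs://setup-guide")
def setup_guide() -> str:
    # A read-only blob the model pulls on demand, instead of
    # improvising setup steps.
    with open("docs/setup.md") as f:
        return f.read()

@mcp.prompt()
def bug_report(component: str) -> str:
    # A reusable template the host can surface to the user.
    return f"Write a structured bug report for the {component} component."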

How MCP differs from a REST API

The temptation when you first see MCP is to read it as "REST with a different transport". It's worth resisting. REST is built around the client knowing the URL space ahead of time and constructing requests against documented endpoints. MCP inverts that: the client (the agent) doesn't know what's available until it asks, and what comes back isn't GET /things but a verb-shaped action with a typed argument list.

In practice that means an agent stops trying to translate "send the customer an email" into "POST /api/v2/messages with this exact body" and just calls a send_email tool that already knows the body. That's the bit that makes MCP feel different to work with — not the JSON-RPC, not the transport, the verb-shaped surface.

Things we got wrong

The biggest one was tool count. Our first server exposed every endpoint of an internal API as a separate tool — about 60 of them. Claude's tool selection accuracy fell off a cliff somewhere past 40, and we had to collapse most of them into a single search tool with structured filters before reliability came back. The current rule of thumb is: if two tools differ only in their filter args, they should be one tool with a filter argument.
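
The collapsed shape looks something like this (the filter fields are hypothetical stand-ins for ours):

from fastmcp import FastMCP
from pydantic import Field
from typing import Annotated, Optional

mcp = FastMCP("internal-api")

@mcp.tool(description="Search records. All filters are optional and combinable.")
async def search_records(
    query: Annotated[str, Field(description="Free-text query")],
    status: Annotated[Optional[str], Field(description="e.g. 'open' or 'closed'")] = None,
    created_after: Annotated[Optional[str], Field(description="ISO 8601 date")] = None,
    limit: Annotated[int, Field(description="Max results", le=100)] = 20,
) -> list[dict]:
    # One tool with structured filters stands in for the dozens of
    # list_*_by_* tools we started with.
    return []  # call the real API here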

We also spent a while passing api_key as a parameter on every tool, which is the kind of thing that feels reasonable until you see it in the model's tool schema and realise the agent is now reasoning about credentials. Pull auth out of the tool surface entirely; the server should read it from headers or a credential store and the model should never see it.
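
Concretely, something like this, assuming env-var auth (INTERNAL_API_KEY is a made-up name; an HTTP transport could read the credential from request headers instead):

import os
from fastmcp import FastMCP

mcp = FastMCP("internal-api")

# Credentials live server-side, never in a tool parameter.
API_KEY = os.environ["INTERNAL_API_KEY"]

@mcp.tool(description="Fetch a customer record by id.")
async def get_customer(customer_id: str) -> dict:
    # The model's view of this tool is just customer_id; API_KEY
    # never appears in the schema it reasons about.
    # e.g. httpx.get(url, headers={"Authorization": f"Bearer {API_KEY}"})
    return {"id": customer_id}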

A third thing worth knowing before you start: stdio is the right default for local dev (it's a pipe, it's fast, there's nothing to configure). SSE / streamable HTTP is the right answer when the host and the server live in different processes or different machines. Don't reach for SSE first because it sounds more grown-up — it just adds moving parts.
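
The transport is a one-line choice when you start the server. A sketch, assuming current FastMCP's run() signature (check your version's docs for the exact kwargs):

from fastmcp import FastMCP

mcp = FastMCP("demo")

if __name__ == "__main__":
    mcp.run()  # stdio, the default: right for local dev
    # When host and server are separate processes or machines,
    # something like:
    # mcp.run(transport="sse", host="0.0.0.0", port=8000)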

And on errors: returning a Python traceback or a 500 page makes the model do something unhelpful, usually retrying the same call. Returning a short, structured error string ("missing required field customer_id") makes it actually course-correct. Treat error messages as part of the tool's interface.
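
A sketch of the difference (lookup_customer and the search_customers tool the error points at are both hypothetical):

from fastmcp import FastMCP

mcp = FastMCP("crm")

def lookup_customer(customer_id: str):
    # Hypothetical stand-in for a real datastore lookup.
    return None

@mcp.tool(description="Create an order for a customer.")
async def create_order(customer_id: str, item: str) -> dict:
    if lookup_customer(customer_id) is None:
        # Short and structured, so the model can course-correct
        # instead of retrying the same call.
        return {"error": f"unknown customer_id '{customer_id}'; try search_customers first"}
    return {"status": "created", "customer_id": customer_id, "item": item}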

Building Your First MCP Server (15 Minutes)

Here's a minimal MCP server using FastMCP:

# server.py: a one-tool MCP server.
from fastmcp import FastMCP
from pydantic import Field
from typing import Annotated

mcp = FastMCP("MyFirstServer")

@mcp.tool(description="Greet a user by name.")
async def greet_user(
    name: Annotated[str, Field(description="The user's name")]
) -> dict:
    return {"message": f"Hello, {name}! Welcome to MCP."}

if __name__ == "__main__":
    mcp.run()  # stdio transport by default

Run it:

pip install fastmcp
fastmcp run server.py

Configure Claude Desktop (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "MyFirstServer": {
      "command": "python",
      "args": ["/path/to/server.py"]
    }
  }
}

Restart Claude Desktop and ask: "Greet me by name."

Where this is going

The protocol itself is moving quickly. The reliable-tool-count ceiling has actually been getting tighter as the ecosystem has matured: a year ago people were comfortably running 50–60 tools per server, and now most of the practical guidance is to stay under 40. The hosts have been quietly catching up too — Anthropic, OpenAI and Google all ship some form of MCP client now, with varying degrees of completeness.

We built MCPBundles partly because the per-integration work that MCP was supposed to remove kept reappearing as per-MCP-server work. The bundles are pre-built MCP servers for a long tail of SaaS products, so the part you used to write yourself is already there. Useful or not, the broader point still holds: if you're integrating an LLM with anything stateful, the question is which MCP server you use, not whether you use one.

Further reading

If you're building a server and hit something the docs don't cover, we're happy to swap notes.