
Writing great tool schemas for MCP

MCPBundles · 3 min read

Here's the thing about schemas: they're basically the contract your model learns from. Get them right, and your tools are easy to find, hard to break, and simple to fix when something goes wrong.

Get them wrong, and you'll spend way too much time debugging why the model keeps calling your tool incorrectly.

Learn JSON Schema patterns that make MCP tools discoverable and reliable, and that help models recover gracefully from errors.

Principles

Start with a clear description. One sentence that tells the model when to use this tool. That's it. Don't overthink it.

Make your inputs precise. Use JSON Schema properly—mark what's required vs optional, use enums when you can, add formats and ranges. The model needs constraints to work with, not a blank canvas.

Examples help. Throw in a couple of representative input objects. Models learn from examples, so give them good ones.
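
JSON Schema has an examples keyword for exactly this (draft-06 and later). Here's a sketch attached to the publish_report input schema defined later in this post; the ID values are made up:

{
  "type": "object",
  "examples": [
    {"report_id": "rep_8f3k2a", "visibility": "team"},
    {"report_id": "rep_19xq7m", "visibility": "org", "note": "Q3 numbers, final"}
  ]
}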

Keep outputs small. Return stable IDs, short summaries, and explicit pagination cursors. Don't dump huge blobs of data—provide URIs if someone needs the full thing.

Oh, and make things idempotent when you can. If someone retries a call, it shouldn't create duplicates or cause chaos.
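
One common way to get there (not the only one) is an explicit idempotency key: the client generates it, the server remembers it, and a retry with the same key returns the original result. A sketch of what that property could look like; the field name is illustrative:

"idempotency_key": {
  "type": "string",
  "description": "Client-generated key. Retries with the same key return the original result instead of acting twice."
}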

Inputs: patterns that help models

Use enums instead of free-text strings for modes. Trust me on this one. If you let the model guess between "public", "Public", and "PUBLIC", you're asking for trouble.

Prefer oneOf when you've got mutually exclusive shapes. Like searching by id or query—can't do both, so make that clear in the schema.
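
A sketch of that id-or-query shape; field names are illustrative. With oneOf, an input carrying both fields matches both branches and gets rejected, which is exactly the exclusivity you want:

{
  "type": "object",
  "oneOf": [
    {"required": ["id"], "properties": {"id": {"type": "string"}}},
    {"required": ["query"], "properties": {"query": {"type": "string", "minLength": 2}}}
  ]
}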

Add constraints. minLength, maximum, format (email, uri, date-time). These reduce guessing and catch errors early.
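
A few of these in schema form (field names illustrative):

{
  "email": {"type": "string", "format": "email"},
  "due_date": {"type": "string", "format": "date-time"},
  "limit": {"type": "integer", "minimum": 1, "maximum": 100}
}

One caveat: many validators treat format as an annotation rather than an assertion unless you enable format checking, so back it up with server-side validation.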

Use clear field names. Snake_case is fine. Match your domain language. If your team calls it a "report_id", don't call it "reportId" in the schema.

Here's what a decent schema looks like:

{
  "name": "publish_report",
  "description": "Publish an existing report to the team with an optional note.",
  "input_schema": {
    "type": "object",
    "required": ["report_id", "visibility"],
    "properties": {
      "report_id": {"type": "string", "minLength": 6, "description": "Stable ID"},
      "visibility": {"type": "string", "enum": ["private", "team", "org"]},
      "note": {"type": "string", "maxLength": 500}
    },
    "additionalProperties": false
  }
}

Notice the enum for visibility? That's intentional. And additionalProperties: false means the model can't add random fields you didn't expect.

Errors: make recovery straightforward

Validate on the server side. Always. Then return structured errors with a machine-readable code field.

Include a short, user-safe message. Maybe add hints if the model can actually do something about it.

Map validation failures to schema paths. If report_id is too short, tell the model exactly which field failed. Don't make it guess.
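
Put together, an error response could look something like this. The envelope is up to you; the code values and path convention here are illustrative (the path points at the failing field, JSON-Pointer style):

{
  "error": {
    "code": "invalid_input",
    "message": "report_id must be at least 6 characters.",
    "path": "/report_id",
    "hint": "Pass the full stable ID, not a truncated one."
  }
}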

Outputs: predictable and small

Return stable identifiers. If there's a next action, include it. For lists, add next_cursor and has_more. Don't make the model parse your response to figure out if there's more data.

Avoid embedding large blobs. If someone needs the full resource, give them a URI and let them fetch it separately.
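
A sketch of a list-shaped result along those lines; field names and the URI scheme are illustrative:

{
  "items": [
    {"report_id": "rep_8f3k2a", "title": "Q3 revenue", "status": "published"}
  ],
  "next_cursor": "cur_9d2f",
  "has_more": true,
  "full_report_uri": "reports://rep_8f3k2a"
}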

Testing schemas

Write golden prompts. Test valid shapes and invalid ones. See what breaks.

Add fuzz tests for boundary values. Empty strings, max lengths, enum typos. The weird edge cases are where things fail.
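
One lightweight way to keep those cases around is a fixture file your harness replays against the schema. A sketch using the publish_report schema from above; the fixture shape itself is illustrative:

[
  {"input": {"report_id": "rep_8f3k2a", "visibility": "team"}, "expect": "ok"},
  {"input": {"report_id": "short", "visibility": "team"}, "expect": "invalid_input"},
  {"input": {"report_id": "rep_8f3k2a", "visibility": "PUBLIC"}, "expect": "invalid_input"}
]

"short" fails minLength, and "PUBLIC" is exactly the enum-casing typo from earlier.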

Track error rates. When failures cluster around a specific field or pattern, tighten the schema there. Iterate.
