MCP Security Methodology

How we score the security of any MCP server

Every finding cites a published taxonomy entry — mcp-scan or SAFE-MCP. No invented rules. This page is the live catalog.

18 rules · 8 categories · 2 taxonomies cited · 23 technique IDs

Principles

Four rules our security analyzer must follow

No invented rules

Every check we run cites a public taxonomy: mcp-scan (Invariant Labs) or SAFE-MCP (a MITRE ATT&CK-style framework for MCP). If a check has no citation, it doesn't ship. The catalog below is auditable for this reason.
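As an illustration of the "no citation, no ship" gate, here is a minimal sketch of a catalog check. The `Rule` dataclass, the `uncited_rules` helper, and the `taxonomy:ID` citation string format are hypothetical names chosen for this example, not the analyzer's actual source.

```python
from dataclasses import dataclass

# Hypothetical citation prefixes; the real analyzer may encode citations differently.
ALLOWED_TAXONOMIES = ("mcp-scan:", "SAFE-MCP:")


@dataclass(frozen=True)
class Rule:
    rule_id: str
    title: str
    citations: tuple[str, ...]  # e.g. ("mcp-scan:E001", "SAFE-MCP:SAFE-T1001")


def uncited_rules(catalog):
    """Return the IDs of rules with no citation into an allowed taxonomy."""
    return [
        rule.rule_id
        for rule in catalog
        if not any(c.startswith(ALLOWED_TAXONOMIES) for c in rule.citations)
    ]


# Two entries copied from the catalog on this page.
catalog = [
    Rule("E001", "Prompt injection / tool poisoning in tool description",
         ("mcp-scan:E001", "SAFE-MCP:SAFE-T1001")),
    Rule("MCPB-A003", "Duplicate tool display name within server",
         ("SAFE-MCP:SAFE-T1405",)),
]
```

A build step could refuse to publish whenever `uncited_rules(catalog)` returns a non-empty list.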

Deterministic first, LLM additive

Most rules run as deterministic Python over the tool metadata. The LLM judge layers on top for prompt-injection-shape detection and for context-sensitive checks like third-party origin mismatch. If the LLM judge fails or times out, the deterministic findings still ship — we never block a result on the LLM.

Metadata only — never call the tool

The analyzer reads names, descriptions, JSON Schema fields, and annotations. It never invokes a tool to find out what it does. That keeps the audit safe to run against any unverified server.
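A metadata-only pass can be pictured as a recursive walk over the tool definition that collects every string without ever calling the tool. The sketch below assumes the tool is a plain dict with the standard MCP fields `name`, `description`, `inputSchema`, and `annotations`; the helper names `iter_text` and `tool_metadata_strings` are illustrative, not the analyzer's own.

```python
def iter_text(node, path=""):
    """Yield (path, string) for every string value in a nested dict/list."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_text(value, f"{path}.{key}" if path else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_text(value, f"{path}[{index}]")
    # Non-string scalars (booleans, numbers, None) carry no prose and are skipped.


def tool_metadata_strings(tool):
    """Collect auditable text from a tool definition without invoking the tool."""
    fields = {
        key: tool[key]
        for key in ("name", "description", "inputSchema", "annotations")
        if key in tool
    }
    return list(iter_text(fields))


# Hypothetical tool definition used only to exercise the walker.
tool = {
    "name": "read_file",
    "description": "Read a file from the workspace.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "Relative path"}},
        "required": ["path"],
    },
    "annotations": {"readOnlyHint": True},
}

strings = tool_metadata_strings(tool)
```

Because the walk only reads data the server already published in its tool listing, it has no side effects on the server being audited.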

The same engine runs on the analyzer page and on every published listing

Whether you paste a URL into the public analyzer or score one of your own published servers, the rule catalog is identical. There is no privileged scoring path for partners.

How it works

Three layers, one report

Deterministic metadata

Walks the tool's name, description, JSON Schema (full recursion), and annotations. Catches steganography, schema-poisoning instructions, mutating verbs flagged read-only, homoglyph names, consent-fatigue prose, and data-harvest prose.
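To make three of these checks concrete, here is a rough sketch of how hidden text, a read-only annotation that conflicts with a mutating verb, and a mixed-script name might be detected. The verb list, the zero-width character set, and all function names are assumptions made for this example; the shipped rules are likely broader.

```python
import re
import unicodedata

# Illustrative sets only; a production rule would cover more characters and verbs.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
MUTATING_VERBS = {"create", "delete", "update", "write", "remove", "send"}


def has_hidden_text(text):
    """True if the text contains zero-width characters or an HTML comment."""
    return any(ch in ZERO_WIDTH for ch in text) or bool(HTML_COMMENT.search(text))


def read_only_conflict(tool):
    """True if readOnlyHint is set but the tool name contains a mutating verb."""
    read_only = tool.get("annotations", {}).get("readOnlyHint") is True
    # Split snake_case and camelCase names into lowercase words.
    words = {w.lower() for w in re.findall(r"[A-Za-z][a-z]*", tool.get("name", ""))}
    return read_only and bool(words & MUTATING_VERBS)


def scripts_in(name):
    """Approximate the scripts used, from the first word of each Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        for ch in name
        if ch.isalpha()
    }


def is_mixed_script(name):
    """True if letters from more than one script appear in the name."""
    return len(scripts_in(name)) > 1
```

Taking the first word of the Unicode character name is a heuristic stand-in for a real script lookup, which the standard library does not provide.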

LLM judge (Anthropic Claude Haiku 4.5)

Runs after the deterministic pass. Looks for instruction-control prose, third-party origin mismatch (the server says it's Stripe but the host doesn't match), and credential-setup language directed at the user. Fails open on any error or after a 25-second timeout, so the deterministic findings still ship.
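The fail-open behavior can be sketched as a wrapper that always returns the deterministic findings and only adds the judge's findings when the judge answers in time. Everything here is an assumption for illustration: `run_analysis`, the callable-based rule and judge interfaces, and the `judge_status` field are hypothetical, and the actual analyzer may be structured differently.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

JUDGE_TIMEOUT_SECONDS = 25  # the timeout stated on this page


def run_analysis(tool, deterministic_rules, llm_judge, timeout=JUDGE_TIMEOUT_SECONDS):
    """Run deterministic rules, then add LLM findings if the judge succeeds in time."""
    findings = []
    for rule in deterministic_rules:
        findings.extend(rule(tool))

    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(llm_judge, tool)
    try:
        findings.extend(future.result(timeout=timeout))
        judge_status = "ok"
    except FutureTimeout:
        judge_status = "timeout"  # deterministic findings still ship
    except Exception:
        judge_status = "error"  # any judge failure is non-fatal
    finally:
        # Do not wait for a slow judge; the worker thread finishes on its own.
        executor.shutdown(wait=False, cancel_futures=True)
    return {"findings": findings, "judge_status": judge_status}
```

Recording `judge_status` alongside the findings lets a report show that the LLM layer was skipped rather than silently implying it found nothing.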

Cross-server intelligence (coming)

Compares a candidate listing to every claimed listing in our directory to flag host typosquats, tool-name collisions, and tool-definition drift on claimed listings. Designed but not yet shipped; we'd rather omit it than ship low-precision noise.

Live rule catalog

Every rule we evaluate

Rendered live from the analyzer source. Sorted by default severity, then category. Each citation links to the published entry on GitHub — read the source before you trust the score.
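The ordering described above (default severity first, then category) can be reproduced with a two-part sort key. This is a small sketch under the assumption that each rule is a dict with `id`, `severity`, and `category` keys; those field names and `sort_catalog` are hypothetical.

```python
# Lower number sorts first; labels match the severities shown in the catalog.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def sort_catalog(rules):
    """Sort by severity rank, then alphabetically by category (stable for ties)."""
    return sorted(rules, key=lambda r: (SEVERITY_ORDER[r["severity"]], r["category"]))


# Four entries taken from the catalog on this page, deliberately out of order.
rules = [
    {"id": "W001", "severity": "Low", "category": "Tool poisoning"},
    {"id": "E002", "severity": "High", "category": "Tool identity"},
    {"id": "E001", "severity": "Critical", "category": "Tool poisoning"},
    {"id": "MCPB-A002", "severity": "High", "category": "Host controls"},
]

ordered_ids = [r["id"] for r in sort_catalog(rules)]
```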

Analyzer: mcp-security-v1 · schema v2

E001 · Critical

Prompt injection / tool poisoning in tool description

Category: Tool poisoning

mcp-scan: E001 SAFE-MCP: SAFE-T1001

MCPB-A002 · High

Annotation honesty (read-only conflicts with mutating verbs)

Category: Host controls

SAFE-MCP: SAFE-T1406 SAFE-MCP: SAFE-T1104

E002 · High

Cross-tool reference (tool shadowing)

Category: Tool identity

mcp-scan: E002 SAFE-MCP: SAFE-T1008 SAFE-MCP: SAFE-T1301

MCPB-I001 · High

Tool name contains homoglyph / mixed-script characters

Category: Tool identity

SAFE-MCP: SAFE-T1405 SAFE-MCP: SAFE-T1103

SAFE-T1402 · High

Instruction steganography (zero-width / HTML-comment hidden text)

Category: Tool poisoning

SAFE-MCP: SAFE-T1402

SAFE-T1501 · High

Full-schema poisoning (instructions in JSON Schema fields)

Category: Tool poisoning

SAFE-MCP: SAFE-T1501

SAFE-T1007 · Medium

Tool metadata describes credential setup or token handling

Category: Auth & scope safety

SAFE-MCP: SAFE-T1007

SAFE-T1804 · Medium

Tool description advertises silent data harvesting

Category: Context injection

SAFE-MCP: SAFE-T1804

W015 · Medium

Tool exposes agent to untrusted third-party content

Category: Context injection

mcp-scan: W015 SAFE-MCP: SAFE-T1102

W017 · Medium

Tool retrieves highly sensitive private data

Category: Context injection

mcp-scan: W017

SAFE-T1403 · Medium

Consent-fatigue language pressuring auto-approval

Category: Host controls

SAFE-MCP: SAFE-T1403

W019 · Medium

Tool grants destructive shared-infrastructure capability

Category: Privileged capabilities

mcp-scan: W019 SAFE-MCP: SAFE-T1104

SAFE-T1004 · Medium

Third-party tool origin / server-name mismatch

Category: Supply chain & provenance

SAFE-MCP: SAFE-T1004 SAFE-MCP: SAFE-T1008

MCPB-A003 · Medium

Duplicate tool display name within server

Category: Tool identity

SAFE-MCP: SAFE-T1405

MCPB-A001 · Low

Annotation completeness (readOnlyHint / destructiveHint)

Category: Auditability

SAFE-MCP: SAFE-T1406

W018 · Low

Tool exposes local workspace or source files

Category: Context injection

mcp-scan: W018

W020 · Low

Tool grants local destructive capability

Category: Privileged capabilities

mcp-scan: W020 SAFE-MCP: SAFE-T1105

W001 · Low

Suspicious words in tool description

Category: Tool poisoning

mcp-scan: W001

Where the rules come from

Two published taxonomies, one rule catalog

SAFE-MCP

MITRE ATT&CK-style threat framework for MCP

A peer-reviewed taxonomy of MCP-specific adversary techniques. We cite SAFE-T1001 (instruction injection), SAFE-T1004 (third-party origin mismatch), SAFE-T1007 (credential setup), SAFE-T1402 (instruction steganography), SAFE-T1403 (consent fatigue), SAFE-T1804 (API data harvest), and nine more listed in the catalog above.

mcp-scan

Static scanner from Invariant Labs

The first widely-used MCP static scanner. Its issue codes (E001 prompt injection, W001 attention-control words, W015–W020 toxic flow capabilities) define a baseline that every public scoreboard expects. We cite the same codes.

About the MCPB- rule prefix

Four rules use our internal MCPB- namespace (annotation completeness, annotation honesty, duplicate display names, homoglyph names). Every one of them still cites a SAFE-MCP technique; the prefix only signals that the published taxonomy hasn't given the specific check its own ID yet. When SAFE-MCP issues one, we adopt it.

Audit any remote MCP server

Paste a URL. Get the security posture, the quality score, and every finding cited against the published taxonomy.