OpenAI's MCP Integration Requirements: Why Search and Fetch Matter
When OpenAI integrated support for Anthropic's Model Context Protocol (MCP) into ChatGPT's deep research feature, they documented something elegant: a two-tool pattern that gives AI agents a consistent way to engage with any data source. If your MCP server implements search and fetch with their specific signatures, ChatGPT knows exactly how to explore your data without custom integration code.
Both tools accept only a single string parameter. That constraint isn't a limitation—it's what makes the pattern universal.
The Pattern: Two Tools, Universal Interface
For MCP servers to work with ChatGPT's deep research feature, OpenAI requires a specific contract: implement two tools—search and fetch—with exact signatures. Any MCP server that follows this pattern can integrate with ChatGPT without custom code:
search - Discovery with a Single String
Takes: One query string
Returns: A JSON-encoded array of result objects, each with:
- id - unique identifier you'll use later in fetch
- title - human-readable name
- url - canonical URL for citations
The tool must return exactly one MCP content item with type: "text" and a JSON-stringified results array:
{
  "content": [
    {
      "type": "text",
      "text": "{\"results\":[{\"id\":\"weaviate:doc:uuid-123\",\"title\":\"Q4 Product Strategy\",\"url\":\"https://internal.acme.com/docs/q4-strategy\"}]}"
    }
  ]
}
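Note the double encoding: the results array is serialized to a JSON string, which becomes the text of a content item that the MCP layer serializes again. A minimal sketch of building this response in Python (the helper name is ours, not part of the standard):

import json

def build_search_response(results: list[dict]) -> dict:
    # Each result dict carries id, title, and url. The whole array is
    # JSON-stringified into a single text content item.
    return {
        "content": [{
            "type": "text",
            "text": json.dumps({"results": results})
        }]
    }

build_search_response([{
    "id": "weaviate:doc:uuid-123",
    "title": "Q4 Product Strategy",
    "url": "https://internal.acme.com/docs/q4-strategy"
}])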
Think of search as a constrained list endpoint. Since it only takes a single string, the server has freedom to interpret that string however it wants: semantic search, LLM-powered query parsing, structured filter extraction, full-text search, or a hybrid approach. The AI submits natural language; your server decides how to process it.
fetch - Retrieval with a Single String
Takes: One id string (from a previous search result)
Returns: A JSON-encoded document object with:
- id - the identifier that was requested
- title - document title
- text - the full content (this is what the AI needs)
- url - canonical URL for citation
- metadata - optional key-value pairs about the document
Like search, this must be a single MCP content item with a JSON-stringified payload:
{
  "content": [
    {
      "type": "text",
      "text": "{\"id\":\"weaviate:doc:uuid-123\",\"title\":\"Q4 Product Strategy\",\"text\":\"Our Q4 strategy focuses on three pillars...\",\"url\":\"https://internal.acme.com/docs/q4-strategy\",\"metadata\":{\"author\":\"Jane Doe\",\"updated\":\"2025-10-01\"}}"
    }
  ]
}
The single-string constraint means you'll often need to encode multiple pieces of information into the id. For example:
- github:repo:acme/backend - repository identifier
- confluence:page:123456 - page ID
- salesforce:opportunity:006abc - CRM record
- weaviate:collection:products:uuid-789 - vector store document with collection context
The fetch tool becomes your standard GET request, but instead of REST path parameters and query strings, you're packing everything into one identifier string that your server knows how to parse and route.
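A quick sketch of that packing and unpacking, using the source:type:identifier format from the examples above (the helper names are ours):

def make_id(source: str, doc_type: str, identifier: str) -> str:
    # Pack routing information into one opaque string.
    return f"{source}:{doc_type}:{identifier}"

def parse_id(doc_id: str) -> tuple[str, str, str]:
    # maxsplit=2 keeps identifiers that themselves contain colons intact,
    # e.g. the collection context in 'weaviate:collection:products:uuid-789'.
    source, doc_type, identifier = doc_id.split(":", 2)
    return source, doc_type, identifier

make_id("github", "repo", "acme/backend")          # 'github:repo:acme/backend'
parse_id("weaviate:collection:products:uuid-789")  # ('weaviate', 'collection', 'products:uuid-789')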
Writing Tool Descriptions That Guide AI
The AI reads two pieces of documentation for each tool: the tool-level description and the parameter-level description. Both matter. Good descriptions reduce errors and help the AI construct valid calls.
Tool-Level Description
Tell the AI what the tool does and when to use it:
@mcp.tool(description="Search documentation, API specs, and internal guides by query - follows OpenAI MCP standard")
async def search(
query: Annotated[str, Field(description="Natural language search query or structured filters like 'tag:api status:published'")]
) -> dict:
# Implementation
The tool description explains purpose and scope. The parameter description explains what formats are valid.
Parameter-Level Description
Guide the AI on how to construct the parameter:
@mcp.tool(description="Fetch full document content by ID - follows OpenAI MCP standard")
async def fetch(
id: Annotated[str, Field(description="Document ID in 'source:type:identifier' format (e.g., 'confluence:page:12345', 'github:issue:repo/123')")]
) -> dict:
# Implementation
The parameter description shows the ID format with examples. The AI learns the pattern and applies it.
Providing Structure in Descriptions
If your search
tool accepts structured queries, document the schema in the description:
@mcp.tool(description="""Search customer records with optional structured filters.
Accepts natural language OR structured syntax:
- Natural: "enterprise customers in California"
- Structured: "tier:enterprise region:us-west status:active"
Supported filters: tier, region, status, created_after, value_gt
""")
async def search(
query: Annotated[str, Field(description="""Search query as natural language OR 'field:value' pairs separated by spaces.
Supported fields: tier (enterprise|pro|starter), region (us-west|us-east|eu), status (active|archived), created_after (ISO date), value_gt (number)
Example: "tier:enterprise status:active" or "show me enterprise customers in California" """)]
) -> dict:
# Parse the query string
if ":" in query and not query.startswith("http"):
# Structured query
filters = parse_structured_query(query)
else:
# Natural language - use LLM to extract intent
filters = await parse_with_llm(query)
# Execute search with filters
return await search_crm(filters)
The AI sees the supported filters and learns to construct structured queries when precision matters.
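The parse_structured_query helper is left undefined above; a minimal sketch of one way to implement it, assuming space-separated field:value pairs:

def parse_structured_query(query: str) -> dict:
    """Turn 'tier:enterprise status:active' into {'tier': 'enterprise', 'status': 'active'}."""
    filters: dict = {}
    for token in query.split():
        if ":" in token:
            field, _, value = token.partition(":")
            filters[field] = value
        else:
            # Bare words are treated as free-text terms.
            filters.setdefault("text", []).append(token)
    return filters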
Alternative: JSON-Encoded Parameters
You can also accept JSON strings in the parameter, which works well when you need complex nested structures:
@mcp.tool(description="Search products with complex filters - follows OpenAI MCP standard")
async def search(
query: Annotated[str, Field(description="""Search query as natural language OR JSON object.
JSON format: {"category": "electronics", "price": {"min": 100, "max": 500}, "in_stock": true, "tags": ["featured", "sale"]}
Supported JSON fields:
- category (string): product category
- price (object): {min: number, max: number}
- in_stock (boolean): availability filter
- tags (array): list of tag strings
- created_after (string): ISO date
Natural language: "featured electronics under $500 in stock"
""")]
) -> dict:
# Try to parse as JSON first
try:
filters = json.loads(query)
except json.JSONDecodeError:
# Fall back to natural language parsing
filters = await parse_with_llm(query)
# Execute search with parsed filters
results = await database.search(filters)
return format_search_results(results)
The AI learns to send either JSON like {"category": "electronics", "price": {"min": 100}} or natural language like "show me electronics under $500".
Error Responses That Teach
When the AI sends an invalid request, return errors that help it fix the problem:
@mcp.tool(description="Fetch document by ID - follows OpenAI MCP standard")
async def fetch(id: str) -> dict:
# Validate ID format
if ":" not in id:
return {
"content": [{
"type": "text",
"text": json.dumps({
"error": "INVALID_ID_FORMAT",
"message": "ID must be in 'source:type:identifier' format",
"examples": ["confluence:page:12345", "github:issue:repo/123"],
"received": id
})
}]
}
# Parse ID
parts = id.split(":", 2)
if len(parts) != 3:
return {
"content": [{
"type": "text",
"text": json.dumps({
"error": "INVALID_ID_STRUCTURE",
"message": f"Expected 3 parts (source:type:identifier), got {len(parts)}",
"received": id
})
}]
}
source, doc_type, identifier = parts
# Validate source
if source not in ["confluence", "github", "notion"]:
return {
"content": [{
"type": "text",
"text": json.dumps({
"error": "UNKNOWN_SOURCE",
"message": f"Unknown source '{source}'",
"supported_sources": ["confluence", "github", "notion"],
"hint": f"Try '{id.replace(source, 'confluence')}' if this is a Confluence document"
})
}]
}
# Document not found
doc = await get_document(source, doc_type, identifier)
if not doc:
return {
"content": [{
"type": "text",
"text": json.dumps({
"error": "DOCUMENT_NOT_FOUND",
"message": f"No document found with ID '{id}'",
"hint": "Use the search tool to find valid document IDs"
})
}]
}
# Success
return {
"content": [{
"type": "text",
"text": json.dumps({
"id": id,
"title": doc.title,
"text": doc.content,
"url": doc.url,
"metadata": doc.metadata
})
}]
}
When the AI receives an error with a clear code, message, and hint, it can adjust its next attempt. This creates a learning loop.
Using LLMs Server-Side
For search, you can use an LLM on your server to parse natural language into structured filters:
async def parse_search_with_llm(query: str) -> dict:
    """Use LLM to extract structured filters from natural language."""
    prompt = f"""Extract search filters from this query: "{query}"
Available filters:
- entity_type: project, customer, document, ticket
- status: active, archived, closed, open
- created_after: ISO date
- priority: low, medium, high, critical
- tag: any string
Return JSON with extracted filters. If uncertain, return empty dict.
Examples:
"critical bugs from last week" -> {{"entity_type": "ticket", "priority": "critical", "created_after": "2025-10-02"}}
"active enterprise customers" -> {{"entity_type": "customer", "status": "active", "tag": "enterprise"}}
"""
    response = await llm.complete(prompt)
    return json.loads(response)
@mcp.tool(description="Search across projects, customers, documents, and tickets - follows OpenAI MCP standard")
async def search(
query: Annotated[str, Field(description="""Natural language search query. Server will parse into structured filters.
Server extracts: entity_type (project|customer|document|ticket), status (active|archived|closed|open), priority (low|medium|high|critical), created_after (ISO date), tag (string)
Examples: "critical bugs from last week" or "active enterprise customers" """)]
) -> dict:
# Parse with LLM
filters = await parse_search_with_llm(query)
# Execute structured search
results = await database.search(filters)
return {
"content": [{
"type": "text",
"text": json.dumps({
"results": [
{"id": f"{r.type}:{r.id}", "title": r.title, "url": r.url}
for r in results
],
"parsed_filters": filters # Show the AI what filters were used
})
}]
}
Including parsed_filters in the response helps the AI learn what worked. Next time it can construct better queries.
Why Single-String Parameters Are Smart
Accepting only one string parameter might seem limiting at first. Why not structured parameters with separate fields for filters, pagination, or resource types?
The single-string constraint is what makes the pattern work. Here's why:
1. Simplicity for AI Tool Selection
When an AI agent chooses which tool to call, simpler signatures reduce the cognitive load. A tool with one parameter is easier to reason about than a tool with five parameters across multiple types. The agent doesn't have to construct complex argument objects—it just passes a string.
For search, the AI naturally thinks: "I need to find documents about X" → call search with "X"
For fetch, the AI naturally thinks: "I need the full content of document Y" → call fetch with "Y"
No decisions about which fields to include, how to structure filters, or what pagination strategy to use. Just one string.
2. Server-Side Intelligence
The single-string constraint pushes complexity to the server, where it belongs. Your server can:
For search:
- Parse natural language queries with an LLM
- Extract structured filters from unstructured text
- Perform semantic search across embeddings
- Apply user-specific access controls
- Rank and filter results based on relevance
- Handle pagination internally and return top-N results
For fetch:
- Parse composite IDs and route to the correct data source
- Apply field selection based on ID patterns
- Enforce permissions based on user context
- Transform data into AI-readable formats
- Handle cross-resource relationships
The AI doesn't need to know your internal routing, authentication, or data model. It sends a string; you handle the rest.
3. Future-Proof Extensibility
Because the parameters are strings, you can evolve your parsing logic without breaking the tool interface. Add new query syntax, support new ID formats, introduce new metadata—all without changing the tool signature that AI agents depend on.
For example, your search string parsing might evolve from:
- Version 1 (simple keyword search): "customer feedback"
- Version 2 (add date filters): "customer feedback after:2025-01-01"
- Version 3 (add semantic operators): "customer feedback after:2025-01-01 sentiment:negative"
- Version 4 (LLM-powered intent extraction): "show me recent negative feedback from enterprise customers"
The AI still sends one string. Your server gets smarter about parsing it. No schema changes required.
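As a sketch, the Version 3 parser might look like this; the operator names come from the examples above, the rest is illustrative:

def parse_query_v3(query: str) -> dict:
    """Keywords plus 'after:' and 'sentiment:' operators."""
    filters: dict = {}
    keywords: list[str] = []
    for token in query.split():
        if token.startswith("after:"):
            filters["created_after"] = token[len("after:"):]
        elif token.startswith("sentiment:"):
            filters["sentiment"] = token[len("sentiment:"):]
        else:
            keywords.append(token)
    filters["text"] = " ".join(keywords)
    return filters

parse_query_v3("customer feedback after:2025-01-01 sentiment:negative")
# {'created_after': '2025-01-01', 'sentiment': 'negative', 'text': 'customer feedback'}

Upgrading to Version 4 means swapping the function body for an LLM call; the signature, and therefore the tool interface, stays the same.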
What Search and Fetch Provide (And What They Don't)
The search and fetch pattern is deliberately focused. These two tools provide a foundation, not a complete solution. Understanding what they're designed for—and what they're not—helps you design better MCP servers.
Beyond the Standard
Write operations: Search and fetch are read-only. You can't create, update, or delete records.
Complex workflows: Multi-step operations like "create a ticket and assign it to the on-call engineer" require additional tools.
Rich filtering: A single search string can't express complex AND/OR filter combinations as clearly as structured parameters.
Batch operations: Fetching 50 documents one at a time is inefficient compared to a batch GET tool (see our batch consolidation post).
Typed discovery: If your domain has distinct entity types (projects, tickets, customers), you might want type-specific list endpoints with model-specific filters.
The Foundation They Create
search and fetch solve the cold-start problem. When an AI agent encounters your MCP server for the first time with zero domain knowledge:
- Discovery: Call search with a broad query to find what exists
- Retrieval: Call fetch on interesting results to get full content
- Understanding: Read the content and decide next steps
This two-tool pattern is enough to make your data source usable by AI, even if it's not complete. And for OpenAI's deep research mode, it's exactly what's needed: iterative search to discover documents, fetch to read them, synthesize insights.
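From the agent's side, the loop is just search, then fetch, then read. A sketch of that flow, using a hypothetical call_tool helper rather than any particular MCP client API:

import json

async def research(call_tool, question: str) -> list[dict]:
    # Discovery: one broad search call.
    raw = await call_tool("search", {"query": question})
    hits = json.loads(raw["content"][0]["text"])["results"]
    # Retrieval: fetch full content for promising results.
    documents = []
    for hit in hits[:5]:
        doc = await call_tool("fetch", {"id": hit["id"]})
        documents.append(json.loads(doc["content"][0]["text"]))
    # Understanding: each document's title, text, and url feeds the next reasoning step.
    return documents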
Think of search and fetch as the minimum viable interface for AI-readable data sources. They're the HTTP GET of the MCP world—basic, universal, and good enough to get started.
Adapting Your MCP Server to the Standard
Different data sources require different approaches to implement search and fetch. Here's how to adapt the pattern to common backend systems:
Vector Store (Semantic Search)
@mcp.tool(description="Search documentation by semantic similarity - follows OpenAI MCP standard")
async def search(
query: Annotated[str, Field(description="Search query (natural language)")]
) -> dict:
# Embed the query
embedding = await embed_text(query)
# Semantic search in Weaviate/Pinecone/Qdrant
results = await vector_db.search(
collection="documents",
vector=embedding,
limit=10
)
# Return with standard format
return format_search_results([
{"id": f"vector:{r.id}", "title": r.metadata["title"], "url": r.metadata["url"]}
for r in results
])
@mcp.tool(description="Fetch full document content - follows OpenAI MCP standard")
async def fetch(
id: Annotated[str, Field(description="Document ID from search results (format: 'vector:uuid')")]
) -> dict:
# Parse and validate
if not id.startswith("vector:"):
return format_error("INVALID_ID", "ID must start with 'vector:'", id)
_, doc_id = id.split(":", 1)
doc = await vector_db.get(doc_id)
if not doc:
return format_error("NOT_FOUND", f"Document {doc_id} not found", id)
# Return full content
return format_document(
id=id,
title=doc.metadata["title"],
text=doc.content,
url=doc.metadata["url"],
metadata={"created": doc.metadata["created"], "author": doc.metadata["author"]}
)
Use helper functions to keep the tool code clean and ensure consistent response formats.
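Those helpers (format_search_results, format_error, format_document) aren't defined in this post; a minimal sketch of what they might look like, matching the response shapes shown earlier:

import json

def format_search_results(results: list[dict]) -> dict:
    # results: list of {"id", "title", "url"} dicts from any backend.
    return {"content": [{"type": "text", "text": json.dumps({"results": results})}]}

def format_document(id: str, title: str, text: str, url: str, metadata: dict | None = None) -> dict:
    payload = {"id": id, "title": title, "text": text, "url": url, "metadata": metadata or {}}
    return {"content": [{"type": "text", "text": json.dumps(payload)}]}

def format_error(code: str, message: str, received: str) -> dict:
    # Structured errors teach the AI how to correct its next call.
    payload = {"error": code, "message": message, "received": received}
    return {"content": [{"type": "text", "text": json.dumps(payload)}]}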
CRM System (Structured Data with LLM Parsing)
@mcp.tool(description="""Search CRM opportunities, contacts, and accounts - follows OpenAI MCP standard
Accepts natural language or structured syntax:
- Natural: "high-value opportunities closing this quarter"
- Structured: "type:opportunity stage:closing value:>100000"
""")
async def search(
query: Annotated[str, Field(description="""Search query as natural language OR 'field:value' pairs.
Structured fields: type (opportunity|contact|account), stage (prospecting|qualification|closing|closed), value (>N, <N, N), owner (name or id), created_after/before (ISO date)
Examples: "type:opportunity stage:closing value:>100000" or "high-value opportunities closing this quarter" """)]
) -> dict:
# Use LLM to parse natural language into structured filters
filters = await parse_query_with_llm(query)
# Example: "high-value opportunities closing this quarter"
# → {object_type: "opportunity", stage: "closing", value_gt: 100000, close_date_q: "Q4"}
# Query Salesforce/HubSpot API with parsed filters
records = await crm_api.query(
object_type=filters.get("object_type", "opportunity"),
filters=filters,
limit=20
)
return format_search_results([
{"id": f"crm:{r.type}:{r.id}", "title": r.name, "url": r.web_url}
for r in records
])
@mcp.tool(description="Fetch full CRM record details - follows OpenAI MCP standard")
async def fetch(
id: Annotated[str, Field(description="Record ID from search (format: 'crm:type:id' like 'crm:opportunity:006abc')")]
) -> dict:
# Parse: "crm:opportunity:006abc123"
parts = id.split(":", 2)
if len(parts) != 3 or parts[0] != "crm":
return format_error("INVALID_ID", "Expected format: 'crm:type:id'", id)
_, object_type, record_id = parts
record = await crm_api.get(object_type, record_id)
if not record:
return format_error("NOT_FOUND", f"No {object_type} found with ID {record_id}", id)
# Format for AI - include relevant fields as readable text
text = format_crm_record_as_text(record)
return format_document(
id=id,
title=record.name,
text=text,
url=record.web_url,
metadata={"stage": record.stage, "value": record.amount, "owner": record.owner_name}
)
The LLM parsing step converts natural language into API filters, making search intuitive for users.
Knowledge Base (Hybrid Search with Reranking)
@mcp.tool(description="Search internal documentation, guides, and wiki pages - follows OpenAI MCP standard")
async def search(
query: Annotated[str, Field(description="Search query (keywords or natural language)")]
) -> dict:
# Combine full-text and semantic search
fts_results = await elasticsearch.search(query, size=50)
embedding = await embed_text(query)
vector_results = await vector_db.search(embedding, limit=50)
# Merge and rerank with cross-encoder for best relevance
combined = merge_results(fts_results, vector_results)
reranked = await rerank_with_cross_encoder(query, combined, top_k=10)
return format_search_results([
{"id": f"docs:{r.doc_id}", "title": r.title, "url": r.public_url}
for r in reranked
])
Hybrid search combines keyword matching with semantic understanding, then reranks for optimal relevance.
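The merge_results step above is left undefined; one common approach is reciprocal rank fusion. A sketch, assuming both result lists expose a doc_id attribute:

def merge_results(fts_results, vector_results, k: int = 60) -> list:
    """Merge two ranked lists with reciprocal rank fusion (RRF)."""
    scores: dict = {}
    by_id: dict = {}
    for ranked in (fts_results, vector_results):
        for rank, r in enumerate(ranked):
            # Documents ranked highly in either list accumulate a higher fused score.
            scores[r.doc_id] = scores.get(r.doc_id, 0.0) + 1.0 / (k + rank + 1)
            by_id[r.doc_id] = r
    return [by_id[doc_id] for doc_id in sorted(scores, key=scores.get, reverse=True)]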
ID Design for Multi-Source Systems
Your fetch IDs need to encode routing information. Design a consistent format that's easy to parse and extend:
Hierarchical with delimiters (most common):
# Format: "source:type:identifier"
"weaviate:document:uuid-123"
"github:issue:acme/backend/456"
"salesforce:contact:003abc789"
# Parse in fetch:
source, resource_type, identifier = id.split(":", 2)
With collection or namespace:
# Format: "source:collection:type:id"
"weaviate:products:document:uuid-123"
"confluence:engineering:page:98765"
# Parse:
source, collection, resource_type, identifier = id.split(":", 3)
Choose a format that's easy to parse, human-readable for debugging, and flexible enough to add new sources later.
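One way to keep the format extensible is a dispatch table keyed on the source prefix, so adding a new backend means adding one entry. A sketch (the per-source fetchers are hypothetical):

SOURCE_HANDLERS = {
    "weaviate": fetch_from_weaviate,
    "github": fetch_from_github,
    "salesforce": fetch_from_salesforce,
}

async def route_fetch(doc_id: str) -> dict:
    source, resource_type, identifier = doc_id.split(":", 2)
    handler = SOURCE_HANDLERS.get(source)
    if handler is None:
        # Reuse the structured error format so the AI can self-correct.
        return format_error("UNKNOWN_SOURCE", f"Unknown source '{source}'", doc_id)
    return await handler(resource_type, identifier)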
Adding Value Beyond Search and Fetch
Once you've implemented the standard search and fetch tools, you can layer on additional capabilities:
Write Operations
@mcp.tool(description="Create a new ticket")
async def create_ticket(
title: str,
description: str,
priority: str = "medium"
) -> dict:
# ...
Batch Operations
@mcp.tool(description="Fetch multiple documents by IDs")
async def batch_fetch(ids: List[str]) -> dict:
# See our batch consolidation post
# ...
Domain Actions
@mcp.tool(description="Assign opportunity to sales rep")
async def assign_opportunity(
opportunity_id: str,
assignee: str
) -> dict:
# Business logic here
# ...
But start with search and fetch. They're the foundation that makes your server discoverable and usable by any AI that follows the pattern.
Why This Creates an Ecosystem
Anthropic created MCP as an open protocol in November 2024, and OpenAI adopted it in March 2025. By documenting specific search and fetch requirements for ChatGPT integration, OpenAI is establishing a pattern that other AI systems can follow. When multiple AI platforms adopt compatible interfaces:
For server builders:
- Implement once, work with multiple AI platforms
- Clear contract reduces guesswork
- Standard patterns emerge for common problems
For AI systems:
- No custom integration per data source
- Can reason about "search" and "fetch" generically
- Consistent UX across different backend systems
For end users:
- Connect new data sources with predictable behavior
- AI agents work across platforms
- Less configuration, more value
This is similar to how HTTP standardized web interactions. You don't need custom protocols for every website because they all speak HTTP. With search and fetch, you don't need custom integrations for every data source because they all speak the same MCP interface.
Implementation Checklist
Building an MCP server that meets OpenAI's requirements for ChatGPT integration:
1. Implement search tool:
- ✅ Accepts single query string parameter
- ✅ Returns JSON-stringified results array with id, title, url
- ✅ Wraps in MCP content item with type: "text"
- ✅ Handles natural language queries
- ✅ Returns relevant, ranked results
- ✅ Applies user-specific access controls
2. Implement fetch tool:
- ✅ Accepts single id string parameter
- ✅ Returns JSON-stringified document with id, title, text, url, metadata
- ✅ Wraps in MCP content item with type: "text"
- ✅ Parses composite IDs and routes correctly
- ✅ Returns full document content in text field
- ✅ Handles missing/invalid IDs gracefully
3. Design your ID format:
- ✅ Encodes enough information for routing
- ✅ Easy to parse and extend
- ✅ Consistent across resource types
- ✅ Human-readable for debugging
4. Test with ChatGPT:
- ✅ Configure MCP URL in ChatGPT connector settings
- ✅ Enable deep research mode
- ✅ Ask questions that require iterative search
- ✅ Verify citations link to correct URLs
- ✅ Check that fetched content is complete
Where This Pattern Goes Next
Anthropic created MCP and uses it with Claude. OpenAI documented specific implementation requirements for ChatGPT integration. As MCP adoption grows, other platforms are likely to adopt similar patterns:
- Anthropic (Claude) - Already supports MCP; may document similar search/fetch requirements
- Google - Could adopt search/fetch patterns for Gemini integrations
- Microsoft - May adopt the pattern for Copilot connectors
- Open-source agents - Likely to expect search/fetch as baseline MCP interface
The two-tool standard creates a lowest common denominator that works across platforms. Your MCP server can expose dozens of specialized tools, but if it implements search and fetch with these signatures, it works with any AI that follows the pattern.
OpenAI documented what works for their deep research integration. Other platforms are likely to adopt compatible patterns. Design your MCP server with search and fetch, and you're building for an ecosystem, not just one AI platform.
Key Takeaways
- MCP is Anthropic's protocol: Anthropic created MCP in November 2024; OpenAI adopted it in March 2025 and documented specific integration requirements
- OpenAI's pattern is simple: Two tools (search and fetch) with single-string parameters create a universal interface for ChatGPT integration
) with single-string parameters create a universal interface for ChatGPT integration - Single strings push complexity server-side: Where it belongs—your server parses queries, routes IDs, and handles the details
- Foundation, not complete solution: These tools make your data source discoverable; add domain-specific tools for workflows
- Implementation varies by backend: Vector stores use semantic search, CRMs parse with LLMs, knowledge bases blend approaches
- ID design enables routing: Pack provider, resource type, and identifier into one parseable string format
- Design for the ecosystem: Implement the pattern, and your server works with ChatGPT and potentially other platforms as they adopt similar requirements
OpenAI documented their integration requirements for Anthropic's MCP protocol. Other platforms are likely to follow with compatible patterns. Build search and fetch into your MCP server now, and you're ready for the ecosystem that's forming.
Want to see this in action? Check out our ChatGPT integration guide, look at the field descriptions, or read OpenAI's MCP integration guide and the MCP tool guide.