Advanced MCP: Streaming and Approval Gates
Users would ask Claude to "set up all my integrations," and Claude would call our provisioning tool. Then nothing. Users waited 45 seconds staring at a spinner while our server created API keys, configured webhooks, and set up OAuth clients. Most users gave up after 15 seconds, thinking it failed.
Then someone asked Claude to "clean up old bundles," and Claude dutifully deleted everything from the last 6 months. Because we let it.
We needed streaming for long-running work, chunking for large operations, and approval gates for anything scary. Here's what works.
The Problem: Tools That Take Too Long
Our "provision bundle" tool took 30-45 seconds. It had to:
- Create API credentials (5-8s)
- Set up webhooks (8-12s)
- Configure OAuth (10-15s)
- Validate everything (5-10s)
Users saw nothing for 45 seconds. They'd refresh the page, creating duplicate requests. Our database filled with half-configured bundles because users killed the process mid-flight.
Simple loading spinners didn't help—users needed to see progress or they assumed it broke.
Streaming: Show Progress, Keep Users Engaged
Streaming lets you send updates during execution. Instead of waiting 45 seconds for a final response, send status updates every few seconds.
Basic streaming example:
```python
import asyncio
from collections.abc import AsyncIterator

@mcp.tool(description="Provision a new bundle with progress updates")
async def provision_bundle(bundle_id: str) -> AsyncIterator[dict]:
    yield {"status": "starting", "message": "Creating API credentials..."}
    await asyncio.sleep(5)  # stand-in for the real credential work
    yield {"status": "progress", "message": "Setting up webhooks...", "percent": 33}
    await asyncio.sleep(8)
    yield {"status": "progress", "message": "Configuring OAuth...", "percent": 66}
    await asyncio.sleep(10)
    yield {"status": "progress", "message": "Validating configuration...", "percent": 90}
    await asyncio.sleep(5)
    yield {"status": "complete", "message": "Bundle provisioned successfully", "bundle_id": bundle_id}
```
Now users see:
- "Creating API credentials..." (immediately)
- "Setting up webhooks... 33%" (after 5s)
- "Configuring OAuth... 66%" (after 13s)
- "Validating configuration... 90%" (after 23s)
- "Bundle provisioned successfully" (after 28s)
What changed:
- Abandonment dropped from 40% to under 5%
- Support tickets about "broken provisioning" went to zero
- Users waited longer because they saw progress
Streaming tips:
Send updates every 2-5 seconds, not every 100ms. Too many updates slow down the UI.
Include percent complete when you can estimate it:
```python
yield {"status": "progress", "percent": 33, "message": "Step 2 of 4"}
```
Send actual status, not generic messages:
```python
# ✗ Vague
yield {"status": "working", "message": "Processing..."}

# ✓ Specific
yield {"status": "creating_webhooks", "message": "Registering 3 webhook endpoints"}
```
Chunking: Break Large Operations Into Pieces
We had a tool to "sync all provider credentials" that would check 200+ OAuth tokens. It took 90 seconds and timed out half the time. If it failed at token #195, it started over from token #1.
We chunked it:
```python
@mcp.tool(description="Sync provider credentials in batches")
async def sync_credentials(
    cursor: str | None = None,
    batch_size: int = 20
) -> dict:
    # Get the starting point
    start_index = int(cursor) if cursor else 0

    # Process one batch
    tokens = get_tokens(start_index, batch_size)
    results = []
    for token in tokens:
        result = await refresh_token(token)
        results.append(result)

    # Calculate next cursor
    next_index = start_index + len(tokens)
    has_more = next_index < total_token_count()
    next_cursor = str(next_index) if has_more else None

    return {
        "results": results,
        "processed": len(results),
        "next_cursor": next_cursor,
        "has_more": has_more
    }
```
Now Claude calls it like this:
```
sync_credentials()             → returns next_cursor="20"
sync_credentials(cursor="20")  → returns next_cursor="40"
...and so on, until has_more=false
```
What this fixed:
- Timeouts disappeared. Each chunk takes 5-8 seconds instead of 90.
- Failures are cheap. If chunk #7 fails, retry chunk #7, not all 200 tokens.
- Progress is visible. Claude tells users "Synced 60 of 200 credentials..."
Chunking tips:
Make chunks idempotent. If Claude calls the same cursor twice, it should be safe:
```python
# Store processed IDs to avoid duplicates
processed = set(get_processed_ids())
tokens_to_process = [t for t in tokens if t.id not in processed]
```
Keep chunk size reasonable—20-50 items is usually right. Too small means lots of calls, too large risks timeouts.
Return clear completion signals:
```python
return {
    "results": [...],
    "next_cursor": "page_3" if has_more else None,
    "progress": {"completed": 60, "total": 200, "percent": 30}
}
```
Store progress server-side so users can resume after closing the tab:
```python
# Save state
await db.save_sync_state(user_id, {"last_cursor": next_cursor})

# Resume later
last_state = await db.get_sync_state(user_id)
cursor = last_state.get("last_cursor") if last_state else None
```
Human-in-the-Loop: Don't Let Claude Delete Everything
Someone asked Claude to "clean up my old bundles." Claude called delete_bundle for every bundle older than 6 months. Gone. All of them. Including production ones the user needed.
We added approval gates:
```python
@mcp.tool(description="Delete a bundle (requires approval)")
async def delete_bundle(bundle_id: str) -> dict:
    # Get bundle details
    bundle = await get_bundle(bundle_id)

    # Request approval
    approved = await request_approval({
        "action": "delete_bundle",
        "bundle_name": bundle.name,
        "bundle_id": bundle_id,
        "warning": "This action cannot be undone",
        "details": {
            "created": bundle.created_at,
            "tools_count": len(bundle.tools),
            "last_used": bundle.last_used_at
        }
    })

    if not approved:
        return {"status": "cancelled", "message": "User declined deletion"}

    # User approved, proceed
    await actually_delete_bundle(bundle_id)
    return {"status": "deleted", "bundle_id": bundle_id}
```
Now when Claude wants to delete something, users see:
```
Claude wants to delete bundle "Production Slack Integration"

Details:
- Created: 2024-03-15
- Tools: 12
- Last used: 2 hours ago

WARNING: This action cannot be undone

[Cancel] [Approve]
```
What we require approval for:
- Deleting anything
- Revoking credentials or API keys
- Changing production settings
- Spending money (upgrading plans, purchasing add-ons)
- Sharing data externally
Approval tips:
Show a clear diff or summary:
```python
approval_payload = {
    "action": "update_webhook_url",
    "changes": {
        "old_url": "https://old.example.com/webhook",
        "new_url": "https://new.example.com/webhook"
    },
    "impact": "Webhooks will be sent to the new URL immediately"
}
```
Make the default safe. If approval fails or times out, cancel the operation.
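One way to implement that fail-closed behavior is to wrap the approval call in a timeout so that silence is never treated as consent. A minimal sketch; `approve_or_cancel` and its injected callback are illustrative, not part of the MCP spec:

```python
import asyncio
from collections.abc import Awaitable, Callable

async def approve_or_cancel(
    request_approval: Callable[[dict], Awaitable[bool]],
    payload: dict,
    timeout_s: float = 120.0,
) -> bool:
    """Fail closed: a timeout or transport error counts as a rejection."""
    try:
        return await asyncio.wait_for(request_approval(payload), timeout=timeout_s)
    except (asyncio.TimeoutError, ConnectionError):
        return False
```

Passing the approval callback in as a parameter also makes the fail-closed path easy to unit test with a coroutine that never answers.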
Log approvals for audit:
```python
from datetime import datetime, timezone

if approved:
    await log_audit_event({
        "user_id": user_id,
        "action": "delete_bundle",
        "bundle_id": bundle_id,
        "approved_at": datetime.now(timezone.utc),  # timezone-aware for audit trails
        "approved_by": approval_response.user_id
    })
```
Don't ask for approval on read-only operations. Only destructive or expensive ones.
Combining Patterns: Streaming + Chunking + Approval
For really complex operations, combine all three:
```python
from collections.abc import AsyncIterator

@mcp.tool(description="Migrate all bundles to new infrastructure")
async def migrate_bundles() -> AsyncIterator[dict]:
    bundles = await get_all_bundles()

    # Request upfront approval
    approved = await request_approval({
        "action": "migrate_all_bundles",
        "count": len(bundles),
        "warning": "This will temporarily disable bundles during migration",
        "estimated_time": f"{len(bundles) * 5} seconds"
    })
    if not approved:
        # Async generators can't `return` a value, so yield the status and stop
        yield {"status": "cancelled"}
        return

    # Stream progress while chunking
    for i in range(0, len(bundles), 10):
        chunk = bundles[i:i + 10]
        yield {
            "status": "migrating",
            "message": f"Migrating bundles {i + 1} to {i + len(chunk)}",
            "percent": int((i / len(bundles)) * 100)
        }
        for bundle in chunk:
            await migrate_single_bundle(bundle)
            await asyncio.sleep(1)  # Rate limiting

    yield {"status": "complete", "message": f"Migrated {len(bundles)} bundles"}
```
Users see:
- Approval request with details
- Progress updates every 10 bundles
- Final completion message
When Each Pattern Matters
Use streaming when:
- Operations take > 5 seconds
- You can show meaningful progress
- Users need reassurance it's working
Use chunking when:
- Processing > 50 items
- Individual items can fail independently
- Operations might timeout
- You need resumability
Use approval gates when:
- Actions are destructive or irreversible
- Operations cost money
- Changes affect production systems
- Multiple items will be modified
Don't overuse them:
- Streaming a 500ms operation is overkill
- Chunking 5 items is unnecessary
- Approval for reading data is annoying
Key Takeaways
- Users give up after 15 seconds without feedback—stream progress for long operations
- Streaming every 2-5 seconds is ideal—too frequent slows down the UI
- Chunk operations over 50 items and return cursors for resuming
- Make chunks idempotent so retries are safe
- Require approval for destructive operations—show clear diffs and warnings
- Log all approvals for audit trails
- Combine patterns for complex operations: approve upfront, chunk the work, stream progress
- Don't overuse patterns—simple operations should stay simple
Resources
- Anthropic MCP Docs – Streaming and session management
- FastMCP GitHub – Streaming examples and utilities
- MCP Specification – Progress notification spec
Long-running tools need progress updates. Dangerous tools need approval gates. Large operations need chunking. Use them wisely.