How I built an AI-active Gmail inbox with real context + personalization (without Google’s AI)
I’ve been really interested in making AI actually useful in email — not “generic smart replies,” but an inbox where the right messages reliably turn into drafts that sound like me, reflect reality, and include the context I’d pull manually if I had time.
The catch: the context I need rarely lives inside Gmail.
It lives in:
- HubSpot (who is this person, what’s the account history, are they VIP, what did we promise?)
- Stripe (are they a customer, what plan, what happened with billing?)
- Postgres (internal source of truth: flags, entitlements, state)
- This repo (docs, runbooks, decisions, roadmaps, patterns)
So I built an AI-active Gmail inbox: Gmail stays the event source, but the agent runs inside a GitHub repo with MCP tools that can pull real context and draft replies that are personalized, accurate, and safe.
I did this without relying on Google’s AI. Gmail is plumbing (labels + push notifications). The intelligence and guardrails are mine.
Three non-negotiable constraints guided the build:
- No polling: I want push-driven events, not a cron job hammering APIs.
- No auto-send: drafts only; I approve every message.
- No inbox-only context: the agent must be able to query the systems that actually matter.
What I wanted was a workflow where Claude Code runs inside a GitHub repo (so it can read everything I’ve written and shipped), connects to MCP tools (so it can pull live customer context), drafts a reply in Gmail, and then pings me for approval. No surprises. No polling. No hallucinated promises.
We actually looked at building this with our Fastmail MCP first. I love Fastmail, but their API didn't give us a clean event trigger we could hook into without polling constantly. I hate polling. It feels messy and wasteful.
Gmail, however, has a push notification system that talks to Google Cloud Pub/Sub. That’s the hook.
Here's the war story of how I built a real-time, event-driven pipeline that turns "new important email" into "draft reply ready," with real context + personalization — and the mistakes I had to fix along the way.

The 10,000-foot view
The final flow looks simple, but I didn’t get it right on the first try:
- Gmail receives an email. If it matches a specific filter, it applies a label:
AI_TRIGGER. - Google Cloud Pub/Sub gets a push notification from Gmail saying “history changed for this label.”
- Cloud Functions (Gen2) wakes up, translates that history into concrete Gmail
messageIds, and only then fires aworkflow_dispatchevent to GitHub. - GitHub Actions spins up, runs Claude Code, and connects to the MCPBundles Hub MCP (
https://mcp.mcpbundles.com/hub). My MCPBundles account is already configured with access to a range of SaaS tools (HubSpot, Stripe, etc.), and it’s already authenticated and ready to go with the right context. - Claude uses the Gmail MCP, reads each email by
messageId, drafts a reply in Gmail, and creates a GitHub issue tagged for me with a direct link to that draft.
It’s fast. It’s serverless. And it keeps the heavy lifting (the AI reasoning + tool calls + policy) inside the GitHub Action where I have control and an audit trail.
Why “without Google’s AI”?
Google’s AI features live inside the inbox. That’s not where my best answers come from.
My best answers come from:
- Cross-system context: CRM history, billing state, internal source-of-truth data, and repo docs/runbooks.
- A controlled execution environment: a repo-run agent can read my project context deterministically and follow explicit permissions.
- Safety by design: drafts only, least-privilege tool access, and a clear separation between “signal” (Gmail) and “reasoning” (GitHub Action).
So I use Gmail for what it’s great at (mail + filters + push notifications) and keep the intelligence in my stack.
What “personalization” means in practice
I don’t mean “sprinkle in the sender’s name.” I mean the draft changes based on real state:
- Billing-aware: if Stripe says a trial expired yesterday, the draft includes the exact next step and avoids “we can extend it” unless that’s allowed.
- Account-aware: if HubSpot shows renewal is imminent or the account is high-touch, the draft escalates tone and next actions.
- Product-realistic: if Postgres says a feature flag is off, the draft doesn’t promise functionality; it offers the right workaround.
This is why the agent needs access to tools — and why inbox-only AI wasn’t the answer for me.
Safety first: treat email as hostile input
Email is untrusted input. People can (and will) paste instructions that try to hijack an agent: "ignore your previous instructions," "export secrets," "run this code," etc.

That's why this system:
- Never auto-sends
- Runs with explicit permissions
- Uses tooling guardrails (MCP tools, not arbitrary network access)
- Keeps the trigger side dumb (no secret-rich reasoning in Pub/Sub handlers)
Attempt 1 (naive): “Just send historyId to GitHub”
Gmail push notifications are great: they’re small and privacy-friendly. They don’t include the email body — just the email address and a historyId.
So my first connector was simple: decode Pub/Sub, dispatch GitHub with {emailAddress, historyId, publishedAt} and let Claude sort it out.
It worked… until it didn’t.
What went wrong
- One email ≠ one historyId.
historyIdis a cursor for mailbox changes. Multiple changes can happen quickly, and Gmail will happily emit multiple historyIds close together. - That meant I'd sometimes see multiple GitHub workflow runs for what "felt" like one email event.
- The runner then had to do extra work (list history, pick the right message, etc.), and on bad days the agent hit limits (max-turns) before finishing.

I didn't want "AI woke up" to mean "AI woke up for mailbox churn."
Attempt 2: “Only watch AI_TRIGGER”
I didn’t want the AI waking up for every newsletter or spam message. That gets expensive fast.
Instead of watching the whole INBOX, I set up a Gmail filter to apply a label—let’s call it AI_TRIGGER—to the emails that matter.
So the Cloud Function renews a watch on that specific label ID.
Two key lessons here:
- Gmail watches filter by label ID, not label name.
- If you accidentally default to
INBOX, you’ve basically opted back into “wake up for everything.”
# Cloud Function (Gen2): renew Gmail watch (labelId-only)
def handle(request):
# ... auth setup ...
label_ids = ["Label_AI_TRIGGER_ID"] # label ID for AI_TRIGGER (not the name)
# Tell Gmail: "Only bug me about this label"
body = {
"topicName": "projects/YOUR_GCP_PROJECT/topics/YOUR_TOPIC",
"labelIds": label_ids,
"labelFilterBehavior": "include"
}
# Call users.watch
requests.post(..., json=body)
We run this renewer once a day via Cloud Scheduler because Gmail watches expire after 7 days. Set it and forget it.
What went wrong (still)
Even with label scoping, the push notification is still “history changed.” It’s not “here’s the new message.”
So I still saw bursts: a few different historyIds, close together, for the same general moment in time.
Attempt 3 (final): Gate in GCP, dispatch messageIds to GitHub
The fix was to keep the trigger side “dumb” but not blind:
- Use the Gmail History API once (cheap) to translate history into concrete message IDs.
- Persist minimal state so I don’t re-trigger for the same mailbox range.
This is the smallest unit that makes the GitHub runner clean and deterministic.
My "notify" function does one job: take that signal and kick off the GitHub Action.
I learned this the hard way: make this function fire-and-forget.
At first, I had it raise an error if the GitHub API failed (like a 404 or rate limit). Bad idea. Pub/Sub saw the error, didn't ack the message, and retried. And retried. And retried. I woke up to hundreds of triggered workflows from a single email.
Now, it catches errors, logs them, and returns "ok" no matter what.
# Cloud Function (Gen2): Pub/Sub → messageIds → GitHub workflow dispatch
# (illustrative; names/paths simplified)
def handle(event):
try:
payload = decode_pubsub(event) # { emailAddress, historyId, ... }
incoming_history_id = payload["historyId"]
# 1) Read last processed cursor from GCS
cursor = gcs_read_json("gs://YOUR_BUCKET/gmail/processed_cursor.json").get("processed_history_id")
if cursor is None:
# initialize to avoid backfill on the first ever event
gcs_write_json("gs://YOUR_BUCKET/gmail/processed_cursor.json", {"processed_history_id": incoming_history_id})
return "ok"
# 2) (Optional but helpful) short lease to coalesce bursts
if not acquire_lease("gs://YOUR_BUCKET/gmail/dispatch_lease.json", ttl_seconds=90):
return "ok"
# 3) Ask Gmail: “any messages added since cursor, for AI_TRIGGER label?”
message_ids, new_cursor = gmail_history_list(
start_history_id=cursor,
history_types=["messageAdded"],
label_id="Label_AI_TRIGGER_ID",
)
# Advance cursor so we don’t retrigger for the same mailbox range
gcs_write_json("gs://YOUR_BUCKET/gmail/processed_cursor.json", {"processed_history_id": new_cursor})
if not message_ids:
release_lease(...)
return "ok"
# Trigger GitHub Workflow
dispatch_workflow_with_pat(
# ...
inputs={
"emailAddress": email_address,
"messageIds": ",".join(message_ids),
"publishedAt": published_at
}
)
release_lease(...)
return "ok"
except Exception as e:
# Log it, but DON'T crash. Stop the retry loop.
print(f"Error: {e}")
return "ok"
Step 3: The Agent (Claude Code + MCP)
This is where it gets fun.
The GitHub Action now receives messageIds. It still doesn’t have the email content (privacy win), but it has stable identifiers that map directly to Gmail messages.
It starts Claude Code and connects it to MCPBundles over HTTPS. This is key: Claude is running inside the repo, so it can read code, docs, and blog posts (yes, even the ones you forgot you wrote). That’s where the non-inbox context and the “voice” of your product actually live.
Then MCPBundles fills in the gaps with read access to the systems that matter for email replies:
- HubSpot (who is this person, what account are they on, what’s the history?)
- Stripe (are they a customer, what plan, what’s going on?)
- Postgres (internal state, feature flags, whatever you store as the source of truth)
In this setup, Claude Code is running inside GitHub Actions, not in my local terminal. So the MCP setup needs to happen inside the workflow.
MCP setup (what I ended up doing)
My first version wrote an MCP config file in the runner. It worked, but it was unnecessary complexity for this setup.
Then I hit a stability lesson with anthropics/claude-code-action@v1: the most robust path is to put your Claude Code configuration into settings, not a pile of CLI flags. It matches how Claude Code is configured when you run it inside a repo, and it’s more flexible as your setup grows.
So the cleaner approach was:
- Put the MCP credential in an environment variable:
MCPBUNDLES_API_KEY - Tell Claude Code Action to enable project MCP servers
Example snippet (inside your workflow job), with placeholders only:
- name: Claude Code (automation)
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
settings: |
{
"env": {
"MCPBUNDLES_API_KEY": "${{ secrets.MCPBUNDLES_API_KEY }}"
},
"enableAllProjectMcpServers": true
}
prompt: |
...
claude_args: |
--max-turns 20
GitHub issue creation: you must allow gh issue create
Another “learned the hard way” detail: Claude can only create issues if the action allows it to run the GitHub CLI command. This is separate from GitHub Actions job permissions.
In practice, you need both:
- Job permissions to write issues (
issues: write) - Claude Code
settings.permissions.allowto includeBash(gh issue create:*)
Minimal example:
permissions:
issues: write
# ...
with:
settings: |
{
"permissions": {
"allow": [
"Read",
"Bash(gh issue create:*)",
"Bash(gh issue list:*)"
]
}
}
The prompt (messageIds, one-by-one)
The prompt now tells Claude exactly what to do, deterministically:
Claude handles the rest. It:
- Splits
messageIdsinto a list. - For each
messageId, fetches the email content via Gmail tools. - Creates a Gmail draft reply.
- Opens a GitHub issue tagging me with a link to that draft.
That GitHub issue is the human-in-the-loop checkpoint: it’s where I see the proposed reply, the supporting context the agent used (customer status, account notes, internal state), and the exact link to the Gmail draft.
Why GitHub Actions?
The real reason is simple: Claude Code is running inside my repo.
That means it has read access to the stuff that makes an email response actually good:
- The codebase
- Docs and runbooks
- Context about the product and roadmap
- Patterns from past issues and PRs
Then MCPBundles fills in the rest. If the person emailing me is a customer, Claude can use tools to pull the right context (Gmail content, CRM notes, internal data) without me wiring a custom integration per system.
GitHub Actions is just the execution environment that makes this easy: it runs close to the repo, gives me an audit trail, and keeps the trigger side (GCP) dumb and cheap.
The Result
I receive an email. Five seconds later, a Pub/Sub message fires. Thirty seconds later, a GitHub Action spins up. A minute later, I get a ping on GitHub: "Draft response ready for: [Subject]".
It’s slick. It cuts out the noise. And it forces me to focus only on what actually needs a human touch, with the busywork already done.
If you're building with MCP, think about this pattern: keep your triggers dumb and your agents smart. Let Gmail pass the baton, keep the reasoning in a controlled environment, and make “personalization” mean real context — not just nicer words.