
arXiv MCP Server: Search Papers, Read Abstracts & Full PDFs with AI (4 Tools, No Auth)

MCPBundles


arXiv hosts more than 2.4 million papers across physics, mathematics, computer science, quantitative biology, and more. Researchers, ML engineers, and academics increasingly want the same thing from their AI stack: search papers, read abstracts, follow citations, track new work in a field, and go deeper into full text — all without leaving the chat.

arXiv does not ship an official MCP server. Community implementations exist on GitHub, but they vary in quality, transport, and maintenance. MCPBundles hosts a dedicated arXiv provider (arxiv) with 4 MCP tools backed by the official arXiv Atom API — no API key required, no self-hosting, no config files. Enable the bundle, connect your AI client, and start searching.

What Your AI Can Do

Search & discovery

| Tool | What it does |
| --- | --- |
| arxiv_search_papers | Advanced search with arXiv query syntax: keywords, authors, titles, categories, date ranges, boolean operators. Sort by relevance, submission date, or last updated. Paginate with offset. Returns up to 100 results per call. |
| arxiv_get_paper | Fetch a single paper by arXiv ID (e.g. 2402.05964 or 2402.05964v2). Returns title, authors, abstract, categories, primary category, DOI, comment, journal ref, arXiv URL, and PDF URL. |
| arxiv_get_papers | Batch fetch up to 20 papers in a single request by comma-separated IDs. Reports which IDs were found and which are missing. Much faster than sequential single calls: with arXiv's 3-second spacing between requests, 20 separate lookups spend at least 57 seconds just waiting. |
| arxiv_read_paper | Download and return the full paper as a PDF for the AI to read. Returns the PDF as an embedded resource that supporting clients can render or parse directly. The result is the actual document, not just a link. |
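Both lookup tools accept IDs with or without a version suffix. As a rough illustration, a client could validate and split such IDs before passing them along. The sketch below assumes only the modern YYMM.NNNNN identifier scheme (older IDs such as hep-th/9901001 are not handled), and the names `parse_arxiv_id` and `ArxivId` are hypothetical, not part of any MCPBundles API.

```python
import re
from typing import NamedTuple, Optional

# Modern arXiv identifiers: YYMM.NNNN or YYMM.NNNNN, optionally followed by vN.
# An optional "arXiv:" prefix is tolerated because people often paste it.
_ARXIV_ID = re.compile(r"^(?:arXiv:)?(\d{4}\.\d{4,5})(?:v(\d+))?$", re.IGNORECASE)


class ArxivId(NamedTuple):
    base: str                # e.g. "2402.05964"
    version: Optional[int]   # e.g. 2, or None when no version was given


def parse_arxiv_id(raw: str) -> ArxivId:
    """Split an ID like '2402.05964v2' into base and version (hypothetical helper)."""
    match = _ARXIV_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not a modern arXiv ID: {raw!r}")
    base, version = match.groups()
    return ArxivId(base, int(version) if version else None)
```

Rejecting malformed IDs locally avoids spending a rate-limited request on a lookup that cannot succeed.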

Query syntax (the operators agents use most)

arXiv's search syntax is what makes the search tool powerful beyond simple keywords:

  • All fields: all:transformer — searches title, abstract, authors, comments
  • Title: ti:"attention is all you need" — exact phrase in title
  • Author: au:"Vaswani" — author name search
  • Category: cat:cs.CL — restrict to a subject class
  • Abstract: abs:"reinforcement learning" — abstract-only search
  • Boolean: AND, OR, ANDNOT — combine clauses
  • Date range: submittedDate:[20240101 TO 20240331] — filter by submission window

Combine operators: e.g. cat:cs.AI AND ti:"large language model" AND submittedDate:[20250101 TO 20260401] for targeted discovery.
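For readers who assemble queries programmatically, here is a minimal sketch that composes the operators above into one query string. The helper names (`field`, `date_range`, `all_of`) are made up for illustration; only the field prefixes and the AND operator come from the syntax described in this section.

```python
def field(prefix: str, value: str) -> str:
    """Render one clause, quoting values that contain whitespace."""
    needs_quotes = any(ch.isspace() for ch in value)
    return f'{prefix}:"{value}"' if needs_quotes else f"{prefix}:{value}"


def date_range(start: str, end: str) -> str:
    """Render a submittedDate window from two YYYYMMDD strings."""
    return f"submittedDate:[{start} TO {end}]"


def all_of(*clauses: str) -> str:
    """Join clauses so that every one of them must match."""
    return " AND ".join(clauses)


# Rebuilds the combined example from the paragraph above.
query = all_of(
    field("cat", "cs.AI"),
    field("ti", "large language model"),
    date_range("20250101", "20260401"),
)
print(query)
# cat:cs.AI AND ti:"large language model" AND submittedDate:[20250101 TO 20260401]
```

Building clauses separately keeps quoting rules in one place, which matters once titles or author names contain spaces.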

Agents that use the cat: filter should know these common labels:

  • cs.AI — Artificial Intelligence
  • cs.CL — Computation and Language (NLP)
  • cs.CV — Computer Vision and Pattern Recognition
  • cs.LG — Machine Learning
  • stat.ML — Machine Learning (Statistics)
  • cs.CR — Cryptography and Security
  • cs.SE — Software Engineering
  • math.OC — Optimization and Control
  • q-fin — Quantitative Finance

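These classes cover most LLM, RLHF, vision, security, and optimization work, which makes them good defaults for filters and alerts. Note that q-fin is an archive rather than a single category: its subject classes carry a suffix, such as q-fin.PM (Portfolio Management) or q-fin.TR (Trading and Market Microstructure), so a cat: filter may need the full class name.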

Real Workflows

"Find the latest papers on RLHF from the last month."

Your agent runs arxiv_search_papers with all:RLHF AND cat:cs.CL, sort_by: submittedDate, and a submittedDate range covering the last month. The call returns titles, authors, abstracts, and PDF links, which the agent can summarize into a digest without leaving the chat.
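The "last month" window has to be turned into concrete dates at call time. The sketch below shows one way to compute it and assemble the tool arguments. The `sort_by` name appears in this article; the `query` argument name and both function names are assumptions. The date-only YYYYMMDD bounds follow this article's examples; the arXiv API manual documents a longer YYYYMMDDHHMM form in GMT.

```python
from datetime import date, timedelta


def last_days_window(today: date, days: int = 30) -> str:
    """Build a submittedDate clause covering the `days` days up to `today`."""
    start = today - timedelta(days=days)
    return f"submittedDate:[{start:%Y%m%d} TO {today:%Y%m%d}]"


def rlhf_search_arguments(today: date) -> dict:
    """Arguments an agent might pass to arxiv_search_papers (names partly assumed)."""
    return {
        # "query" is an assumed argument name, not confirmed by the article.
        "query": f"all:RLHF AND cat:cs.CL AND {last_days_window(today)}",
        # "sort_by" and its value are taken from the workflow described above.
        "sort_by": "submittedDate",
    }


print(rlhf_search_arguments(date.today()))
```

Passing `today` in explicitly, rather than reading the clock inside the helper, keeps the window reproducible when a digest is rerun.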

"Read the full text of arXiv:2402.05964 and summarize the key contributions."

The agent calls arxiv_get_paper to pull metadata and the abstract, then arxiv_read_paper to fetch the complete PDF. The AI reads the paper inline and produces a structured summary of the full content, not just the abstract.

"Compare these three papers on diffusion models."

arxiv_get_papers with 2402.05964,2302.14017,2301.00001 fetches all three in a single request. The AI gets titles, abstracts, authors, and categories side by side for a comparative analysis without three separate calls and rate-limit waits.
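When a reading list is longer than the 20-ID limit, the IDs need to be split into several comma-separated strings. A small sketch of that chunking step follows; the function name `batch_id_strings` is hypothetical, and dropping duplicate IDs is a design choice of this sketch rather than documented tool behavior.

```python
from typing import Iterable, Iterator, List

MAX_BATCH = 20  # per-call limit stated for arxiv_get_papers


def batch_id_strings(ids: Iterable[str], size: int = MAX_BATCH) -> Iterator[str]:
    """Yield comma-separated ID strings, each holding at most `size` unique IDs."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique: List[str] = list(dict.fromkeys(i.strip() for i in ids if i.strip()))
    for start in range(0, len(unique), size):
        yield ",".join(unique[start:start + size])


# The three IDs from the example above fit in a single batch.
for id_string in batch_id_strings(["2402.05964", "2302.14017", "2301.00001"]):
    print(id_string)
# 2402.05964,2302.14017,2301.00001
```

Each yielded string can be passed as one arxiv_get_papers call, so 45 IDs cost three requests instead of 45.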

"Track new papers from a specific research group this quarter."

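The agent runs arxiv_search_papers with au:"Hinton" and submittedDate:[20260101 TO 20260401], sorted by submission date. Because au: matches on author name, tracking a whole group means joining one au: clause per member with OR. The agent returns the latest work and can cross-reference it with results you saved earlier.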

"Literature review on transformer architectures since 2023."

Run arxiv_search_papers for architecture keywords and recent dates, then for each shortlisted paper call arxiv_read_paper to go beyond abstracts. Pair with Semantic Scholar on MCPBundles for citations and related work to widen the review systematically.

Complementary Research Tools on MCPBundles

arXiv gives you the preprint corpus. For a complete research workflow, pair it with other hosted providers:

  • Semantic Scholar — 200M+ papers with citation graphs, author profiles, and recommendations. Enrich an arXiv hit with citation counts and related work.
  • Europe PMC — Biomedical literature when your thread crosses from CS preprints into life sciences.
  • Zenodo — Open-access research outputs, datasets, and software-linked publications for artifacts and supplementary material.

Practical combo: search on arXiv, pull citations through Semantic Scholar, and retrieve datasets from Zenodo — all in the same conversation, one MCPBundles hub.

How It Works

  1. Enable the arXiv bundle on MCPBundles
  2. No credentials needed — arXiv's API is public. Just enable the bundle and you're ready.
  3. Connect your AI client (Claude, ChatGPT, Cursor, or any MCP host) to your MCPBundles workspace
  4. Ask your agent in natural language — e.g. "Search arXiv for papers on MCP and tool use in LLMs, sort by most recent"

arXiv enforces rate limits of roughly 1 request per 3 seconds. MCPBundles respects these limits. For bulk lookups, use arxiv_get_papers (batch up to 20) instead of sequential single calls.
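How MCPBundles spaces its calls internally is not described here. For anyone calling the arXiv API directly, the sketch below shows one simple way to keep requests at least 3 seconds apart, plus the arithmetic behind the batching advice. The class and function names are hypothetical, and the clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(
        self,
        interval: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous permitted call

    def wait(self) -> float:
        """Sleep if the previous call was too recent. Returns the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay


def min_wait_seconds(calls: int, interval: float = 3.0) -> float:
    """Minimum total spacing imposed on `calls` sequential requests."""
    return max(0, calls - 1) * interval


print(min_wait_seconds(20))  # 57.0 seconds for 20 single lookups
print(min_wait_seconds(1))   # 0.0 seconds for one batched call
```

Calling `limiter.wait()` immediately before each request is enough; the first call never sleeps, and later calls sleep only for the remainder of the interval.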

Key Concepts

  • No auth required — arXiv's API is free and public. No API key, no OAuth, no signup. Enable the bundle and go.
  • Full PDF reading — Not just metadata and abstracts. arxiv_read_paper delivers the actual PDF as an embedded resource your AI can read.
  • Batch efficiency — arxiv_get_papers fetches up to 20 papers in one call, avoiding the 3-second rate limit penalty between sequential requests.
  • Advanced query syntax — Category filters, date ranges, author search, boolean operators — the same power as the arXiv API, accessible through natural language prompts.
  • Complements citation tools — arXiv is the discovery layer. Pair with Semantic Scholar for citation graphs and impact metrics.

FAQ

Is this the official arXiv MCP?

No. arXiv does not publish an official MCP server. MCPBundles hosts a dedicated arXiv provider that calls the official arXiv Atom API — the same public API that powers third-party arXiv tools.

Do I need an API key?

No. The arXiv API is free and does not require authentication. You enable the arXiv bundle on MCPBundles and start searching immediately — no credentials to configure.

Can my AI read full papers, not just abstracts?

Yes. The arxiv_read_paper tool downloads the full PDF and returns it as an embedded resource. On clients that support PDF reading (like Claude), the AI can read and reason over the complete paper content, not just the abstract.
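For client authors, the sketch below shows how an embedded PDF resource could be unpacked. It assumes the content item follows the Model Context Protocol shape for embedded binary resources (a `resource` object carrying `uri`, `mimeType`, and a base64 `blob`); whether arxiv_read_paper returns exactly this layout is an assumption, and the function name is hypothetical.

```python
import base64
from typing import Any, Dict


def pdf_bytes_from_content(item: Dict[str, Any]) -> bytes:
    """Extract PDF bytes from one tool-result content item of type 'resource'."""
    if item.get("type") != "resource":
        raise ValueError("content item is not an embedded resource")
    resource = item["resource"]
    if resource.get("mimeType") != "application/pdf":
        raise ValueError(f"unexpected MIME type: {resource.get('mimeType')!r}")
    # Binary resources carry their payload as base64 text in the "blob" field.
    data = base64.b64decode(resource["blob"])
    # Every PDF file begins with the "%PDF-" magic bytes.
    if not data.startswith(b"%PDF-"):
        raise ValueError("payload does not start with a PDF header")
    return data
```

Checking both the MIME type and the file header guards against handing a non-PDF payload to a PDF parser.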

What about rate limits?

arXiv enforces rate limits of roughly 1 request per 3 seconds. MCPBundles handles this with a built-in delay between calls. For batch lookups, use arxiv_get_papers to fetch up to 20 papers in a single request instead of calling arxiv_get_paper 20 times.

What is the difference between arXiv and Semantic Scholar MCP?

arXiv is the preprint corpus — what dropped today, what a specific author published, what's in cs.CL this month. Semantic Scholar is a broad literature graph with citations, author profiles, and recommendations across all venues. Use arXiv for discovery and full-text reading; use Semantic Scholar for citation context and related work. Both are available as hosted bundles on MCPBundles.