ChemRxiv Is Now in Valyu: Chemistry Preprints for AI Agents

ChemRxiv is now a first-class source in Valyu. Thousands of chemistry preprints across organic, materials, catalysis, computational chemistry and biochemistry are searchable in natural language through the Search and DeepResearch APIs, structured the way agents need them, with every result traced back to the primary source.

Point an agent at every chemistry preprint on ChemRxiv with one call:

Python

from valyu import Valyu

# Create a client with your API key
valyu = Valyu(api_key="val_***")

# Restrict the search to the ChemRxiv source
response = valyu.search(
    "Total synthesis of complex natural products",
    included_sources=["valyu/valyu-chemrxiv"]
)

That's it. Full text, structured the way agents need it, with every result traced back to the primary source.

Why ChemRxiv matters

ChemRxiv is the open-access preprint server for the chemical sciences, owned and run by five chemical societies (the American Chemical Society, the Royal Society of Chemistry, the German Chemical Society, the Chinese Chemical Society and the Chemical Society of Japan) and hosted on Cambridge Open Engage. It is where chemists publish first.

Preprints are the leading edge of the field. Results land on ChemRxiv months, and sometimes years, before they appear in a peer-reviewed journal. They carry the full text, the methods, the supporting information and the datasets, and frequently the negative results and failed routes that journals quietly drop. For chemistry, materials and drug-discovery research, that is the difference between current and stale.

Most peer-reviewed chemistry sits behind a paywall. ChemRxiv is open. We indexed it for full-text semantic search in plain English, so an agent can retrieve the whole paper, not just an abstract, and reason over it.

What's included

Thousands of chemistry preprints spanning organic, inorganic, physical, analytical and computational chemistry, materials science, catalysis, biochemistry, polymers and energy
Full-text multimodal retrieval - we process the complete paper, figures and equations included, not just the abstract
Structured metadata on every result: title, authors, DOI, and publication date
New preprints indexed as they are posted
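
The metadata fields listed above are enough to turn a result into a citation that points back to the primary source. Here is a minimal sketch, assuming each result can be read as a dictionary with the hypothetical keys `title`, `authors`, `doi` and `publication_date`. The real response field names may differ, so check the docs before relying on them.

```python
def format_citation(result: dict) -> str:
    """Build a one-line citation from preprint metadata.

    The keys used here (title, authors, doi, publication_date) are
    assumptions for illustration, not confirmed response field names.
    """
    authors = result.get("authors") or []
    if len(authors) > 2:
        author_text = f"{authors[0]} et al."
    else:
        author_text = " and ".join(authors) or "Unknown authors"
    # Keep only the year from an ISO-style date such as 2025-01-15.
    year = (result.get("publication_date") or "n.d.")[:4]
    title = result.get("title", "Untitled")
    doi = result.get("doi")
    # A DOI resolves to its landing page through the doi.org resolver.
    link = f"https://doi.org/{doi}" if doi else "no DOI available"
    return f"{author_text} ({year}). {title}. {link}"


# Made-up example record, not a real preprint.
example = {
    "title": "An example preprint title",
    "authors": ["A. Author", "B. Author", "C. Author"],
    "doi": "10.0000/example-doi",
    "publication_date": "2025-01-15",
}
print(format_citation(example))
```

Keeping the DOI link in every citation is what lets a reader, or another agent, check a claim against the original preprint.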

We index only the ChemRxiv preprints published under open licences that permit commercial use and text and data mining, so everything you retrieve through Valyu is licensed for use in commercial products.

The scientific stack

ChemRxiv does not stand alone. It joins the academic sources already in Valyu, and that is where it gets powerful:

arXiv - physics, CS, machine learning, maths
PubMed - biomedical literature
bioRxiv - biology preprints
medRxiv - clinical and health preprints
ChemRxiv - chemistry and materials preprints

arXiv, PubMed, bioRxiv and medRxiv gave agents physics, machine learning, biology and medicine. ChemRxiv adds the chemistry layer that sits between them. A single agent can now move from a biological target to a new method to a synthesis route in one research run, across preprints that are months ahead of the journals. See our full data coverage at docs.valyu.ai.

What you can build

Synthesis route planning. Retrosynthesis agents that check proposed routes against the newest published reactions, catalysts and yields before committing lab time.
Reaction-condition optimisation. Agents that mine recent preprints for solvent, catalyst and temperature precedent to design a high-yield screen instead of cold-starting.
Drug discovery. Medicinal-chemistry copilots pulling structure-activity relationships and analog series to inform lead optimisation, cross-referenced with biological targets from PubMed and bioRxiv and compound bioactivity from ChEMBL.
Materials discovery. Agents surveying the latest battery, catalyst and polymer preprints to propose and rank candidate materials.
Literature monitoring. Copilots that watch ChemRxiv and brief the team on new work in a target area the moment it is posted, months before peer review.
Novelty and freedom-to-operate. Agents that cross-reference a proposed compound or route against both preprints and patents to flag prior art early.
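
The literature-monitoring idea above largely comes down to remembering which preprints have already been reported. Here is a rough sketch, assuming each search result can be read as a dictionary with a `doi` key (a hypothetical field name) and that the caller persists the set of seen DOIs between runs.

```python
def pick_new_preprints(results: list[dict], seen_dois: set[str]) -> list[dict]:
    """Return results whose DOI has not been seen before, and record them.

    `doi` is an assumed field name; results without one are skipped
    because they cannot be deduplicated reliably.
    """
    fresh = []
    for item in results:
        # DOIs are case-insensitive, so normalise before comparing.
        doi = (item.get("doi") or "").strip().lower()
        if not doi or doi in seen_dois:
            continue
        seen_dois.add(doi)
        fresh.append(item)
    return fresh


# Two simulated polling runs with made-up records.
seen: set[str] = set()
run_1 = [{"doi": "10.0000/aaa", "title": "First"}, {"doi": "10.0000/bbb", "title": "Second"}]
run_2 = [{"doi": "10.0000/BBB", "title": "Second"}, {"doi": "10.0000/ccc", "title": "Third"}]

print([p["title"] for p in pick_new_preprints(run_1, seen)])
print([p["title"] for p in pick_new_preprints(run_2, seen)])
```

In a real monitor the `results` lists would come from a scheduled search restricted to ChemRxiv, and only the fresh items would be summarised for the team.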

This is not hypothetical - teams are already wiring Valyu into research agents. Here is one developer's writeup on the lessons from building AI agents for biomedical research.

How it works

ChemRxiv is available across both the Search API and the DeepResearch API. In search, there are two ways to reach it.

Automatic routing. Use the standard search endpoint. When a query involves chemistry, we route to ChemRxiv alongside other relevant sources.

Dedicated search. Target ChemRxiv directly by passing it as a source, as in the example above.
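
The two modes differ only in whether `included_sources` is sent. As a small sketch, the helper below builds the JSON body for either mode, using the `query` and `included_sources` fields that appear in the REST example later in this post. The helper name `build_search_body` is hypothetical and not part of any SDK.

```python
import json

CHEMRXIV_SOURCE = "valyu/valyu-chemrxiv"  # source identifier used in this post


def build_search_body(query: str, chemrxiv_only: bool = False) -> dict:
    """Build a search request body for one of the two modes.

    chemrxiv_only=False omits included_sources, leaving source
    selection to automatic routing.
    chemrxiv_only=True restricts the search to ChemRxiv.
    """
    body = {"query": query}
    if chemrxiv_only:
        body["included_sources"] = [CHEMRXIV_SOURCE]
    return body


question = "palladium-catalysed C-H activation"
print(json.dumps(build_search_body(question), indent=2))
print(json.dumps(build_search_body(question, chemrxiv_only=True), indent=2))
```

Automatic routing suits open-ended questions that may span several fields, while the dedicated form is the safer choice when an answer must come from chemistry preprints only.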

Run a full research workflow

Search is one call. For multi-step research, point DeepResearch at ChemRxiv with a research strategy and it will plan, search and synthesise across ChemRxiv and your other sources, then return a cited report.

Python

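from valyu import Valyu

# Create a client with your API key
valyu = Valyu(api_key="val_***")

# Start a DeepResearch task with a question and a strategy for answering it
task = valyu.deepresearch.create(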
    query=(
        "What are the most promising solid-state lithium battery electrolytes "
        "reported recently, and the synthesis routes and conductivities for the "
        "leading candidates?"
    ),
    mode="standard",
    research_strategy=(
        "Prioritise the latest chemistry and materials preprints from ChemRxiv for novel "
        "electrolytes, synthesis routes and reported conductivities. Cross-reference methods "
        "and characterisation against arXiv and the biomedical literature in PubMed. Favour "
        "primary sources from the last 18 months, and cite every claim back to its source."
    ),
)
 
# Wait for the task to finish, then print the cited report
result = valyu.deepresearch.wait(task.deepresearch_id)
print(result.output)

The agent prioritises the newest ChemRxiv preprints, cross-references arXiv and PubMed, and hands back a structured report with every claim traced to its source. No retrieval pipeline to build, no datastore to host.

Works everywhere you build

REST:

Shell

curl https://api.valyu.ai/v1/search \
  -H "Content-Type: application/json" \
  -H "x-api-key: $VALYU_API_KEY" \
  -d '{
    "query": "solid-state electrolyte preprints for lithium batteries",
    "included_sources": ["valyu/valyu-chemrxiv"]
  }'

Remote MCP, connect any MCP client:

Shell

https://mcp.valyu.ai/mcp?valyuApiKey=your-key
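
Because the API key travels in the query string, it helps to build this URL at runtime instead of pasting the key into a config file. Here is a minimal sketch, assuming the key lives in the `VALYU_API_KEY` environment variable used by the curl example above. The helper name `mcp_url` is hypothetical.

```python
import os
from collections.abc import Mapping
from urllib.parse import urlencode

MCP_BASE_URL = "https://mcp.valyu.ai/mcp"  # endpoint shown above


def mcp_url(env: Mapping[str, str] = os.environ, var: str = "VALYU_API_KEY") -> str:
    """Return the remote MCP URL with the API key read from `env`."""
    api_key = env.get(var)
    if not api_key:
        raise RuntimeError(f"Set {var} before connecting an MCP client")
    # urlencode escapes any characters that are not safe in a query string.
    return f"{MCP_BASE_URL}?{urlencode({'valyuApiKey': api_key})}"


# Demonstration with a placeholder key instead of the real environment.
print(mcp_url({"VALYU_API_KEY": "your-key"}))
```

Calling `mcp_url()` with no arguments reads the real environment and fails loudly if the variable is missing.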

Valyu's search tools also plug straight into the Vercel AI SDK, so you can give an agent ChemRxiv in a couple of lines.

Get started

Playground: platform.valyu.ai
API key: platform.valyu.ai ($10 in free credits, no credit card required)
Docs: docs.valyu.ai

Valyu builds the search layer for AI in knowledge work. Our Search and DeepResearch APIs connect agents to authoritative sources across domains - web content, academic papers, chemistry and biomedical preprints, patents, financial filings and clinical trials - through one natural language interface, with every result traced back to the primary source.
