
ChemRxiv is now a first-class source in Valyu. Thousands of chemistry preprints spanning organic chemistry, materials, catalysis, computational chemistry and biochemistry are searchable in natural language through the Search and DeepResearch APIs.
Point an agent at ChemRxiv with one call:
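As a minimal sketch, here is roughly what that call could look like in Python using only the standard library. The endpoint URL, the `x-api-key` header, the `included_sources` field and the `valyu/valyu-chemrxiv` source identifier are assumptions made for illustration; check docs.valyu.ai for the exact names.

```python
import json
import urllib.request

# Assumed values for illustration only; confirm them against docs.valyu.ai.
SEARCH_URL = "https://api.valyu.ai/v1/search"  # assumed endpoint
CHEMRXIV_SOURCE = "valyu/valyu-chemrxiv"       # assumed source identifier


def build_chemrxiv_request(query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a search request restricted to ChemRxiv."""
    payload = {
        "query": query,
        "included_sources": [CHEMRXIV_SOURCE],  # assumed field name
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def search_chemrxiv(query: str, api_key: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_chemrxiv_request(query, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (needs a real API key and network access):
# results = search_chemrxiv("nickel-catalysed C-N cross-coupling", "<your-api-key>")
```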
That's it. Full text, structured the way agents need it, with every result traced back to the primary source.
Why ChemRxiv matters
ChemRxiv is the open-access preprint server for the chemical sciences, owned and run by five chemical societies (the American Chemical Society, the Royal Society of Chemistry, the German Chemical Society, the Chinese Chemical Society and the Chemical Society of Japan) and hosted on Cambridge Open Engage. It is where chemists publish first.
Preprints are the leading edge of the field. Results land on ChemRxiv months, and often years, before they appear in a peer-reviewed journal. They carry the full text, the methods, the supporting information and the datasets, and frequently the negative results and failed routes that journals quietly drop. For chemistry, materials and drug-discovery research, that is the difference between current and stale.
Most peer-reviewed chemistry sits behind a paywall. ChemRxiv is open. We indexed it for full-text semantic search, so an agent can query it in plain English, retrieve the whole paper rather than just the abstract, and reason over it.
What's included
- Thousands of chemistry preprints spanning organic, inorganic, physical, analytical and computational chemistry, materials science, catalysis, biochemistry, polymers and energy
- Full-text multimodal retrieval - we process the complete paper, figures and equations included, not just the abstract
- Structured metadata on every result: title, authors, DOI, and publication date
- New preprints indexed as they are posted
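To make the metadata fields listed above concrete, here is a small sketch of how a client might model a single result. The four fields come from the list; the dictionary keys (`title`, `authors`, `doi`, `publication_date`) and the ISO 8601 date format are assumptions, not the documented response schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PreprintMetadata:
    """Hypothetical client-side model of one search result's metadata."""

    title: str
    authors: list[str]
    doi: str
    publication_date: date

    @classmethod
    def from_result(cls, result: dict) -> "PreprintMetadata":
        # Key names and the date format are assumed for illustration.
        return cls(
            title=result["title"],
            authors=list(result.get("authors", [])),
            doi=result["doi"],
            publication_date=date.fromisoformat(result["publication_date"]),
        )

    @property
    def doi_url(self) -> str:
        """Resolvable link back to the primary source."""
        return f"https://doi.org/{self.doi}"


# Placeholder record, not a real preprint.
example = PreprintMetadata.from_result(
    {
        "title": "Placeholder title",
        "authors": ["A. Author", "B. Author"],
        "doi": "10.0000/placeholder",
        "publication_date": "2024-01-31",
    }
)
```

Keeping the DOI as a typed field makes it straightforward for an agent to cite the primary source alongside every claim it makes.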
We index only the ChemRxiv preprints published under open licences that permit commercial use and text and data mining. Every ChemRxiv preprint you retrieve through Valyu is therefore clean to build products on.
The scientific stack
ChemRxiv does not stand alone. It joins the academic sources already in Valyu, and that is where it gets powerful:
- arXiv - physics, CS, machine learning, maths
- PubMed - biomedical literature
- bioRxiv - biology preprints
- medRxiv - clinical and health preprints
- ChemRxiv - chemistry and materials preprints
arXiv, PubMed, bioRxiv and medRxiv gave agents physics, machine learning, biology and medicine. ChemRxiv adds the chemistry layer that sits between them. A single agent can now move from a biological target to a new method to a synthesis route in one research run, across preprints that are months ahead of the journals. See our full data coverage at docs.valyu.ai.
What you can build
- Synthesis route planning. Retrosynthesis agents that check proposed routes against the newest published reactions, catalysts and yields before committing lab time.
- Reaction-condition optimisation. Agents that mine recent preprints for solvent, catalyst and temperature precedent to design a high-yield screen instead of cold-starting.
- Drug discovery. Medicinal-chemistry copilots pulling structure-activity relationships and analog series to inform lead optimisation, cross-referenced with biological targets from PubMed and bioRxiv and compound bioactivity from ChEMBL.
- Materials discovery. Agents surveying the latest battery, catalyst and polymer preprints to propose and rank candidate materials.
- Literature monitoring. Copilots that watch ChemRxiv and brief the team on new work in a target area the moment it is posted, months before peer review.
- Novelty and freedom-to-operate. Agents that cross-reference a proposed compound or route against both preprints and patents to flag prior art early.
This is not hypothetical - teams are already wiring Valyu into research agents. One developer has published a writeup of the lessons from building AI agents for biomedical research.
How it works
ChemRxiv is available across both the Search API and the DeepResearch API. In search, there are two ways to reach it.
Automatic routing. Use the standard search endpoint. When a query involves chemistry, we route to ChemRxiv alongside other relevant sources.
Dedicated search. Target ChemRxiv directly by passing it as a source, as in the example above.
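The difference between the two modes comes down to whether the request names a source. A minimal sketch, assuming the request body uses an `included_sources` list and that ChemRxiv is identified as `valyu/valyu-chemrxiv` (both names are assumptions; the real ones are in the docs):

```python
from typing import Optional

CHEMRXIV_SOURCE = "valyu/valyu-chemrxiv"  # assumed source identifier


def build_search_payload(query: str, sources: Optional[list[str]] = None) -> dict:
    """Return a search request body.

    With no sources, the payload carries only the query and routing is left
    to the service. With sources, the search is restricted to those sources.
    """
    payload: dict = {"query": query}
    if sources:
        payload["included_sources"] = list(sources)  # assumed field name
    return payload


# Automatic routing: no source is named.
automatic = build_search_payload("room-temperature C-H activation with iron catalysts")

# Dedicated search: ChemRxiv is named explicitly.
dedicated = build_search_payload(
    "room-temperature C-H activation with iron catalysts",
    sources=[CHEMRXIV_SOURCE],
)
```

Automatic routing suits general-purpose agents that cannot know in advance which domain a question falls in; dedicated search suits chemistry-specific tools that want results from ChemRxiv only.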
Run a full research workflow
Search is one call. For multi-step research, point DeepResearch at ChemRxiv with a research strategy and it will plan, search and synthesise across ChemRxiv and your other sources, then return a cited report.
Given a strategy that asks for it, the agent prioritises the newest ChemRxiv preprints, cross-references arXiv and PubMed, and hands back a structured report with every claim traced to its source. No retrieval pipeline to build, no datastore to host.
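A DeepResearch task along those lines might be described as follows. This is a sketch of the request body only: the field names (`input`, `strategy`, `search`, `included_sources`) and the source identifiers are hypothetical, and the actual schema and endpoint are documented at docs.valyu.ai.

```python
import json

# Assumed source identifiers, for illustration only.
SOURCES = ["valyu/valyu-chemrxiv", "valyu/valyu-arxiv", "valyu/valyu-pubmed"]


def build_deepresearch_task(question: str, strategy: str, sources: list[str]) -> dict:
    """Return a hypothetical DeepResearch task body.

    `question` is what the report should answer, `strategy` is free-text
    guidance on how to research it, and `sources` limits where to search.
    """
    return {
        "input": question,                              # assumed field name
        "strategy": strategy,                           # assumed field name
        "search": {"included_sources": list(sources)},  # assumed structure
    }


task = build_deepresearch_task(
    question="What are the most promising recent solid-state electrolyte chemistries?",
    strategy=(
        "Prioritise the newest ChemRxiv preprints, cross-reference arXiv and "
        "PubMed, and cite a source for every claim."
    ),
    sources=SOURCES,
)

print(json.dumps(task, indent=2))
```

The strategy is plain text, so steering the research is a matter of editing a sentence rather than rebuilding a pipeline.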
Works everywhere you build
REST:
Remote MCP, connect any MCP client:
Valyu's search tools also plug straight into the Vercel AI SDK, so you can give an agent ChemRxiv in a couple of lines.
Get started
- Playground: platform.valyu.ai
- API key: platform.valyu.ai ($10 in free credits, no credit card required)
- Docs: docs.valyu.ai
Valyu builds the search layer for AI in knowledge work. Our Search and DeepResearch APIs connect agents to authoritative sources across domains - web content, academic papers, chemistry and biomedical preprints, patents, financial filings and clinical trials - through one natural language interface, with every result traced back to the primary source.