In a quiet corner of San Francisco’s busy startup ecosystem, Reducto has pulled off something ambitious: a $75 million Series B raise, bringing its total funding to $108 million. The round was led by Andreessen Horowitz (a16z), with participation from Benchmark, First Round Capital, BoxGroup, and Y Combinator. This isn’t just another capital splash; it underscores a deeper shift in how AI applications will handle the messy, real world of documents.
Founded by MIT‑trained engineers Adit Abraham (CEO) and Raunak Chowdhuri (CTO), Reducto is tackling one of the thorniest problems in AI today: extracting structured meaning from unstructured documents. Think PDFs with footnotes, legal redlines, embedded images, messy tables, scanned forms — all the things that break simple OCR or vanilla LLM pipelines. Reducto’s approach marries classical optical character recognition (OCR) techniques with state-of-the-art vision-language models (VLMs) to parse layouts, preserve context, and output AI‑ready structured data.
In the six months following its April 2025 Series A, Reducto’s monthly document throughput reportedly grew 6×, reaching nearly a billion pages. Its client list is revealing: AI-native startups (Harvey, Rogo, Scale AI) alongside heavyweight enterprise names across finance, healthcare, and logistics. The promise is to free AI systems from brittle, hand-tuned ingestion logic and let them reason over documents as a human would.
“Documents contain some of the most valuable data in most industries … they’ve been a bottleneck for making AI useful for real enterprise use cases,” said Abraham.
“Our current customers love us for our best‑in‑class accuracy, and we intend to continue pushing the frontier of document intelligence,” added Chowdhuri.
Reducto delivers its capabilities through two complementary offerings: a developer-first API and Reducto Studio, a UI that lets builders iterate on pipelines — for tasks like splitting multipage files, extracting structured fields, or even editing inside documents. The company is also rolling out a more flexible pricing tier aimed at small startups and researchers, lowering the barrier for experimentation.
Editorial Insight & Market Outlook
Reducto isn’t just riding the AI wave; it’s addressing a foundational plumbing problem in the stack. Language models are powerful in theory, but in practice much enterprise value is locked in PDFs, scanned docs, and heterogeneous formats. If Reducto succeeds, it could become as essential to AI applications as data lakes or vector stores are today.
What’s striking is the timing. The AI sector is shifting from “can we build agentic systems?” to “how do we deploy them reliably in the enterprise?” That shift exposes every fragile link in the stack, and data ingestion is one of the biggest pain points. Legacy tools from AWS, Google, and Microsoft are capable but often break down on messy real-world documents; Reducto claims accuracy gains of 20+ percentage points on benchmark tasks. (Its own blog describes techniques like “agentic OCR,” a multi-pass correction loop.)
But there are real execution risks. Competing document AI products, especially those bundled into the big cloud providers’ platforms, will push aggressively. To stay differentiated, Reducto will need to remain at the bleeding edge in both model quality and operational reliability. Scaling from high-growth AI startups to regulated enterprises (healthcare, finance) means navigating compliance, latency SLAs, and onboarding complexity.
Yet the upside is enormous. If Reducto becomes the de facto ingestion layer for AI-first enterprises, its business could scale rapidly. Moreover, given that it already operates at roughly billion-page monthly scale, its infrastructure moat may deepen: the harder the system is to replicate, the more defensible the business.
In short: Reducto has placed a bet not on another angle of the AI arms race, but on solving one of its most painful infrastructure problems. If it plays it right, the company may be quietly rewriting how document-centric workflows feed into intelligence.