Reducto Studio (YC W24) – Build accurate document pipelines, fast

Read Reducto Studio's YC W24 launch story, originally shared on Hacker News.

Hi HN! We’re Adit and Raunak, co-founders of Reducto (YC W24, https://reducto.ai). Reducto turns unstructured documents (e.g., PDFs, scans, spreadsheets) into structured data. This data can then be used for retrieval, passed into LLMs, or used elsewhere downstream.

We started Reducto when we realized how many of today’s AI applications depend on good-quality data. Everyone knows that good inputs lead to better outputs, but 80% of the world’s data is still trapped inside things like messy PDFs and spreadsheets. Raunak and I launched a very early MVP for parsing and extracting data from unstructured documents, and we were lucky to get a lot of interest from technical teams once they realized the accuracy was beyond what they had seen before.

We started by just releasing an API for engineers to build with, but over time we realized that an accurate API was only part of the puzzle. Our customers wanted to easily set up multi-step pipelines, evaluate and iterate on performance for their use case, and work with non-engineering teammates who were also involved in the real-world document processing flow.

That’s why we’re launching Reducto Studio, a web platform that sits on top of our APIs for users to build and iterate on end-to-end document pipelines.

With Studio, you can:

- Drop an entire file set and get per-field and per-document accuracy scores against your eval data.

- Auto-generate and continuously optimize extraction schemas to hit production-grade quality fast.

- Save every run, iterate on parse/extract configs, and compare results side-by-side.
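To make "per-field and per-document accuracy" concrete, here is a minimal sketch of how such scores could be computed against an eval set. It is not Reducto's scoring method or API: the function name `score_extractions`, the dict-of-dicts data shape, and the exact-match comparison are all assumptions made for illustration.

```python
from collections import defaultdict


def score_extractions(predictions, expected):
    """Hypothetical helper: exact-match accuracy per field and per document.

    Both arguments map document id -> {field name -> value}. Exact string
    equality is an assumption; a real evaluation might normalize values or
    allow fuzzy matches.
    """
    field_hits = defaultdict(int)
    field_totals = defaultdict(int)
    per_document = {}

    for doc_id, truth in expected.items():
        predicted = predictions.get(doc_id, {})
        hits = 0
        for field, want in truth.items():
            # A field that is missing from the prediction counts as wrong.
            correct = field in predicted and predicted[field] == want
            field_totals[field] += 1
            field_hits[field] += int(correct)
            hits += int(correct)
        # A document with no expected fields is treated as fully correct.
        per_document[doc_id] = hits / len(truth) if truth else 1.0

    per_field = {f: field_hits[f] / field_totals[f] for f in field_totals}
    return per_field, per_document


# Made-up eval data, for illustration only.
expected = {
    "invoice_1": {"total": "120.00", "vendor": "Acme"},
    "invoice_2": {"total": "75.50", "vendor": "Globex"},
}
predictions = {
    "invoice_1": {"total": "120.00", "vendor": "Acme"},
    "invoice_2": {"total": "75.50", "vendor": "Globex Inc"},
}

per_field, per_document = score_extractions(predictions, expected)
print(per_field)
print(per_document)
```

Keeping the two views separate is the point of the sketch: a field-level score shows which schema fields need work, while a document-level score shows which files are hard.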

You can see some examples here (https://studio.reducto.ai) or you can watch this walkthrough: https://www.loom.com/share/b243551741c642c6a594c00353fcecb3.

If you’d like to upload your own document, you can log in and do so as well. We don’t make you book a demo or put a payment down to try it.

Thanks for reading and checking it out! This is only the first step for Studio, so we’d love feedback on anything: UX rough edges (we know they’re there!), features that would make evaluations better for you, hard documents you’ve had trouble with, or anything else about wrangling unstructured data.
