Trellis (YC W24) – AI-powered workflows for unstructured data

Hey HN — We're Jacky and Mac from Trellis (https://runtrellis.com/). We’re building AI-powered ETL for unstructured data. Trellis transforms phone calls, PDFs, and chats into structured SQL format based on any schema you define in natural language. This helps data and ops teams automate manual data entry and run SQL queries on messy data.

There’s a demo video at https://www.youtube.com/watch?v=ib3mRh2tnSo and a sandbox to try out (no sign-in required!) at https://demo.runtrellis.com/. An interesting historical archive of unstructured data we thought it would be interesting to run Trellis on top of are old Enron emails which famously took months to review. We’ve created a showcase demo here: https://demo.runtrellis.com/showcase/enron-email-analysis, with some documentation here: https://docs.runtrellis.com/docs/example-email-analytics.

Why we built this: At the Stanford AI lab where we met, we collaborated with many F500 data teams (including Amazon, Meta, and Standard Chartered), and repeatedly saw the same problem: 80% of enterprise data is unstructured, and traditional platforms can’t handle it. For example, a major commercial bank I work with couldn’t improve credit risk models because critical data was stuck in PDFs and emails.

We realized that our research from the AI lab could be turned into a solution with an abstraction layer that works as well for financial underwriting as it does for analysis of call center transcripts: an AI-powered ETL that takes in any unstructured data source and turns it into a schematically correct table.

Some interesting technical challenges we had to tackle along the way: (1) Supporting complex documents out of the box: We use LLM-based map-reduce to handle long documents and vision models for table and layout extraction. (2) Model Routing: We select the best model for each transformation to optimize cost and speed. For instance, in data extraction tasks, we could leverage simpler fine-tuned models that are specialized in returning structured JSONs of financial tables. (3) Data Validation and Schema Guarantees: We ensure accuracy with reference links and anomaly detection.

After launching Trellis, we’ve seen diverse use cases, especially in legacy industries where PDFs are treated as APIs. For example, financial services companies need to process complex documents like bonds and credit ratings into a structured format, and need to speed up underwriting and enable pass-through loan processing. Customer support and back-office operations need to accelerate onboarding by mapping documents across different schema and ERP systems, and ensure support agents follow SOPs (security questions, compliance disclosures, etc.). And many companies today want data preprocessing in ETL pipelines and data ingestion for RAG.

We’d love your feedback! Try it out at https://demo.runtrellis.com/. To save and track your large data transformations, you can visit our dashboard and create an account at https://dashboard.runtrellis.com/. If you’re interested in integrating with our APIs, our quick start docs are here: https://docs.runtrellis.com/docs/getting-started. If you have any specific use cases in mind, we’d be happy to do a custom integration and onboarding—anything for HN. :)

Excited to hear about your experience wrangling with unstructured data in the past, workflows you want to automate, and what data integration you would like to see.



Get Top 5 Posts of the Week



best of all time best of today best of yesterday best of this week best of this month best of last month best of this year best of 2023 best of 2022 yc s24 yc w24 yc s23 yc w23 yc s22 yc w22 yc s21 yc w21 yc s20 yc w20 yc s19 yc w19 yc s18 yc w18 yc all-time 3d algorithms animation android [ai] artificial-intelligence api augmented-reality big data bitcoin blockchain book bootstrap bot css c chart chess chrome extension cli command line compiler crypto covid-19 cryptography data deep learning elexir ether excel framework game git go html ios iphone java js javascript jobs kubernetes learn linux lisp mac machine-learning most successful neural net nft node optimisation parser performance privacy python raspberry pi react retro review my ruby rust saas scraper security sql tensor flow terminal travel virtual reality visualisation vue windows web3 young talents


andrey azimov by Andrey Azimov