Prequel (YC W21) – Sync data to your customer’s data warehouse

Hey HN! We’re Conor and Charles from Prequel (https://prequel.co). We make it easy for B2B companies to send data to their customers. Specifically, we help companies sync data directly to their customers’ data warehouses, on an ongoing basis.

We’re building Prequel because we think the current ETL paradigm isn’t quite right. Today, it’s still hard to get data out of SaaS tools: customers have to write custom code to scrape APIs, or procure third-party tools like Fivetran to get access to their data. In other words, the burden of data exports is on the customer.

We think this is backwards! Instead, vendors should make it seamless for their customers to export data to their data warehouse. Not only does this make the customer’s life easier, it benefits the vendor too: they now have a competitive advantage, and they get to generate new revenue if they choose to charge for the feature. This approach is becoming more popular: companies like Stripe, Segment, Heap, and most recently Salesforce offer some flavor of this capability to their customers.

However, just as it doesn’t make sense for each customer to write their own API-scraping code, it doesn’t make sense for every SaaS company to build their own sync-to-customer-warehouse system. That’s where Prequel comes in. We give SaaS companies the infrastructure they need to easily connect to their customers’ data warehouses, start writing data to them, and keep that data updated on an ongoing basis. Here's a quick demo: https://www.loom.com/share/da181d0c83e44ef9b8c5200fa850a2fd.

Prequel takes less than an hour to set up: you (the SaaS vendor) connect Prequel to your source database/warehouse, configure your data model (i.e., which tables to sync), and that’s pretty much it. After that, your customers can connect their database/warehouse and start receiving their data in a matter of minutes. All of this can be done through our API or in our admin UI.
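
To make that concrete, here's a rough sketch in Go of what creating a destination through the API could look like. Everything below is illustrative: the endpoint, host, and field names are placeholders, not our actual API surface.

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        // Illustrative only: endpoint, host, and fields are placeholders.
        payload, err := json.Marshal(map[string]any{
            "type": "snowflake",
            "host": "acme.snowflakecomputing.com",
            // ...the customer's warehouse credentials would go here...
        })
        if err != nil {
            log.Fatal(err)
        }
        resp, err := http.Post(
            "https://api.prequel.example/destinations", // hypothetical endpoint
            "application/json",
            bytes.NewReader(payload),
        )
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()
        fmt.Println("destination created:", resp.Status)
    }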

Moving all this data accurately and in a timely manner is a nontrivial technical problem. We potentially have to transfer billions of rows / terabytes of data per day, while guaranteeing that transfers are completely accurate. Since companies might use this data to drive business decisions or in financial reporting, we really can't afford to miss a single row.
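
To make "completely accurate" a little more concrete, here's a minimal sketch of one invariant a sync like this has to uphold: after a transfer, the destination must hold exactly as many rows as the source (in practice you'd compare checksums too, not just counts). The driver, DSNs, and table name below are placeholders.

    package main

    import (
        "database/sql"
        "log"

        _ "github.com/lib/pq" // Postgres driver, purely as an example
    )

    // rowCount returns COUNT(*) for a table. Table names can't be query
    // parameters, so in real code they must come from trusted config,
    // never from user input.
    func rowCount(db *sql.DB, table string) int64 {
        var n int64
        if err := db.QueryRow(`SELECT COUNT(*) FROM ` + table).Scan(&n); err != nil {
            log.Fatal(err)
        }
        return n
    }

    func main() {
        src, err := sql.Open("postgres", "postgres://source-dsn") // placeholder DSN
        if err != nil {
            log.Fatal(err)
        }
        dst, err := sql.Open("postgres", "postgres://dest-dsn") // placeholder DSN
        if err != nil {
            log.Fatal(err)
        }
        if s, d := rowCount(src, "events"), rowCount(dst, "events"); s != d {
            log.Fatalf("events: source has %d rows, destination has %d", s, d)
        }
    }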

There are a few things that make this particularly tricky. Each data warehouse speaks a slightly different dialect of SQL and has a different type system (which is not always well documented, as we've come to learn!). Each warehouse also has slightly different ingest characteristics (for example, Redshift has a hard cap of 16MB on any statement), meaning you need different data loading strategies to optimize throughput. Finally, most of the source databases we read data from are multi-tenant — meaning they contain data from multiple end customers, and part of our job is to make sure that the right data gets routed to the right customer. Again, it's pretty much mission-critical that we don't get this wrong, not even once.
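
As a concrete illustration of the per-warehouse loading logic, here's a simplified sketch (not our production code) of batching serialized rows so that no single statement exceeds a byte cap like Redshift's:

    package loader

    // batchUnderCap splits serialized rows into batches whose combined
    // size stays under a per-statement byte cap (e.g. 16MB for Redshift).
    // Simplified: a real loader also accounts for SQL syntax overhead,
    // escaping, and rows that exceed the cap on their own.
    func batchUnderCap(rows [][]byte, capBytes int) [][][]byte {
        var batches [][][]byte
        var current [][]byte
        size := 0
        for _, row := range rows {
            if len(current) > 0 && size+len(row) > capBytes {
                batches = append(batches, current)
                current, size = nil, 0
            }
            current = append(current, row)
            size += len(row)
        }
        if len(current) > 0 {
            batches = append(batches, current)
        }
        return batches
    }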

As a result, we've invested in extensive testing far earlier than it typically makes sense for startups to. We also tend to write code fairly defensively: we always try to think about the ways our code could fail (or anticipate what bugs might be introduced in the future), and make sure that the failure path is as innocuous as possible. Our backend is written in Go, our frontend is in React + TypeScript (we're big fans of compiled languages!), we use Postgres as our application db, and we run the infra on Kubernetes.
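
To give a flavor of what we mean by defensive, here's a stylized example (again, not our production code) of failing closed when scoping a query to a single tenant, so that a bug upstream can never widen into a cross-tenant leak:

    package store

    import (
        "database/sql"
        "errors"
    )

    // rowsForTenant fails closed: a missing tenant ID is an error, never
    // an unscoped query against a multi-tenant table.
    func rowsForTenant(db *sql.DB, tenantID string) (*sql.Rows, error) {
        if tenantID == "" {
            return nil, errors.New("refusing to run an unscoped query: empty tenant ID")
        }
        // Parameterized, so the tenant filter can't be lost to a
        // string-formatting bug.
        return db.Query(`SELECT id, payload FROM events WHERE tenant_id = $1`, tenantID)
    }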

The last piece we'll touch on is security and privacy. Since we're in the business of moving customer data, we know that security and privacy are paramount. We're SOC 2 Type II certified, and we go through annual white-box pentests to make sure that all our code is up to snuff. We also offer on-prem deployments, so data never has to touch our servers if our customers don't want it to.

It's kind of surreal to launch on here: we’re long-time listeners, first-time callers, and have been surfing HN since long before we first started dreaming about starting a company. Thanks for having us, and we're happy to answer any questions you may have! If you wanna take the product for a spin, you can sign up on our website or drop us a line at hn (at) prequel.co. We look forward to your comments!


