Batch (YC S20) – Replays for event-driven systems


Hello HN!

We are Ustin and Daniel, co-founders of Batch (https://batch.sh) - an event replay platform. You can think of us as version control for data passing through your messaging systems. With Batch, a company can go back in time, see what its data looked like at a certain point and, if it makes sense, replay that piece of data back into its systems.

This idea was born out of getting annoyed by what an unwieldy black box Kafka is. While many folks use Kafka for streaming, plenty of others use it as a traditional messaging system. Historically, these systems have offered very poor visibility into what's going on inside them and (at best) a poor replay experience. This problem is prevalent across pretty much every messaging system. If the messages on the bus are serialized, it is almost guaranteed that you will have to write custom, one-off scripts when working with these systems.

This "visibility" pain point is exacerbated tenfold if you are working with event-driven architectures and/or event sourcing - you must have a way to search and replay events, because you will need to rebuild state in order to bring up new data stores and services. That may sound straightforward, but it's actually really involved. You have to figure out how and where to store your events, how to serialize them, search them, play them back, and how/when/if to prune, delete or archive them.
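To make "rebuild state by replaying events" concrete, here is a minimal sketch in Python. It is not how Batch works internally; the Event fields, the "deposited"/"withdrawn" event kinds and the rebuild() helper are all made up for illustration. The idea is simply that current state is a fold over an ordered event log, and stopping the fold early gives you the state as of an earlier point in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape, invented for this example.
    offset: int    # position of the event in the log
    kind: str      # "deposited" or "withdrawn"
    account: str
    amount: int


def apply(state, event):
    """Return a new state dict with a single event applied."""
    balance = state.get(event.account, 0)
    if event.kind == "deposited":
        balance += event.amount
    elif event.kind == "withdrawn":
        balance -= event.amount
    new_state = dict(state)
    new_state[event.account] = balance
    return new_state


def rebuild(events, up_to_offset=None):
    """Replay events in offset order, optionally stopping at a past offset."""
    state = {}
    for event in sorted(events, key=lambda e: e.offset):
        if up_to_offset is not None and event.offset > up_to_offset:
            break
        state = apply(state, event)
    return state


LOG = [
    Event(0, "deposited", "alice", 100),
    Event(1, "deposited", "bob", 50),
    Event(2, "withdrawn", "alice", 30),
]

current = rebuild(LOG)                   # state after every event
earlier = rebuild(LOG, up_to_offset=1)   # state before alice's withdrawal
```

A new service or data store can be brought up the same way: point it at the log, replay from offset 0, and it arrives at the same state as everything else. The hard parts listed above (storage, serialization, search, retention) are exactly what this toy version leaves out.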

Rather than spending a ton of money on building such a replay platform in-house, we decided to build a generic one and hopefully save everyone a bunch of time and money. We are 100% believers in "buy" (vs "build") - companies should focus on building their core product and not waste time on sidequests. We've worked on these systems before at our previous gigs and decided to put our combined experience into building Batch.

A friend of mine shared this bit of insight with me (that he heard from Dave Cheney, I think?) - "Is this what you want to spend your innovation tokens on?" (referring to building something in-house) - and the answer is probably... no. So this is how we got here!

In practical terms, we give you a "connector" (in the form of a Docker image) that hooks into your messaging system as a consumer and begins copying all data that it sees on a topic/exchange to Batch. Alternatively, you can pump data into our platform via a generic HTTP or gRPC API. Once the messages reach Batch, we index them and write them to a long-term store (we use https://www.elassandra.io). At that point, you can use either our UI or HTTP API to search and replay a subset of the messages to an HTTP destination or into another messaging system.
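As a rough mental model of that ingest -> index -> search -> replay flow, here is a toy in-memory version in Python. To be clear, this is a sketch under assumptions, not Batch's API or storage layer: ReplayStore, its method names, the JSON payloads and the (field, value) index are all hypothetical, and the "destination" is just a callable standing in for an HTTP endpoint or a producer on another messaging system.

```python
import json
from collections import defaultdict


class ReplayStore:
    """Toy stand-in for a replay platform; every name here is hypothetical."""

    def __init__(self):
        self._messages = []              # append-only log of (timestamp, payload)
        self._index = defaultdict(set)   # (field, value) -> positions in the log

    def ingest(self, timestamp, raw):
        """Decode a JSON message, store it, and index its top-level scalar fields."""
        payload = json.loads(raw)
        position = len(self._messages)
        self._messages.append((timestamp, payload))
        for field, value in payload.items():
            if isinstance(value, (str, int)):
                self._index[(field, value)].add(position)
        return position

    def search(self, field, value, start=None, end=None):
        """Return payloads where field == value, optionally within a time window."""
        hits = []
        for position in sorted(self._index.get((field, value), ())):
            timestamp, payload = self._messages[position]
            if start is not None and timestamp < start:
                continue
            if end is not None and timestamp > end:
                continue
            hits.append(payload)
        return hits

    def replay(self, destination, field, value, start=None, end=None):
        """Send every matching message to destination and return the count."""
        count = 0
        for payload in self.search(field, value, start, end):
            destination(json.dumps(payload))
            count += 1
        return count


store = ReplayStore()
store.ingest(10, '{"type": "order_created", "order_id": 1}')
store.ingest(20, '{"type": "order_paid", "order_id": 1}')
store.ingest(30, '{"type": "order_created", "order_id": 2}')

delivered = []  # stands in for an HTTP destination or another message bus
replayed = store.replay(delivered.append, "type", "order_created", start=15)
```

Here only the "order_created" message with timestamp 30 is replayed, because the earlier one falls outside the start=15 window. Keeping the store append-only and putting search on a separate index is the design choice worth noticing: replay never mutates history, it only selects a subset of it and sends it somewhere.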

Right now, our platform is able to ingest data from Kafka, RabbitMQ and GCP PubSub, and we've got SQS on the roadmap. Really, we're cool with adding support for whatever messaging system you need as long as it solves a problem for you.

One super cool thing is that if you are encoding your events in protobuf, we are able to decode them upon arrival on our platform, so that we can index them and let you search for data within them. In fact, we think this functionality is so cool that we really wanted to share it - surely there are other folks that need to quickly read/write encoded data to various messaging systems. We wrote https://github.com/batchcorp/plumber for that purpose. It's like curl for messaging systems and currently supports Kafka, RabbitMQ and GCP PubSub. It's a port from an internal tool we used when interacting with our own Kafka and RabbitMQ instances.

In closing, we would love for you to check out https://batch.sh and tell us what you think. Our initial thinking is to allow folks to pump their data into us for free with 1-3 days of retention. If you need more retention, that'll require $ (we're leaning towards a usage-based pricing model).

We envision Batch becoming a foundational component of your system architecture, but right now, our #1 goal is to lower the barrier to entry for event sourcing and we think that offering "out-of-the-box" replay functionality is the first step towards making this happen.

...And if event sourcing is not your cup of tea, you can still add us to your stack to gain visibility and peace of mind.

OK that's it! Thank you for checking us out!

~Dan & Ustin

P.S. Forgot about our creds:

I (Dan) spent a large chunk of my career working at data centers doing systems integration work. I got exposed to all kinds of esoteric things like how to integrate diesel generators into CMSs and automate VLAN provisioning for customers. I also learned that "move fast and break things" does not apply to data centers haha. After data centers, I went to work for New Relic, followed by InVision, DigitalOcean and most recently, Community (which is where I met Ustin). I work primarily in Go, consider myself a generalist, prefer light beers over IPAs and dabble in metal (music) production.

Ustin is a physicist turned computer scientist and worked towards a PhD on distributed storage over lossy networks. He has spent most of his career working as a founding engineer at startups like Community. He has a lot of experience working in Elixir and Go and working on large, complex systems.
