Runops (YC W21) – A better cloud shell for production apps

Hello HN! I'm Andrios, from Runops.io. We're building a proxy for the commands you run in the terminal. It records them in Git, adds code reviews in Slack, and removes sensitive data from the results. It's like the Cloud Shells from GCP/AWS, but with more features and using your local zsh/bash terminal.

You run an AWS CLI command in the terminal and it goes to Runops instead of AWS. Runops adds the command to Git and gets peer reviews (when required) in Slack before sending it to AWS. After it runs, we deliver the results back in the terminal, but with all sensitive data masked. It works for AWS, Kubernetes, databases, and others.
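To make that flow concrete, here is a minimal sketch of the record, review, execute, and mask steps. It is an illustration only, not how Runops is implemented: the `Proxy` class, the callbacks, the in-memory `audit_log` (standing in for Git), and the masking patterns are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical masking rules; a real setup would configure its own.
SENSITIVE_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),          # shape of an AWS access key ID
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email address
]


def mask(text: str) -> str:
    """Replace every match of a sensitive pattern with asterisks."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("****", text)
    return text


@dataclass
class Proxy:
    execute: Callable[[str], str]        # runs the command against the target
    needs_review: Callable[[str], bool]  # policy: does this command need a peer review?
    approve: Callable[[str], bool]       # stand-in for a reviewer answering in chat
    audit_log: List[str] = field(default_factory=list)  # stand-in for Git history

    def run(self, command: str) -> str:
        self.audit_log.append(command)  # record first, so rejected commands are traced too
        if self.needs_review(command) and not self.approve(command):
            return "rejected by reviewer"
        return mask(self.execute(command))  # results are masked before they reach the user
```

Recording the command before the review step means rejected requests still leave a trace, which is usually what an audit trail needs.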

I was leading the Infra team at a Fintech (pismo.io/en), and we wanted to give autonomy to all developers in production. But we couldn’t give them direct access due to compliance requirements. The solution was to have a small number of people (my team) with "full access" to production systems. Engineers would ask us when they needed to run one-off scripts in production. Our goal was to deliver automations so that other teams wouldn't need to ask us to do things. We would build a way for them to do it with compliance, security, and reliability.

It didn't work. We were spending 80% of the time processing the queue of requests, and 20% building automations. The backlog was always increasing, and the team was burning out. Engineers were not happy as their requests took a long time to process and clients were angry at them.

But some nice automations came out of that. For instance, we needed to review ad-hoc reads against the prod database to catch bad queries. So we built a Jenkins pipeline that used Flyway to run SQL queries from Git after code review. Any engineer could run queries in prod, and every query left a trace of who ran it, who reviewed it, when it happened, and why.

When talking to friends at similar companies, I saw the problem was even worse. Some of them weren't even trying to automate: they already had dedicated people for running these scripts, i.e., an ops team. I knew there was a better way, so I set out to build it. I quit this job mid last year, with about 8 months' worth of savings to make this work before I'd need to find a job again. It was tough in the beginning, as I’m an engineer and had to learn sales, marketing, and product management on the job, but after getting the first few customers things started improving.

The goal for Runops is to let any engineer run anything in production as if they had full access, automating as much of security and compliance as possible. When human interaction is needed, we make it synchronous using Slack. Now, instead of having a single team as a bottleneck, you can have everyone do things in production. Concentrating most of the access to AWS, Kubernetes, and databases in one central team is bad. It makes for slow Change Management processes using Jira or other tools, with manual executions at the end. Runops lets you add quick reviews from experts (Infra, DBA, security, etc.) and automates executions.

The primary interface is a CLI, where you run scripts that range from SQL queries to kubectl exec and AWS CLI commands. We don't create new abstractions: you use the same commands and docs already available, and we just proxy them. A nice benefit is replacing VPNs and the 10 client tools/credentials you would need today. We also support templates for custom actions in a bunch of languages.

We built it using GitHub Actions for executing commands. We store configurations and credentials as Actions Secrets, and they get injected when a command requires them. It's nice because we can run anything that goes in a Docker container in <15 seconds. We have plans to improve it beyond Actions by creating a real-time proxy, which will enable a REPL-like experience. Runops doesn't have a web interface. This is on purpose: we don't want to be one more tool engineers have to learn. Most interactions happen with our CLI or Slack. We have a simple admin UI in Retool.
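The idea of injecting credentials only when a command requires them can be sketched in a few lines. This is an assumption-laden illustration, not the actual Runops or GitHub Actions mechanism: the `SECRETS` dictionary and the `run_with_secrets` helper are hypothetical, and a real system would read from a proper secret store.

```python
import os
import subprocess
import sys
from typing import Dict, List

# Hypothetical secret store with dummy values; stands in for wherever credentials live.
SECRETS: Dict[str, str] = {
    "DB_PASSWORD": "s3cret",
    "AWS_SECRET_ACCESS_KEY": "abc123",
}


def run_with_secrets(argv: List[str], required: List[str]) -> str:
    """Run argv with only the named secrets added to its environment."""
    # Start from the current environment, minus any secret that leaked into it.
    env = {k: v for k, v in os.environ.items() if k not in SECRETS}
    for name in required:
        env[name] = SECRETS[name]  # inject just what this command needs
    result = subprocess.run(
        argv, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # The child process sees DB_PASSWORD but not AWS_SECRET_ACCESS_KEY.
    child = "import os; print(sorted(k for k in os.environ if k in ('DB_PASSWORD', 'AWS_SECRET_ACCESS_KEY')))"
    print(run_with_secrets([sys.executable, "-c", child], ["DB_PASSWORD"]))
```

The engineer running the command never handles the credential; it exists only in the environment of the process that needs it.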

We do everything using Lisp. The CLI uses ClojureScript; the REST API uses Clojure. It's great to have the same language everywhere, and Lisp is also a fantastic advantage.

Today we have big Fintechs using Runops. They use it to let developers run commands inside Kubernetes pods (like Rails Runner and Elixir IEx), run SQL and DynamoDB queries, and make internal API calls in private networks using cURL. One of the best parts of building this has been seeing developers doing more production work. Regulated companies that never considered giving this level of autonomy to all developers are changing their minds. It's great to see a tool impacting the culture and increasing trust.

We're really happy we get to show this to you all, thank you for reading about it! Please let us know your thoughts and questions.


