We met as undergrads at Georgia Tech and come from a DevOps and operations background, so we've seen this firsthand. Each of us has over 15 years of experience building high-reliability systems, starting in the early days with satellite earth station monitoring. As interns we once wrote a bug that caused a 32-meter antenna to try to point down through the earth, almost flattening the building we were in. It was a great environment to learn about engineering reliability. We applied that experience to monitoring Java app servers, SOA, SaaS observability, and cloud data warehouses. What if we could use a form of observability data to automatically test the reliability of new deployments before they hit production? That's the idea that got us started on Speedscale.
Most test automation tools record browser interactions or use AI to generate a set of UI tests. Speedscale works differently: it captures API calls at the source using a Kubernetes sidecar [3] or a reverse proxy, so we can see all the traffic going in and out of each service, not just the UI. We feed that traffic through an analyzer that detects calls to external services and emulates a realistic request and response, even for authentication systems like OAuth =). Instead of guessing how users call your service, Speedscale's automation reflects reality, because the data was collected from your live system. We call each interaction model a Scenario, and Speedscale generates them without human effort, leading to an easily maintained, full-coverage CI test suite.
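To make the capture side concrete, here is a rough sketch in Go of a recording reverse proxy. This is only an illustration of the general technique, not Speedscale's actual code; the upstream address, listen port, and what gets logged are invented for the example.

    package main

    // A bare-bones recording reverse proxy: forward every request to an
    // upstream service and log the request/response pair so it could later
    // be turned into a replayable scenario. Address, port, and log format
    // are hypothetical.

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // Hypothetical upstream service we are transparently fronting.
        upstream, err := url.Parse("http://localhost:8080")
        if err != nil {
            log.Fatal(err)
        }

        proxy := httputil.NewSingleHostReverseProxy(upstream)

        // Record each proxied call; a real capture layer would persist full
        // headers and bodies, including calls your service makes to backends.
        proxy.ModifyResponse = func(resp *http.Response) error {
            log.Printf("captured %s %s -> %d",
                resp.Request.Method, resp.Request.URL.Path, resp.StatusCode)
            return nil
        }

        log.Println("capture proxy listening on :9000")
        log.Fatal(http.ListenAndServe(":9000", proxy))
    }

Conceptually, the sidecar deployment does the same thing, just running as an extra container alongside your service in the pod.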
Scenarios can run on demand or in your build pipeline: Speedscale inserts your container into an ephemeral environment and stresses it with different performance, regression, and chaos scenarios. If it breaks, you decide the alerting threshold. Speedscale is especially effective at ensuring compliance with subtle Service Level Objective (SLO) conditions like performance regression [4].
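As a toy illustration of the kind of SLO gate a pipeline step might enforce after replaying traffic (again, a sketch under assumed inputs, not our actual report format), suppose the replay produced a list of observed latencies and we want the build to fail when the p99 exceeds an objective:

    package main

    // A toy SLO gate for a CI step: after replaying a captured scenario
    // against the candidate container, compute the p99 latency and fail the
    // build if it exceeds the objective. Sample values and threshold are
    // invented.

    import (
        "fmt"
        "os"
        "sort"
        "time"
    )

    // p99 returns the 99th-percentile value of the observed latencies.
    func p99(latencies []time.Duration) time.Duration {
        sorted := append([]time.Duration(nil), latencies...)
        sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
        return sorted[int(float64(len(sorted)-1)*0.99)]
    }

    func main() {
        // Pretend these latencies came out of a replay run.
        observed := []time.Duration{
            42 * time.Millisecond, 55 * time.Millisecond, 61 * time.Millisecond,
            48 * time.Millisecond, 73 * time.Millisecond,
        }

        const objective = 200 * time.Millisecond // hypothetical p99 SLO

        if got := p99(observed); got > objective {
            fmt.Printf("FAIL: p99 latency %v exceeds SLO of %v\n", got, objective)
            os.Exit(1) // a non-zero exit fails the pipeline step
        }
        fmt.Println("PASS: p99 latency within SLO")
    }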
We're not public yet, but we'd be happy to give you a demo if you contact us at [email protected]. We are also doing alpha customer deployments to refine our feature set and protocol support; if you have this problem or have tried to solve it in the past, we would love to get your feedback. Eventually we'll sell the service via a subscription model, but the details are still TBD. For the moment we're mainly focused on making the product more useful and collecting feedback. Thanks!
[1] https://services.google.com/fh/files/misc/state-of-devops-20...
[2] https://aws.amazon.com/builders-library/automating-safe-hand...
[3] https://kubernetes.io/blog/2015/06/the-distributed-system-to...
[4] https://landing.google.com/sre/sre-book/chapters/service-lev...