I was inspired to start Meticulous from my time at Dropbox, where we had regular 'bug bashes' for our UX. Five or six engineers would go to a meeting room and click through different flows to try to break what we built. These were effective but time-consuming: they required us to click through the same set of actions each time prior to a release.
This prompted me to start thinking about replaying sessions to automatically catch regressions. You can't replay against production since you might mutate production data or cause side effects. You could replay against staging, but a lot of companies don't have a staging environment that is representative of production. In addition, you need a mechanism to reset state after each replayed session (imagine replaying a user signing up to your web application).
We designed Meticulous with a focus on regressions, which I think are a particularly painful class of bug. They tend to occur in flows that users are actively using, and the number of regressions generally scales with the size and complexity of a codebase, which tends to keep increasing over time.
You can use Meticulous on any website, not just your own. For example, you can start recording a session, then go sign up to (say) amazon.com, then create a simple test which consists of replaying against amazon.com twice and comparing the resulting screenshots. You can also watch recordings and replays on the Meticulous dashboard. Of course, normally you would replay against the base commit and head commit of a PR, as opposed to the production site twice.
Our API is currently quite low-level. The Meticulous CLI allows you to do three things:
1) You can use 'yarn meticulous record' to open a browser which you can then use to record a session on a URL of your choice, like localhost. You can also inject our JS snippet onto staging, local, dev and QA environments if you want to capture a larger pool of sessions. This is intended for testing your own stuff! If you inject our snippet, please ask for the consent of your colleagues before recording their workflows. I would advise against production deployments, because our redaction is currently very basic.
2) You can use 'yarn meticulous replay' to replay a session against a URL of your choice. During replay, we spin up a browser and simulate click events with Puppeteer. Exceptions and network logs are written to disk, and a screenshot is taken at the end of the replay and saved to disk as well.
3) You can use 'yarn meticulous screenshot-diff' to diff two screenshots.
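To give a sense of what the screenshot-diff step boils down to, here's a rough TypeScript sketch of pixel-level diffing using pixelmatch and pngjs. This is not our actual implementation, and the file paths and thresholds are just placeholders:

    import * as fs from "fs";
    import { PNG } from "pngjs";
    import pixelmatch from "pixelmatch";

    // Load the screenshots written to disk by two replays (paths are illustrative).
    const base = PNG.sync.read(fs.readFileSync("base-replay/screenshot.png"));
    const head = PNG.sync.read(fs.readFileSync("head-replay/screenshot.png"));

    const { width, height } = base;
    const diff = new PNG({ width, height });

    // pixelmatch returns the number of differing pixels and writes a visual
    // diff image into diff.data.
    const mismatched = pixelmatch(base.data, head.data, diff.data, width, height, {
      threshold: 0.1, // tolerance for anti-aliasing noise; tune to taste
    });
    fs.writeFileSync("diff.png", PNG.sync.write(diff));

    // Fail if more than 0.1% of pixels changed (an arbitrary cutoff).
    const ratio = mismatched / (width * height);
    if (ratio > 0.001) {
      console.error(`Screenshots differ: ${(ratio * 100).toFixed(2)}% of pixels changed`);
      process.exit(1);
    }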
There are lots of potential use cases here. You could build a system on top of the screenshot diffing to detect major regressions in a UX flow. You could also diff the exceptions encountered during replay to detect new uncaught JS exceptions. We plan to build a higher-level product which will provide some testing out of the box.
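To make the exception-diffing idea concrete, here's a rough sketch. It assumes each replay directory contains an exceptions.json holding an array of message strings; that layout is just an assumption for illustration, not a documented format:

    import * as fs from "fs";

    // Hypothetical layout: each replay directory contains an exceptions.json
    // with an array of exception message strings.
    const readExceptions = (dir: string): string[] =>
      JSON.parse(fs.readFileSync(`${dir}/exceptions.json`, "utf8"));

    const baseExceptions = new Set(readExceptions("base-replay"));

    // Any exception seen in the head replay but not in the base replay is "new".
    const newExceptions = readExceptions("head-replay").filter(
      (e) => !baseExceptions.has(e)
    );

    if (newExceptions.length > 0) {
      console.error("New uncaught exceptions introduced by this change:");
      newExceptions.forEach((e) => console.error(`  - ${e}`));
      process.exit(1);
    }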
Meticulous captures network traffic at record-time and mocks out network calls at replay-time. This isolates the frontend and avoids causing any side effects. However, this approach does have a few problems. The first is that you can't test backend changes or integration changes, only frontend changes. (We are going to make network-stubbing optional, though, so that you can replay against a staging environment if you wish.) The second problem with our approach is that if your API significantly changes, you will need to record a new set of sessions to test against. A third problem is that we don't yet support web applications which rely heavily upon server-side rendering. However, we felt these trade-offs were worth it to make Meticulous agnostic of the backend environment.
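For those curious how record-and-stub works mechanically, here's a minimal sketch of the general technique using Puppeteer request interception. This is not our actual implementation; keying responses on URL alone is a deliberate simplification (real matching also has to consider method, request body, ordering, and so on):

    import puppeteer from "puppeteer";

    interface RecordedResponse {
      contentType: string;
      body: string;
    }

    // Record: capture responses keyed by URL while the page is exercised.
    async function recordNetwork(url: string): Promise<Map<string, RecordedResponse>> {
      const recorded = new Map<string, RecordedResponse>();
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      page.on("response", async (response) => {
        try {
          recorded.set(response.url(), {
            contentType: response.headers()["content-type"] ?? "text/plain",
            body: await response.text(),
          });
        } catch {
          // Some responses (e.g. redirects) have no readable body; skip them.
        }
      });
      await page.goto(url, { waitUntil: "networkidle0" });
      await browser.close();
      return recorded;
    }

    // Replay: answer every request from the recording, so the frontend
    // never talks to a real backend and causes no side effects.
    async function replayWithStubs(url: string, recorded: Map<string, RecordedResponse>) {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.setRequestInterception(true);
      page.on("request", (request) => {
        const hit = recorded.get(request.url());
        if (hit) {
          request.respond({ status: 200, contentType: hit.contentType, body: hit.body });
        } else {
          request.abort(); // a request we never saw at record time: fail it
        }
      });
      await page.goto(url, { waitUntil: "networkidle0" });
      await page.screenshot({ path: "replay.png" });
      await browser.close();
    }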
Meticulous is not going to replace all your testing, of course. I would recommend using it in conjunction with existing testing tools and practices, and viewing it as an additional layer of defense.
We have a free plan where you can replay 20 sessions per month. I've temporarily changed our limit to 250 for the HN launch. Our basic plan is $100/month. The CLI itself is open-source under the ISC license. We're actively discussing open-sourcing the record+replay code.
I'd love for you to play around with Meticulous! You can try it out at https://docs.meticulous.ai. It's rough around the edges, but we wanted to get this out to HN as early as possible. Please let us know what you might want us to build on top of this (visual diffs? perf regressions? dead code analysis? preventing regressions?). We would also love to hear from people who have built out any sort of replay testing at their company. Thank you for reading and I look forward to the comments!