MutableAI (YC W22) – Automatically clean Jupyter notebooks using AI

Hi HN, I’m Omar the Founder and CEO of MutableAI (YC W22) (https://mutable.ai). We transform Jupyter notebook code into production-quality Python code using a combination of AI (OpenAI codex) and PL metaprogramming techniques.

I'm obsessed with clean code because I've written so much terrible code in the past. I went from being a theoretical physics PhD dropout -> data scientist -> software engineer at Google -> research engineer at DeepMind -> ML engineer at Apple. In that time I've grown to tremendously value code quality. Clean code is not only more maintainable but also more extensible as you can more readily add new features. It even enables you to think thoughts that you may have never considered before.

I want to reduce the cost of clean, production-quality code using AI, and am starting with a niche I'm intimately familiar with (Jupyter), because it's particularly prone to bad code. Jupyter notebooks are beloved by data scientists, but notorious for having spaghetti code that is low on readability, hard to maintain, and hard to move into a production codebase or even share with a colleague. That’s why a Kaggle Grandmaster shocked his audience and recommended that they do not use Jupyter notebooks [1].

MutableAI allows developers to get the best of both worlds: Jupyter’s easy prototyping and visualization, plus greatly improved quality with our AI product. We also offer a full featured AI autocomplete to help prototyping go faster. I think the quadrant of "easy to develop in" and "easy to create high quality code" has been almost empty, and AI can help fill this gap.

Right now there are two ways of manipulating programs: PL techniques for program analysis and transformation, and large scale transformers from OpenAI/DeepMind, which are trained on code treated as text (tokens) and don't look at the tree structure of code (ASTs). MutableAI combines OpenAI Codex / Copilot with traditional PL analysis (variable lifetimes, scopes, etc.) and statistical filters to identify AST transformations that, when successively applied, produce cleaner code.

We use OpenAI's Codex to document and type the code, and for AI autocomplete. We use PL techniques to refactor the code (e.g. extract methods), remove zombie code, and normalize formatting (e.g. remove weird spacing). We use statistical filters to detect opportunities for refactoring, for example when a large grouping of variable lifetimes are suddenly created and destroyed, which can be an opportunity to extract a function.

Some of the PL techniques are similar to traditional refactoring tools, but those tools don’t help you decide when and how to refactor. We use AI and stats to do that, as well as to generate names when the new code needs them.

A tool that reduces the time to productionize code can be compared to having an extra engineer on staff. If you take this seriously, that’s a pretty big market. Stripe Research claims that developer inefficiency is a $300B problem [2]. Just about every tech company would become more efficient through increased velocity, fewer errors, and the ability to tackle more complex problems. It may even become unthinkable to write software without this sort of tool, the same way most people don't write assembly and use a compiler.

You can try the product by visiting our website, https://mutable.ai and creating an account on the setup page https://mutable.ai/setup.html. License keys are on the setup page once you’ve signed up (check your mailbox for an email verification link). I’ve bumped up the budget for free accounts temporarily for the day, I hope you enjoy the product !

In addition to inviting the HN community to try out the product, I’d love it if you would share any tips for reducing code complexity you’ve come across and of course to hear your ideas about this problem and tools to address it.

[1] https://youtu.be/tsGGpe-onZI?t=1067

[2] https://stripe.com/files/reports/the-developer-coefficient.p...



Get Top 5 Posts of the Week



best of all time best of today best of yesterday best of this week best of this month best of last month best of this year best of 2023 best of 2022 yc w24 yc s23 yc w23 yc s22 yc w22 yc s21 yc w21 yc s20 yc w20 yc s19 yc w19 yc s18 yc w18 yc all-time 3d algorithms animation android [ai] artificial-intelligence api augmented-reality big data bitcoin blockchain book bootstrap bot css c chart chess chrome extension cli command line compiler crypto covid-19 cryptography data deep learning elexir ether excel framework game git go html ios iphone java js javascript jobs kubernetes learn linux lisp mac machine-learning most successful neural net nft node optimisation parser performance privacy python raspberry pi react retro review my ruby rust saas scraper security sql tensor flow terminal travel virtual reality visualisation vue windows web3 young talents


andrey azimov by Andrey Azimov