What is medicinal chemistry? It’s part of discovering new drugs. A drug hunter decides what disease to focus on and then selects ‘targets’: usually proteins whose activity is key to the disease. Then they look for a molecule that can ‘hit’ that target and stimulate a response which will hopefully have beneficial effects. Developing such a molecule that is potent and safe is medicinal chemistry.
Despite it being a crucial part of drug development, this field has relied on trial-and-error approaches—a very expensive way to muddle toward a drug. Where computational tools have been used, they have emphasized the 'best' designs without any awareness of what it would take to physically make the drug in a lab and test it. Our approach is to apply computational methods that know how to make these designs.
We’ve been working on developing machine learning tools to advance the field for the last 3 years. Alpha formed a lab at Cambridge in 2017 to apply machine learning to drug discovery. Matt joined the group and soon some exciting results began to emerge, particularly in the area of how to make molecules. We published the first model to outperform trained human chemists in predicting the outcomes of chemical reactions. Alpha then got Aaron, his former mathematics classmate and debate partner at Oxford, to leave his job for the world of drug discovery.
We decided to focus on the one challenge that exists at almost every step: molecules need to be made. No matter how clever it looks on paper, a molecule is worthless unless it can be tested in a lab. The task of actually making molecules, known as chemical synthesis, is often a challenging problem, involving the combinatorial explosion of games like Go with moves that can’t be defined in a simple rulebook.
You start with a set of simple molecules which can be combined through chemical reactions (each one a ‘move’) to form more and more complex molecules until you arrive at your desired drug candidate; the full sequence of moves is known as the ‘route’. But how to combine these molecules? Trial and error is not an option, given the enormous cost of doing chemistry, and just enumerating all options to a client is unhelpful given that your average molecule can have hundreds of theoretically possible routes. Searching this tree of routes and scoring the viability of each route is where ML becomes very powerful.
We developed a machine-translation approach which takes in reactants and outputs the product of a reaction, very similar to how Google Translate operates. This allows us to score the viability of each move. We combine this with fast tree search algorithms, used in models like AlphaGo, to efficiently search the large combinatorial space of possible reactions.
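A minimal Python sketch of the "score each move, then search the tree" idea, not the actual system. Everything in it is an assumption made for illustration: the molecule names, the REACTIONS table, the scores (standing in for a learned model's confidence in each move) and the PURCHASABLE set are invented. It also searches backwards from the target to purchasable starting materials, and it uses plain best-first search rather than the Monte Carlo tree search used in AlphaGo.

```python
import heapq

# Hypothetical toy data. Each molecule maps to candidate moves, written as
# (score, precursors). The score stands in for a model's confidence (0..1)
# that the reaction from those precursors to the molecule would work.
REACTIONS = {
    "drug": [(0.9, ("amide_acid", "amine")), (0.4, ("halide", "boronate"))],
    "amide_acid": [(0.8, ("ester",))],
    "halide": [(0.95, ("arene",))],
}
# Hypothetical set of starting materials that can simply be bought.
PURCHASABLE = {"amine", "ester", "boronate", "arene"}


def best_route(target):
    """Return (score, steps) for the highest-scoring route, or None.

    A route's score is the product of its move scores. Since every score
    is at most 1, extending a route never raises its score, so the first
    fully solved route popped from the heap is the best one.
    """
    counter = 0  # unique tie-breaker so the heap never compares tuples of names
    heap = [(-1.0, counter, (target,), ())]
    while heap:
        neg_score, _, open_mols, steps = heapq.heappop(heap)
        # Purchasable molecules need no further work.
        open_mols = tuple(m for m in open_mols if m not in PURCHASABLE)
        if not open_mols:
            return -neg_score, list(steps)
        mol, rest = open_mols[0], open_mols[1:]
        # A molecule with no known move is a dead end: nothing is pushed.
        for step_score, precursors in REACTIONS.get(mol, []):
            counter += 1
            heapq.heappush(
                heap,
                (neg_score * step_score, counter, rest + precursors,
                 steps + ((mol, precursors),)),
            )
    return None


if __name__ == "__main__":
    print(best_route("drug"))
```

In this toy table the search prefers the two-step route through "amide_acid" (0.9 x 0.8) over the route through "halide" (0.4 x 0.95). A real search has to deal with far larger branching and with scores that come from a model rather than a lookup table.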
To get this technology in front of users, we're building a cloud-based platform. Clients input the molecule they want to be made, our system designs a route for how to make it, and then the client can order this molecule through our platform. We don’t own a lab, but we partner with chemical manufacturers around the world who execute the routes we design. Combining automated chemical synthesis with compound ordering creates a better experience for the drug hunter who wants to focus on their science and just wants a vial with their compound without the cumbersome process of figuring out how to make it and where to get it from.
All that is what we were working on until the pandemic hit... and now we can answer the second part of the title: COVID Moonshot.
We had just finished YC W20 when a tweet quickly changed our travel (and company) trajectory. A team of scientists at Diamond Light Source in the UK had shown that a selection of chemical fragments were effective at binding to a key part of the virus that causes COVID-19. We realised there were hundreds of chemists sitting at home, with their projects on hold, who could help take these fragments and turn them into genuine drug candidates: an open-science approach to crowdsourcing a new drug. We created a platform where designs could be submitted and hoped for maybe 50 to 100 submissions. In the first few weeks, we received over 4000 submissions from 200 scientists around the world.
This was the start of a COVID Moonshot initiative that we are now helping lead. It is an international consortium of scientists drawn from academia, biotechs, and pharma, all working pro bono or at cost with no IP claims on any resulting drug candidates. The aim is to find an antiviral candidate for COVID-19 by the end of the year—a ‘moonshot’ of a time frame compared with the standard drug discovery paradigm.
That standard paradigm is unfortunately broken when it comes to pandemic-related diseases. Biology and chemistry are hard enough, but things become even more intractable when there is little or no commercial incentive to develop new therapies. Sadly, this explains why promising antibiotic companies like Achaogen go bankrupt and why, even after SARS-CoV brought the Far East to a halt in 2003, we still didn’t invest in coronavirus therapies during the last 17 years. For therapies that only become critical once every few decades, we need a new approach to developing drugs.
We think that drug hunters can learn something from the CS community and its embrace of open source. Similarly to open-source software development, someone has to manage the roadmap and triage suggestions. For Moonshot, the candidate drug submissions are great but we obviously can’t make and test all of them, so how do you pick the most promising ones? Here is where our technology comes in: it can identify which candidates can be synthesized easily. Since in a pandemic you need to move quickly, prioritizing compounds that can be synthesized easily is a natural triaging mechanism. Where a human chemist would take 3-4 weeks, we were able to design synthetic routes for all submissions within 48 hours. The top route designs were then passed on to our chemical manufacturing partner for synthesis. We’ve now experimentally tested over 500 compounds and found several promising candidates which we are now testing further. All data is publicly available on the site: https://postera.ai/covid
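The triage step can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the actual pipeline: the Submission fields, the idea of measuring "easy to synthesize" by a route score plus a step count, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    design_id: str      # hypothetical identifier for a submitted design
    route_score: float  # assumed confidence (0..1) in the best route found
    n_steps: int        # number of reactions in that route


def triage(submissions, budget):
    """Return up to `budget` designs that look easiest to make.

    Designs with no viable route (score 0) are dropped. The rest are
    ranked by higher route score first, then by fewer steps.
    """
    makeable = [s for s in submissions if s.route_score > 0.0]
    ranked = sorted(makeable, key=lambda s: (-s.route_score, s.n_steps))
    return ranked[:budget]


if __name__ == "__main__":
    designs = [
        Submission("A", 0.72, 2),
        Submission("B", 0.38, 2),
        Submission("C", 0.0, 0),   # no route found
        Submission("D", 0.72, 1),
    ]
    for s in triage(designs, budget=2):
        print(s.design_id, s.route_score, s.n_steps)
```

Ranking on synthesizability alone is a deliberate simplification here; in practice it would sit alongside whatever chemists know about how promising each design is.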
Inspired by open-source software, we’re seeing advantages of open-science collaboration in areas where market incentives are lacking. We started with the opportunity to connect drug hunters with the latest ML, but have expanded this into a platform that helps connect scientists with each other. This is particularly needed when it comes to drug discovery logistics: the fragment screens are conducted at Oxford and the Weizmann Institute in Israel, computational methods are done by PostEra in California and Memorial Sloan Kettering Cancer Center in New York, and chemical synthesis is carried out across several countries. Many of the features we are rolling out, such as automated alerts on suggested drug designs, open forum discussions, and live data uploads, feel very akin to a ‘GitHub for drug development’.
Identifying biological mechanisms of diseases and forecasting clinical outcomes are huge problems, but we believe that the chemistry stage of drug discovery can become a reliable industry rather than an artisanal craft. Machine learning tech is a key part and we're still working on it, but our clients have been constantly reminding us that just the logistical aspects of drug discovery are a great source of pain. Science software is also notoriously hard to use so we've learned that combining good UI with good ML should be our ambition. Our current mantra is: ordering a molecule through PostEra should be as easy as ordering a pizza!
We need more researchers, coders and chemists to help us on this journey and we’d love to hear from you if our vision sounds like something you could get on board with! Here are the open positions within the company we are now actively hiring for: https://www.workatastartup.com/companies/13332
Over to you, HN! We're eager to hear your feedback, questions, ideas and experiences in this area.