Companies are increasingly turning to generative artificial intelligence (AI) to develop new therapeutics, giving biotech companies with patents on these technologies a distinct advantage in the tech-enabled pharmaceutical future. One of the pioneers in using generative AI to process biological and chemical data in order to design new molecules is biotech company Insilico Medicine. The Company filed a patent application on its mutual information adversarial autoencoder in 2018, and the patent was granted in 2022.
The patent covers Insilico’s proprietary system for using generative AI to produce novel small molecules that can be further synthesized, tested and advanced into clinical assets – the algorithmic backbone of its Chemistry42 engine, which has produced potential new treatments for diseases like fibrosis, cancer, and COVID-19.
The process covered by the patent describes the use of deep neural networks (DNNs) – machine learning modules that process data and predict new outputs in order to “generate novel objects that are indistinguishable from data objects.”
Importantly, the patent covers a computer method for generating an object that satisfies a condition using one or more deep neural networks. Not only does the patent cover using DNNs to design a new molecule, but also the processing of data to arrive at that point, the comparison between the new molecule and the desired molecule, the selection of an object best meeting the desired conditions, obtaining a physical form of the new structure, and validating that physical form.
Fig 1. Combining various state-of-the-art machine learning methods, Chemistry42 delivers diverse, high-quality molecular structures within hours. As the structures are generated, they are dynamically assessed using the reward and scoring modules in the platform.
Putting the Generative AI for Chemistry Patent into Action
The generative AI methodology covered in the patent was later applied by Insilico scientists to generate desired molecular structures for given gene expression changes using an adversarial autoencoder-based architecture. This research was published in Frontiers in Pharmacology in 2020.
In the paper, scientists demonstrated how their generative model could be used to predict drug molecules that would induce a desired change in gene expression. The code for this model – called the Bidirectional Adversarial Autoencoder – was made freely available to the research community online.
At the time of the paper’s publication, machine learning models were beginning to be explored to generate molecules with specific characteristics such as synthetic availability, binding energy, or activity against a given protein target. Insilico scientists used their patented method to scale these conditional models to solve a more complex biological problem – to reveal how drug incubation influences gene expression profiles.
The technology employed by Insilico Medicine recognized that a process as complex as drug discovery required a joint distribution of objects and conditions. Their Bidirectional Adversarial Autoencoder model could simultaneously optimize for pharmacophore properties, structural information, and cellular processes and then extract only molecules that met all desired criteria across a range of conditions.
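The final step described above, keeping only molecules that meet every desired criterion, can be illustrated with a minimal sketch. This is not Insilico's model or code: the candidate names, property names, scores, and thresholds are all hypothetical, and the sketch filters already-scored candidates after the fact, whereas the Bidirectional Adversarial Autoencoder learns a joint distribution of objects and conditions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """A generated molecule with predicted scores (all values hypothetical)."""
    name: str
    scores: Dict[str, float]  # predicted property scores in [0, 1]


def meets_all(candidate: Candidate, thresholds: Dict[str, float]) -> bool:
    """Return True only if every criterion is satisfied.

    A missing score counts as a failure rather than a silent pass.
    """
    return all(
        candidate.scores.get(prop, float("-inf")) >= minimum
        for prop, minimum in thresholds.items()
    )


def select(candidates: List[Candidate], thresholds: Dict[str, float]) -> List[Candidate]:
    """Keep only the candidates that satisfy all criteria at once."""
    return [c for c in candidates if meets_all(c, thresholds)]


# Hypothetical criteria loosely named after the three aspects in the text.
thresholds = {
    "pharmacophore_match": 0.7,
    "structural_similarity": 0.5,
    "expression_match": 0.6,
}

candidates = [
    Candidate("mol_A", {"pharmacophore_match": 0.9, "structural_similarity": 0.6, "expression_match": 0.8}),
    Candidate("mol_B", {"pharmacophore_match": 0.95, "structural_similarity": 0.4, "expression_match": 0.9}),
    Candidate("mol_C", {"pharmacophore_match": 0.7, "structural_similarity": 0.5, "expression_match": 0.6}),
    Candidate("mol_D", {"pharmacophore_match": 0.6, "structural_similarity": 0.9, "expression_match": 0.9}),
]

selected_names = [c.name for c in select(candidates, thresholds)]
print(selected_names)  # ['mol_A', 'mol_C']
```

Requiring all criteria at once is deliberately strict: mol_B scores highest on two properties but is still rejected because it misses the third.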
Fig 2. The Bidirectional Adversarial Autoencoder explicitly separates cellular processes captured in gene expression changes into two feature sets – those related and unrelated to the drug incubation. The model uses related features to produce a drug hypothesis.
Applying Insilico’s Patented Generative AI Technology to New Drugs
In 2020, Insilico Medicine launched Chemistry42, the generative AI drug design portion of its end-to-end Pharma.AI platform, built on its patented technology. Chemistry42 combines deep learning architectures with reinforcement learning and applies them to the chemical space to generate novel molecular structures with predefined properties.
The technology has now been validated through numerous clinical trials. Insilico Medicine’s lead program – a completely novel, potentially first-in-class molecule for the chronic, progressive lung disease idiopathic pulmonary fibrosis (IPF) – was nominated as a preclinical candidate in December 2020, the first drug for an AI-discovered target and generated by an AI system to reach this milestone.
By February 2022, Insilico’s IPF drug had reached Phase 1 clinical trials, under 30 months from discovery. In January 2023, those Phase 1 trials produced positive topline results, and in February 2023 the IPF drug received Orphan Drug Designation from the FDA. Soon, the Company will be announcing the launch of Phase 2 clinical trials with IPF patients.
Meanwhile, Insilico recently announced that its AI-generated drug for COVID-19 and related variants was approved for clinical trials, and it is advancing 30 additional drugs for indications including cancer and central nervous system diseases in its internal pipeline. Pharma companies have partnered with Insilico as well to utilize the Company’s patented technology to advance their own programs, including Fosun Pharma, which is developing anti-cancer therapies with Insilico, and Sanofi, which signed a strategic research collaboration agreement worth $1.2B with Insilico in 2022.
Fig 3. The three-step workflow for a de novo generative experiment using the Chemistry42 platform. Step 1: Input. Users upload their data and configure the platform with the desired properties for the generated structures. Step 2: Generation. An ensemble of 40+ generative models functions in parallel to generate the novel structures. A variety of filters scrutinize the generated molecular structures in the generation phase. The molecular structures are then subjected to multiple sets of reward and scoring modules, classified as either 2D or 3D modules, that dynamically assess the generated structures’ properties according to the predefined criteria. Additional custom scoring modules (such as ADME predictors) can also be integrated into the reward pipeline to prioritize the generated structures. These modules form the backbone of Chemistry42’s multiagent reinforcement learning (RL)-based generation protocol. Generated structures’ scores are fed back to the generative models to reinforce them and guide the generative process toward high-scoring structures; this is called the learning phase. Step 3: Analysis. The generated structures are automatically ranked according to customizable metrics based on their predicted properties, including synthetic accessibility, novelty, diversity, etc. The platform also provides users with interactive tools to monitor generative model performance.
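The generate, filter, score, and feed-back cycle in the caption can be sketched as a toy loop. Everything here is an assumption made for illustration and none of it is Chemistry42 code: a single weighted sampler stands in for the ensemble of generative models, letter "fragments" stand in for chemistry, and the filter and reward functions are hypothetical.

```python
import random
from typing import Dict, List, Tuple

FRAGMENTS = ["A", "B", "C", "D"]  # hypothetical building blocks, not real chemistry
Structure = Tuple[str, ...]


def passes_filters(structure: Structure) -> bool:
    """Hypothetical filter: reject any structure containing fragment 'D'."""
    return "D" not in structure


def score(structure: Structure) -> float:
    """Hypothetical reward: fraction of positions holding the desired fragment 'A'."""
    return sum(1 for f in structure if f == "A") / len(structure)


def generate(weights: Dict[str, float], rng: random.Random, length: int = 4) -> Structure:
    """Sample one structure from the current fragment weights."""
    return tuple(rng.choices(FRAGMENTS, weights=[weights[f] for f in FRAGMENTS], k=length))


def run_generation(rounds: int = 20, batch: int = 50, lr: float = 0.5,
                   seed: int = 0) -> Tuple[Dict[str, float], List[Structure]]:
    rng = random.Random(seed)
    weights = {f: 1.0 for f in FRAGMENTS}  # Step 1: initial configuration
    pool = set()
    for _ in range(rounds):
        # Step 2: generate a batch, then filter it.
        kept = [s for s in (generate(weights, rng) for _ in range(batch))
                if passes_filters(s)]
        # Learning phase: feed scores back so that fragments appearing in
        # high-scoring structures are sampled more often in later rounds.
        for s in kept:
            reward = score(s)
            for f in s:
                weights[f] += lr * reward
        pool.update(kept)
    # Step 3: rank the surviving structures by score (ties broken alphabetically).
    ranked = sorted(pool, key=lambda s: (-score(s), s))
    return weights, ranked


weights, ranked = run_generation()
print("top structures:", ranked[:3])
print("final weights:", weights)
```

The weight update is a simple additive heuristic chosen for brevity, not a full reinforcement learning algorithm. It does show the essential feedback: filtered-out structures never contribute, so the weight for "D" stays at its starting value while fragments in rewarded structures gain weight.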