The untapped power of multivariate analysis for target identification in cancer drug discovery

With single-gene based drug target identification methods getting saturated, the scientific world is venturing beyond traditional approaches to enhance early drug discovery. We explore how combining multi-disciplinary cutting-edge tools, including multivariate analysis, can help in identifying multi-gene targets and improve patient outcomes through precision medicine.

For years, scientists have relied on differential gene expression algorithms to direct early drug discovery. Based on the principle that a cell activates different genes to define its function, these algorithms have allowed us to analyze data on the properties of individual genes.

Biostatistical approaches are then applied to these datasets to distill a short list of relevant gene targets that influence specific disease pathways, as a basis for developing effective therapies.

    From traditional to modern multi-omics driven multivariate analysis approaches

    However, the traditional approach has become heavily saturated after decades of reliance on monogenetic search strategies, according to Daniel Beck, Vice President, Biomathematics & Bioinformatics at the German-founded global biotech company Indivumed Therapeutics.

    “By concentrating only on singular properties of genes, there is a risk of detecting targets that have been extensively studied before.”

    Daniel Beck, Vice President, Biomathematics & Bioinformatics at Indivumed Therapeutics

    “Such approaches can also neglect important contextual factors. For example, exploring combinatorial effects between genes is crucial as molecules frequently interact with each other to perform biological functions – such multi-gene effects may be overlooked in traditional approaches.”

    Beck highlighted that this has caused a shift towards multivariate methods, which study interactions among multiple variables to account for contextual factors. These methods have the potential to elevate certain genes in the ranking that may not have previously stood out based solely on individual characteristics.
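To make the idea of combinatorial effects concrete, here is a minimal sketch on hypothetical toy data (the gene labels, values, and phenotype are invented for illustration and do not come from Indivumed). Two genes show no correlation with a phenotype when examined one at a time, yet their interaction explains it completely:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy data: expression of two genes (0 = low, 1 = high) in
# eight samples. The phenotype appears only when exactly one gene is high.
gene_a = [0, 0, 1, 1, 0, 0, 1, 1]
gene_b = [0, 1, 0, 1, 0, 1, 0, 1]
phenotype = [a ^ b for a, b in zip(gene_a, gene_b)]

# Univariate view: each gene is scored on its own.
r_a = pearson(gene_a, phenotype)
r_b = pearson(gene_b, phenotype)

# Multivariate view: score the centered interaction term of both genes.
mean_a = sum(gene_a) / len(gene_a)
mean_b = sum(gene_b) / len(gene_b)
interaction = [(a - mean_a) * (b - mean_b) for a, b in zip(gene_a, gene_b)]
r_ab = pearson(interaction, phenotype)

print(f"gene A alone: {r_a:.2f}")
print(f"gene B alone: {r_b:.2f}")
print(f"A x B interaction: {r_ab:.2f}")
```

A single-gene ranking would place both genes at the bottom here, while a model that includes the interaction term would place the pair at the top. Real expression data is noisier and continuous, so this is only an illustration of the principle.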

    Multivariate analyses can incorporate a wide range of variables. One critical set of inputs that has risen in prominence over the last decade is dubbed ‘multi-omics data’, derived from integrated biological interrogation of the genome, transcriptome, proteome, microbiome, and more.

    Additionally, multivariate analyses can include descriptive clinical data, patients’ treatment histories, as well as existing knowledge from publicly available information on contextual biological interactions like protein-protein interaction (PPI) networks. 

    “The mathematical process underlying multivariate target identification involves identifying active ‘gene modules’,” Beck elaborated. “Each module consists of a cluster of genes of interest that exhibit significant biological interactions with each other, mapped out based on the analyses.”
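As a rough illustration of what a 'gene module' can look like computationally, the sketch below treats a module as a connected group of high-scoring genes in an interaction network. This is a simplifying assumption, not Indivumed's proprietary method: the gene labels, edges, scores, and the `active_modules` helper are all hypothetical, and published active-module approaches typically use statistical scoring rather than a fixed threshold.

```python
from collections import defaultdict


def active_modules(edges, scores, threshold, min_size=2):
    """Return connected groups of 'active' genes in an interaction network.

    edges:     iterable of (gene, gene) pairs, e.g. from a PPI network
    scores:    dict mapping gene -> activity score (hypothetical units)
    threshold: genes scoring at or above this value count as active
    """
    active = {gene for gene, score in scores.items() if score >= threshold}

    # Keep only interactions where both partners are active.
    neighbours = defaultdict(set)
    for a, b in edges:
        if a in active and b in active:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Collect connected components with a depth-first search.
    modules, seen = [], set()
    for start in sorted(active):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbours[gene] - component)
        seen |= component
        if len(component) >= min_size:
            modules.append(sorted(component))
    return modules


# Hypothetical interaction network and activity scores.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G5", "G6"), ("G6", "G7")]
scores = {"G1": 0.9, "G2": 0.8, "G3": 0.7, "G4": 0.1,
          "G5": 0.6, "G6": 0.75, "G7": 0.2}

modules = active_modules(edges, scores, threshold=0.5)
print(modules)
```

With these inputs the low-scoring genes G4 and G7 drop out, leaving two candidate modules: one with G1, G2 and G3, and one with G5 and G6.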

    “Such a comprehensive mathematical approach streamlines the target discovery process. It significantly enhances the likelihood of identifying appropriate target genes, while simultaneously assessing their suitability for developing effective therapies.”

    Scientists today have at their disposal an array of tools that can be chosen and combined to enhance the search for novel drug targets. However, drawing on his expertise in artificial intelligence (AI) and robotics, Beck advised careful consideration in choosing these tools, and in timing their use, during the early drug discovery process.

    For instance, machine learning, a subset of AI, is valuable in many scientific fields for making predictions based on recurring data patterns. Yet, it may not be suitable for all tasks across early drug discovery due to various challenges. 

    “First, biological systems are complex, and comprise several, intricate causal relationships,” Beck said. “Finding an appropriate mathematical model to identify patterns within causal relations and ‘solve’ biological problems is difficult – if not nearly impossible!”

    “Second, biological data is multidimensional,” he continued. “With such data types, a mathematical phenomenon called the ‘curse of dimensionality’ comes into play. It implies that as the number of dimensions increases, the amount of data needed to provide meaningful and reliable insights grows exponentially.” And ultimately, collecting such vast amounts of high-quality data is also time-consuming and costly.
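One common way to picture the exponential growth Beck describes is to ask how many samples are needed to place at least one observation in every cell of a grid over the data space. The sketch below assumes, purely for illustration, that each dimension is split into 10 bins; the function name and the bin count are hypothetical choices, not figures from the article.

```python
def samples_for_grid_coverage(dimensions, bins_per_dimension=10):
    """Minimum samples needed to put one observation in every grid cell."""
    return bins_per_dimension ** dimensions


for d in (1, 2, 3, 10, 20):
    print(f"{d:>2} dimensions -> {samples_for_grid_coverage(d):,} samples")
```

Three variables already require 1,000 samples under this assumption, and 20 variables require 10^20, far beyond any realistic patient cohort. Omics datasets routinely measure thousands of variables per sample.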

    These issues can cause AI systems to take undesirable ‘shortcuts’ during analysis, potentially resulting in biased outcomes or unintended correlations. Thus, undue reliance on AI alone could lead to the selection of incorrect targets. 

    [Image: The untapped power of multivariate analysis for target identification in cancer drug discovery. © Indivumed GmbH]

    The promise of screening-ready target packages in drug discovery

    Beck's team of specialized scientists at Indivumed combines state-of-the-art tools to offer screening-ready target packages while avoiding the pitfalls discussed above. “Our proprietary methodology entails a comprehensive pipeline process for target identification,” Beck revealed.

    “Selecting the right target requires consideration of multiple properties, so our team does not rely on pure in-silico target selection alone. Instead, we prioritize established statistical methods and integrate biological augmentation to reduce drop-outs further down in the drug pipeline.”

    Indivumed’s target package development process begins with the systematic curation of high-quality biospecimens and corresponding patient data, facilitated by Indivumed’s extensive global network of selected partner clinics.

    Strict time limits for sample collection and processing are prescribed in Indivumed’s standard operating protocols (SOPs). These SOPs consider critical factors such as cold ischemia time, the interval between removal of a tissue sample and its preservation, which is essential for preserving tissue integrity and providing high-quality inputs downstream.

    Following sample and clinical data collection, the Indivumed team processes the inputs to derive annotated multi-omics and clinical data. 

    To enhance contextual understanding, information from public databases is also integrated using bioinformatic tools. This body of data is then interrogated using descriptive statistical approaches to generate gene modules. 

    “The idea is to associate target candidates with a vast set of data points from different sources, allowing optimized filtering and ranking based on multiple criteria, while reducing labor-intensive manual interpretation,” he added. “We also utilize AI for dimensionality reduction, streamlining the identification of crucial variables associated with survival outcomes within the gene modules.”
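Filtering and ranking on multiple criteria can be sketched with a simple rank-aggregation scheme: rank the candidates separately on each criterion, then order them by the sum of those ranks. This is one generic technique chosen for illustration, not a description of Indivumed's pipeline; the candidate labels, criterion names, and values below are all hypothetical.

```python
def rank_candidates(candidates, criteria):
    """Order candidates by the sum of their ranks across several criteria.

    candidates: dict mapping candidate -> {criterion: value}
    criteria:   list of criterion names; higher values count as better
    """
    rank_sum = {name: 0 for name in candidates}
    for criterion in criteria:
        ordered = sorted(candidates,
                         key=lambda name: candidates[name][criterion],
                         reverse=True)
        for position, name in enumerate(ordered, start=1):
            rank_sum[name] += position
    # Lowest total rank first; ties are broken alphabetically.
    return sorted(rank_sum, key=lambda name: (rank_sum[name], name))


# Hypothetical target candidates scored on three invented criteria.
candidates = {
    "G1": {"survival_association": 0.9, "module_centrality": 0.2, "novelty": 0.3},
    "G2": {"survival_association": 0.6, "module_centrality": 0.8, "novelty": 0.7},
    "G3": {"survival_association": 0.3, "module_centrality": 0.5, "novelty": 0.9},
}
criteria = ["survival_association", "module_centrality", "novelty"]

ranking = rank_candidates(candidates, criteria)
print(ranking)
```

Here G1 leads on the single survival criterion but finishes last overall, while G2 comes out on top because it scores consistently well across all three. This is the kind of reordering a multi-criteria ranking is meant to surface.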

    Indivumed’s focus on quality throughout this process is deliberate, Beck emphasized: integrated multivariate analyses that combine data from the gene level through to phosphoproteomics are highly informative, but the insights are only meaningful if quality control requirements, such as limits on cold ischemia time, are adhered to.

    Aided by the rich contextual information from these steps, Indivumed’s experts then interpret the gene modules to identify potential targets. Subsequently, lab validation is conducted to confirm the target selections.

    Finally, prior to delivering these screening-ready target packages to pharmaceutical partners, the lab team at Indivumed also identifies suitable screening methods that can help isolate therapeutics with the necessary effect on the chosen gene targets. 

    “To file a patent application, we require a gene target accompanied by a screening methodology,” Beck said. “With that in mind, leveraging the comprehensive insights derived through systematic characterization of our target candidates, our bioinformatics team applies advanced techniques to predict effective screening methods to facilitate downstream testing efforts.”

    The exciting future of precision medicine

    Despite recent advances in multi-omics and computational biology, Beck said that many exciting avenues remain to be explored that could further the promise of ‘precision medicine’.

    “One such avenue to enhance precision medicine efforts potentially awaits us in the significant number of unexplored genes lying beyond the current protein-protein interaction (PPI) networks, which only cover 13,000 to 16,000 genes out of the known 20,000 protein-coding genes,” Beck ventured. “Directing our attention to unexplored genes could bring both clinical and economic benefits.”
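The size of that unexplored space follows directly from the figures Beck quotes. The short calculation below uses only those numbers; the function name is a hypothetical helper.

```python
PROTEIN_CODING_GENES = 20_000  # figure quoted in the article


def uncovered(covered, total=PROTEIN_CODING_GENES):
    """Return the count and share of genes outside the PPI networks."""
    missing = total - covered
    return missing, missing / total


for covered in (13_000, 16_000):  # coverage range quoted in the article
    missing, share = uncovered(covered)
    print(f"{covered:,} covered -> {missing:,} uncovered ({share:.0%})")
```

On these figures, between 4,000 and 7,000 protein-coding genes, or 20% to 35% of the total, sit outside current PPI networks.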

    “As we gather more samples, data and insights over time, we gain the ability to increasingly tailor therapeutic targets based on genetic characteristics of specific subgroups within populations. This leads us one step closer each day to developing even more efficacious therapies and improving health outcomes for all,” he concluded optimistically.

    Images Courtesy: Indivumed GmbH