Genomic barcoding for clonal diversity monitoring and control in cell-based complex antibody production – Scientific Reports

Time Stamp: June 24, 2024 8:00 PM

Low transcriptome diversity within cell line development workflow

We were interested in the cellular population diversity at different stages of an isogenic TI CLD platform²². This platform is based on simultaneous dual-plasmid RMCE-mediated targeted integration into a single genomic locus thus generating isogenic cells, which theoretically excludes variability derived from position effects, copy number and epigenetic silencing (Fig. 1a).

We harvested cells at three different stages of the CLD process: the GFP-expressing host cell (“host”), a stable bispecific antibody expression pool with a distinct gene configuration in our TI platform (“stable pool”), and a final producer clone (“clone”) derived from the same stable pool. Each population was transcriptionally profiled at the single-cell level using scRNAseq (Chromium Single Cell 3’ solution) and, after merging and batch correction, we observed no relevant cell population substructures (Fig. 1b). We hypothesized that variability in the expression of genes encoding the recombinant protein is insufficient to contribute to significant global transcriptomic differences during the CLD process.

We therefore specifically analyzed GFP expression within the host population and noticed some degree of variability, with a fraction of cells showing no GFP expression (Fig. 1c, left panel). This variability was more pronounced in stable pools, where a substantial fraction of cells showed no detectable expression of heavy chain 1 (HC1) and heavy chain 2 (HC2; Fig. 1c, middle and right panels). In contrast, we observed substantially smaller variability in HC1 and HC2 expression in a monoclonal cell population. This population had undergone recent single cell cloning and had been pre-selected based on high production performance.

The data indicate that while the overall cellular gene expression is remarkably similar across transfection, stable pool selection, and single cell cloning, transgene expression remains variable in host cells and stable expression pools.

Single-copy targeted barcode integration in stable CHO producer cells

To shed light on the source of transgene expression variability, we aimed to implement a genetic lineage tracing method within our CLD workflow. To achieve this, we developed an exhaustive single-copy genetic barcode labeling method, implemented within a state-of-the-art CLD workflow applicable for therapeutic protein production. Most barcode delivery methods (retroviral-based) lead to inhomogeneous labeling of the population, with possibly no or multiple barcode integrations per individual cell clone. In contrast, implementing a barcode within an isogenic dual-plasmid RMCE-mediated targeted integration into a single genomic locus allows for the selective expansion of clones with mainly single-copy integration²². Notably, the start codon of the puromycin resistance gene is placed on the “Front” expression vector, ensuring that only cells with in-frame, targeted integration survive the selection procedure. Additionally, cells with off-target integration of the expression plasmids retain the thymidine kinase selection marker that is encoded in the landing pad of the host cell line and therefore remain sensitive to FIAU. Overall, only clonal cells undergoing correct on-target recombination between the three LoxP sites become resistant to puromycin and survive in the presence of FIAU. This stringent selection process substantially increases the proportion of single-copy targeted integration survivors.

An N15 barcode region was included in one of the two plasmids (“Back”) adjacent to the lox site outside of the coding sequence (Fig. 2a). The N15 region is placed in close proximity to the genomic area outside the landing pad, allowing discrimination between on-target and off-target integration events by positioning of the primer binding sites during amplicon deep sequencing. To additionally incorporate cell line metadata, we added 10 fixed positions to the N15 region and devised a nucleotide representation of year, number of CLD (in the respective year), host cell line used and expressed biotherapeutic molecule (Fig. 2b). We validated the plasmid library by amplicon deep sequencing and observed a near-uniform barcode representation with homogeneous nucleotide composition at each position (Supplementary Fig. 1a,b). This provides a minimum diversity of > 2 × 10⁷, enough to label 10⁵ cells with < 0.3% collision probability (Supplementary Fig. 1c,d).
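The diversity and collision figures above can be sanity-checked with a short calculation. The sketch below assumes barcodes are drawn uniformly and independently from the library, and it defines "collision probability" as the expected fraction of labeled cells that do not contribute a new distinct barcode; the source does not state which definition was used, so both the definition and the function name are assumptions.

```python
import math


def expected_collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Expected fraction of cells that do not add a new distinct barcode.

    Assumes each cell draws one barcode uniformly at random, with replacement,
    from a library of n_barcodes equally represented variants.
    """
    # P(a given barcode is drawn at least once) = 1 - (1 - 1/N)^n,
    # computed with log1p/expm1 for numerical stability at large N.
    p_seen = -math.expm1(n_cells * math.log1p(-1.0 / n_barcodes))
    expected_distinct = n_barcodes * p_seen
    return 1.0 - expected_distinct / n_cells


theoretical_n15 = 4 ** 15          # all possible 15-mers over A, C, G, T
library_diversity = 2 * 10 ** 7    # minimum validated diversity from the text
labelled_cells = 10 ** 5           # number of cells to label, from the text

fraction = expected_collision_fraction(labelled_cells, library_diversity)
print(f"theoretical N15 sequence space: {theoretical_n15:,}")
print(f"expected collision fraction: {fraction:.4%}")
```

Under these assumptions the fraction comes out at roughly 0.25%, which is consistent with the reported bound of < 0.3%.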

To investigate a representative repertoire of therapeutic proteins produced in CHO cell lines, we selected three different molecules based on the observed viability loss of the cell population during stable pool selection: 5–10%: M1 DutaFab, 1–5%: M2 TCB, < 1%: M3 BS-Fusion (Fig. 2c). We analyzed the clonal diversity of the respective CHO cell pools expressing M1-M3 during stable pool selection at two time points: (i) pre-selection at day five post-transfection and (ii) post-selection on the day the cell population reached ~ 80% viability.

We transfected 4.5 × 10⁶ cells of the host cell line with the respective antibody-encoding TI and Cre recombinase-encoding plasmids by electroporation at day 0, followed by a recovery phase without selection pressure. Selection pressure (+ Puromycin & FIAU) was started at day 5 (“Pre-Selection”) and the lowest cell viability was reached at day 13 for all CHO pools. The recovery time until reaching ~ 80% cell viability differed dramatically depending on the complexity of the encoded molecule and associated gene configuration (M1: day 20, M2: day 22, M3: day 26) (Fig. 2c). This observation was consistent with our previous experience showing that the speed of CHO pool recovery during resistance marker-based stable cell pool selection is linked to the complexity of molecules encoded on the expression plasmids (unpublished observation). DutaFab (M1) expressing cell lines recover quickly, potentially because of the smaller size of the molecule and the correspondingly smaller plasmids. In contrast, TCBs (M2) and BS-Fusion (M3) molecules are complex multi-domain fusion molecules, which makes them increasingly difficult to express for CHO biofactories²⁸. Interestingly, the number of barcodes per pool was approximately 3.5–4.0 times higher at the pre-selection time point (M1: Ø 10060, M2: Ø 6355, M3: Ø 4560) than at post-selection across molecules, indicative of rapid clone loss during the stringent selection process (Fig. 2d). Recovered stable pools consisted of a low total number of barcodes (M1: Ø 2884, M2: Ø 1691, M3: Ø 1158) with a skewed population distribution already at post-selection. Notably, in M3 the most abundant barcode encompassed 10% of the population at the post-selection time point.
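The reported 3.5–4.0-fold reduction follows directly from the mean barcode counts quoted above. A minimal check, using only those numbers (the dictionary and variable names are illustrative):

```python
# Mean barcode counts per pool, as reported in the text.
pre_selection = {"M1": 10060, "M2": 6355, "M3": 4560}
post_selection = {"M1": 2884, "M2": 1691, "M3": 1158}

# Fold reduction in barcode number from pre- to post-selection.
fold_change = {m: pre_selection[m] / post_selection[m] for m in pre_selection}

# Fraction of barcodes that survive selection.
retained = {m: post_selection[m] / pre_selection[m] for m in pre_selection}

for molecule in pre_selection:
    print(
        f"{molecule}: {fold_change[molecule]:.2f}-fold fewer barcodes, "
        f"{retained[molecule]:.1%} retained"
    )
```

The fold reduction increases from M1 to M3, in the same order as the recovery times.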

Next, we analyzed population dynamics of the three M1-M3-expressing stable CHO pools for a total of 11 weeks with selection pressure. In all three biological replicates, the number of barcodes detected in each population decreased substantially with loss of 80–87% of barcode variants over the observed time course (Fig. 2e). This indicates that stable CHO pools display rapid clonal dynamics under standardized cell cultivation conditions.

Overall, these experiments demonstrate that CHO producer pools generated by TI display a low diversity, which further depends on the expressed molecules, and undergo rapid population skewing towards dominant clones.

Improving efficiency of limited dilution and alternative proof-of-monoclonality by genetic barcoding

Motivated by the success of using cellular barcoding for monitoring CHO producer pools, we next explored the use of barcoding for assurance of monoclonality. To limit heterogeneity of cell banks and ensure consistent product quality, proof of monoclonality has become an important measure of regulatory-approved antibody manufacturing processes. Genetic barcoding offers the inclusion of a cell-intrinsic nucleotide marker which can be used repeatedly to validate monoclonality and identity at any given stage and time of a given antibody producer cell clone throughout the production process. Similar approaches based on NGS analysis of single nucleotide variants or targeted locus amplification products have been published recently²⁹,³⁰. However, we speculated that the assessment of genetic barcodes at a pre-defined stable locus offers higher sensitivity, i.e. detection of minor subpopulations below 1%, and is not subject to change during cultivation of clonal cell lines.

We cross-validated two monoclonal cell lines by image detection at the single cell cloning stage and subsequent Sanger sequencing of barcodes at day 18 (Supplementary Fig. 2a–c). To test the sensitivity of barcode detection within our workflow, we mixed the two validated barcoded cell clones at different ratios and measured barcode occurrence via deep sequencing (> 36 × 10⁶ reads). To discriminate genuine barcodes from background introduced by sequencing errors, we included an unbiased knee-point filter method and reliably detected clonal cross-contamination in mixtures at ratios of 1:10–1:1000 (Supplementary Fig. 3a). In addition, we could detect 3, 5, and 17 different monoclonal cell lines in a defined pool (Supplementary Fig. 3b).
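The text names a knee-point filter but does not describe its implementation. The sketch below shows one common geometric variant as an illustration only: barcodes are ranked by read count, both axes are log-transformed and normalized, and the knee is the rank lying furthest below the straight line joining the first and last points. The function name and the read counts are hypothetical and are not the authors' method or data.

```python
import math


def genuine_barcode_count(read_counts):
    """Estimate how many barcodes lie above the knee of the rank/count curve.

    Assumption: genuine barcodes form the head of the ranked curve and
    sequencing-error barcodes form a long low-count tail.
    """
    counts = sorted((c for c in read_counts if c > 0), reverse=True)
    if len(counts) < 3:
        return len(counts)  # too few points to define a knee

    xs = [math.log10(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log10(c) for c in counts]
    x_span = xs[-1] - xs[0]
    y_span = ys[0] - ys[-1]
    if y_span == 0:
        return len(counts)  # flat curve: nothing to filter

    knee_index, largest_gap = 0, 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        x_norm = (x - xs[0]) / x_span
        y_norm = (y - ys[-1]) / y_span
        # Vertical distance of the point below the chord from (0, 1) to (1, 0).
        gap = (1.0 - x_norm) - y_norm
        if gap > largest_gap:
            knee_index, largest_gap = i, gap

    # The knee is the first background barcode; everything ranked before it is kept.
    return knee_index if knee_index > 0 else len(counts)


# Hypothetical read counts: a 1:1000 mixture of two clones plus error barcodes.
mixture = [999000, 1000, 12, 8, 5, 3, 2, 1, 1, 1]
# Hypothetical read counts: a monoclonal sample plus error barcodes.
monoclonal = [1000000, 15, 9, 6, 4, 3, 2, 1, 1, 1]

print(genuine_barcode_count(mixture))
print(genuine_barcode_count(monoclonal))
```

Working on log-transformed ranks keeps a long tail of low-count error barcodes from dominating the geometry. A production filter would need to be validated against defined mixtures such as those described above.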

Monoclonality is traditionally validated by microscopy after limited dilution to achieve a single cell per well based on the Poisson distribution³¹. However, limited dilution (LD) is inherently inefficient, with most wells either empty or containing more than one cell. To determine whether genetic barcoding can improve the single cell cloning process, we compared the number of clones detected by traditional image detection and manual inspection with the number of clones detected by cellular barcoding. First, we mimicked a traditional single cell cloning process by limited dilution using a Poisson parameter λ = 0.6 (Fig. 3a). We found that for wells that were classified as monoclonal by traditional image detection, barcoding confirmed the presence of a single barcode in all observed cases (Fig. 3b). Notably, image detection overestimated the number of clones per well by ~ 60% as compared to barcode detection (Fig. 3b). We hypothesized that the number of clones is overestimated by image detection because of poor outgrowth rates during limited dilution.
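To illustrate why limited dilution is inefficient, the well occupancy expected for λ = 0.6 can be computed directly from the Poisson distribution. This is a textbook calculation under the assumption that cells are distributed across wells independently. It ignores cell survival, and the variable names are illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing exactly k cells in a well for Poisson mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 0.6  # Poisson parameter used for limited dilution in the text

p_empty = poisson_pmf(0, lam)
p_single = poisson_pmf(1, lam)
p_multi = 1.0 - p_empty - p_single

# Among wells that received at least one cell, the share with exactly one.
p_single_given_occupied = p_single / (1.0 - p_empty)

print(f"empty wells:           {p_empty:.1%}")
print(f"single-cell wells:     {p_single:.1%}")
print(f"wells with >= 2 cells: {p_multi:.1%}")
print(f"single | occupied:     {p_single_given_occupied:.1%}")
```

At this seeding density only about a third of the wells are expected to receive exactly one cell, and more than half are expected to be empty.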

Therefore, for wells with 2 cells, we inspected consecutive images of the wells on d2 after seeding. Notably, we frequently observed only 1 cell with distinct cell division events (Fig. 3c,d, top 3 panels). In one case, we observed cell divisions of both cells, and in another case a potential cell division event (Fig. 3c,d, bottom 2 panels). If only a single cell survives and gives rise to a new clonal population, we should observe a single genuine barcode. We analyzed the new potential clonal populations by deep sequencing at day 18 after seeding. Indeed, for clones where we previously identified only one cell survivor, only a single genuine barcode was detected in the population (Fig. 3e, top 3 panels). In contrast, for 2 cell survivors, 2 distinct barcodes were detected (Fig. 3e, bottom 2 panels). The barcode analyses also confirmed the presence of only 1 genuine barcode for the potential cell division event.

Intrigued by the possibility of redefining assurance of clonal derivation by a cell-intrinsic feature, we calculated the probability of clonality (PoC) when replacing imaging evidence with barcode analysis. First, we assessed project-specific survival statistics, which represent the best approximation of PoC in the absence of imaging and method-validation studies (α = 0.372, based on 1552 wells with confluence > 10% at day 18 out of a total of 7767 plated wells)³¹. Next, based on the known relative frequency of barcodes at the time of limited dilution (Table S2), we estimated a “worst-case” probability for barcode collisions for all cases of k > 1 cells per well. Finally, multiplying the probability of k > 1 cells in one well by the probabilities that (i) at least two barcodes collide and (ii) both cells survive and form colonies yielded a PoC of 99.63%.

Collectively, these data indicate that barcode detection not only confirms the results of monoclonality assessment via image detection during limited dilution, but also outperforms imaging evidence for the assessment of PoC. Notably, imaging evidence overestimates the number of clones because of non-proliferating and duplet cells, while barcoding only counts viable monoclonal populations. Thus, NGS-derived cellular barcode readouts represent an improved imaging-independent monoclonality assessment method for CHO producer cell lines, offering a very high PoC (> 99.5%) by analysis of a cell-intrinsic feature and project-specific survival statistics³¹. In addition, our barcode method enables the option to revisit cell line identity (i.e. exclude clone mix-ups) and integrity (i.e. clone cross-contamination) at any given stage and time during the CLD process.
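The survival statistic α = 0.372 reported above is not the raw ratio 1552/7767 (about 0.20). One reading under which the numbers agree is a thinned Poisson model: if cells are seeded at Poisson mean λ and each cell independently grows out with probability α, the fraction of wells with a colony is 1 − exp(−λα). The sketch below solves this for α, assuming that the λ = 0.6 seeding density applies to these plates. That assumption and the model itself are an interpretation, not a procedure stated in the text.

```python
import math

wells_with_colony = 1552   # wells with confluence > 10% at day 18 (from the text)
wells_plated = 7767        # total plated wells (from the text)
lam = 0.6                  # assumed seeding density (Poisson parameter from the text)

# Observed fraction of wells that gave rise to a colony.
colony_fraction = wells_with_colony / wells_plated

# Thinned Poisson: P(colony) = 1 - exp(-lam * alpha)  =>  solve for alpha.
alpha = -math.log(1.0 - colony_fraction) / lam

print(f"fraction of wells with a colony: {colony_fraction:.4f}")
print(f"per-cell survival probability alpha: {alpha:.3f}")
```

Under this model the calculation gives α ≈ 0.372, matching the reported value.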

Cells originating from individual RMCE events share cellular phenotypes

Despite exclusion of position effects and copy number variation by using targeted integration technologies (e.g. RMCE-based), cell clones generated from stable expression pools display a relatively high variability of production-relevant readouts such as volumetric titer, metabolite profile and growth rates¹⁵. The described genetic barcoding method allows us to trace clonal CHO lineages from the time point of transfection onwards. Importantly, this allows discrimination between related cell clones originating from the same TI event but derived from different single cell cloning events (“sibling clones” that share the same barcode sequence and arose from a cell duplication event in the CHO pool after transfection) and those from different TI events (“relative clones” with different barcodes) (Fig. 4a).

To test whether the phenotypic variability in cell clones is a stochastic event or whether it was predetermined, we generated cell clones from one barcoded CHO producer pool. Cell clones were selected randomly during limited dilution with a confluence threshold of 10% at day 12 in the 96-well plate. The composition of barcodes within all tested clones in the ambr15 stage was comparable to the barcode composition within the originating cell pool (Fig. 4b). Notably, frequent and rare barcode variants (from the original pool) were present in the final clonal populations. Cell clones were then tested for production-relevant markers using a downscale micro bioreactor system (ambr15).

Remarkably, cell clones clustered partially based on antibody chain transcript levels at day 10 of the ambr15 fed batch run (Fig. 4c). We speculated that cells originating from the same TI events (“sibling clones”, same barcode) may show less phenotypic variance than cells from distinct TI events (“relative clones”, unique barcodes). To holistically compare phenotypic distance between clones, we next compared pairs of absolute differences within all measured phenotypic data points. We selected 34 “sibling clones” (3 or more barcode occurrences) and 33 “relative clones” and observed a significantly lower variance in the group of “sibling clones” than in the group of “relative clones” for secreted antibody chain fragments (Fig. 4d), cellular features (Fig. 4e), and metabolite consumption (Fig. 4f). A list of all tested parameters which were statistically significant is provided (Table S1). The lower phenotypic distance was also apparent when we compared product titer with overall product quality (main peak CE-SDS), where we observed clusters of sibling clones (Fig. 4g). In agreement with our previous results, bulk transcriptomic profiling during the ambr15 fed batch revealed few overall differences. PCA of gene expression between clones displayed low variation, with PC1 explaining 14.24% and PC2 9.77% of variation (Fig. 4h). Notably, antibody chain expression was dominant in the component loading of PC1 and PC2.
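The pairwise comparison described above can be sketched as follows. The example assumes that each clone is represented by a barcode label and a single phenotypic measurement, treats pairs with the same barcode as sibling pairs, and treats pairs with different barcodes as relative pairs. The exact grouping and statistics used in the study are not specified here, and the clone values are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def pairwise_abs_differences(clones):
    """Split pairwise absolute differences into same-barcode and different-barcode pairs.

    clones: list of (barcode, measurement) tuples, one entry per clone.
    """
    sibling_diffs, relative_diffs = [], []
    for (barcode_a, value_a), (barcode_b, value_b) in combinations(clones, 2):
        diff = abs(value_a - value_b)
        if barcode_a == barcode_b:
            sibling_diffs.append(diff)   # same TI event
        else:
            relative_diffs.append(diff)  # distinct TI events
    return sibling_diffs, relative_diffs


# Hypothetical titer values (arbitrary units) for five clones and three barcodes.
toy_clones = [
    ("BC1", 2.0),
    ("BC1", 2.2),
    ("BC1", 2.1),
    ("BC2", 3.5),
    ("BC3", 1.0),
]

sibling_diffs, relative_diffs = pairwise_abs_differences(toy_clones)
print(f"mean sibling difference:  {mean(sibling_diffs):.3f}")
print(f"mean relative difference: {mean(relative_diffs):.3f}")
```

In practice the two distributions would be compared with a statistical test for each measured parameter, which is what the list of significant parameters in Table S1 summarizes.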

The lower variance observed within clones sharing the same barcodes (“sibling clones”) raises the question of how phenotypic variance is generated within the cell line generation process. A recent study by Weinguny and colleagues points toward the single cell cloning process, where a distinct DNA methylation pattern emerged in each clone³². We therefore asked whether the TI event could influence the epigenetic landscape in a similar way and analyzed the genome-wide methylation profile of 12 clones (6 “sibling clones” with the same barcode, 6 “relative clones”). Indeed, “sibling clones” cluster more closely than “relative clones” (Suppl. Fig. 4a–d). In the analyzed subset, most of the differential methylation occurs in intergenic regions and in regions which could not be mapped to defined chromosomes (Suppl. Fig. 4b,c).

Collectively, the data indicate that the majority of observed phenotypic diversity is pre-existing and cell-intrinsic. While some diversity remains within cells sharing the same barcode (“sibling clones”), the majority of phenotypic diversity is explained by the common origin of cells arising from the identical TI event.

Clonal diversity control

The increased diversity of cells from distinct TI events implies that we can utilize barcodes as a proxy for cellular diversity during the cell line development process. First, we integrated barcode assessment during hit-picking in the limited dilution process and could therefore monitor the cellular origin of clones during the CLD workflow. Second, we designed one group with enriched diversity, i.e. containing only unique barcodes (“relative clones”), and a second group with decreased diversity, i.e. with many “sibling clones” sharing the identical barcode (Fig. 5a). We hypothesized that the group with enriched barcode diversity would show a higher degree of phenotypic variance than the group with decreased barcode diversity.

We evaluated the cellular diversity of cell clones in ambr15 bioreactors during a 14 day fed batch production. Intriguingly and in line with our previous results, key phenotypic parameters such as volumetric titer values, glutamate consumption and viable cell density showed substantially increased variability in the enriched diversity group (Fig. 5b). To identify the best performing clone in a population, the screening burden increases with the degree of diversity. Thus, we next simulated the maximum observed titer when systematically sampling different numbers of cell clones. Specifically, we randomly picked n previously measured titer values from cell clones three times independently and plotted the screening depth (increasing number n of clones) against the maximum titer achieved within each subset. This allowed us to estimate whether the increased phenotypic diversity would increase or decrease the number of clones necessary to include the top producer clones.
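The subsampling procedure described above can be sketched as a small simulation. For each screening depth n, the code draws n titer values without replacement three times and records the mean and standard deviation of the maximum. The titer values, the fixed seed and the function name are hypothetical, and drawing without replacement is an assumption about how the random picks were made.

```python
import random
from statistics import mean, stdev


def max_titer_by_depth(titers, repeats=3, seed=0):
    """Mean and SD of the maximum titer found at each screening depth n.

    titers: list of previously measured titer values, one per clone.
    repeats: number of independent random draws per depth (three in the text).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    curve = {}
    for n in range(1, len(titers) + 1):
        maxima = [max(rng.sample(titers, n)) for _ in range(repeats)]
        curve[n] = (mean(maxima), stdev(maxima))
    return curve


# Hypothetical titers (g/L) for eight clones, for illustration only.
example_titers = [1.2, 2.5, 3.1, 0.8, 4.4, 2.9, 3.7, 1.9]

curve = max_titer_by_depth(example_titers)
for depth, (mean_max, sd_max) in curve.items():
    print(f"n = {depth}: max titer {mean_max:.2f} +/- {sd_max:.2f}")
```

Running the same function on the titers of each group separately gives two curves. The group whose curve reaches its plateau at smaller n, and with a smaller SD, needs fewer clones screened to capture a top producer.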

We observed that despite a lower average titer in the enriched diversity group, the maximum titer was reached when screening a substantially lower number of clones and with higher confidence (lower SD) than in the decreased diversity arm (Fig. 5c).

Together, our data indicate that cellular barcodes can serve as a proxy for cellular diversity and can substantially improve CLD screening workflows by enriching the diversity of production-relevant parameters.

Source: https://www.nature.com/articles/s41598-024-65323-7
