Establishment of SALL4-induced reprogramming system
Previously, we developed a medium known as iCD1, which demonstrated remarkable efficiency in supporting iPSCs reprogramming27. Notably, the addition of BMP4 to iCD1 further supported the OCT4-induced reprogramming10. Building upon these findings and considering the potent role of SALL4 in reprogramming, our hypothesis was that SALL4 alone could reprogram somatic cells into iPSCs when cultivated in a suitable medium. To test this hypothesis, we conducted a compound screening based on iCD1 medium (iCDx) and identified eight molecules that exhibited the capability to drive the reprogramming of MEFs into iPSCs by overexpressing SALL4 through retrovirus infection (Supplementary Fig. 1a–c). Among those compounds, RepSox, an inhibitor of TGF-βR/ALK5, exhibited the most significant effect at 5 μM in concentration (Supplementary Fig.1d). After further optimization, we finally developed a medium, iCD4, which demonstrated effective support for SALL4-mediated iPSCs generation (Fig. 1a, b). During the process of SALL4-induced reprogramming, we observed significant epithelialization on day 4, followed by the appearance of OCT4-GFP+ cells on day 7. By day 10, typical iPSCs colonies were formed at a frequency of approximately 20 colonies per 30,000 cells (Fig. 1b, c). Subsequently, we selected these colonies and maintained them in the KSR-2iLIF medium, where the derived iPSCs exhibited stable passaging and maintained a normal karyotype (Fig. 1d, e). The SALL4-iPSCs demonstrate comparable patterns of pluripotent gene expression to ESCs at both RNA and protein levels (Fig. 1f, g). In addition, transcriptome profiling analysis (Fig. 1h and Supplementary Fig. 1e) confirmed the resemblance of SALL4-iPSCs to ESCs. Subsequent experiments involving teratoma formation (Fig. 1i) and chimeric mouse generation with germline transmission capability (Fig. 1j) further validated the pluripotent nature of SALL4-iPSCs. 
Furthermore, we obtained OCT4-GFP+ cells using mouse tail tip fibroblasts (TTFs) as starting cells, although these cells failed to develop into stable iPSCs lines (Supplementary Fig. 1f). These findings collectively demonstrate that SALL4 alone can induce the generation of iPSCs under iCD4 conditions.
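As a rough illustration of the colony frequency reported above (approximately 20 colonies per 30,000 cells), the count can be expressed as a percentage. The function name is hypothetical and the calculation is a simple ratio, not a description of how the authors scored colonies.

```python
def reprogramming_efficiency(colonies: int, cells_seeded: int) -> float:
    """Return the number of colonies per seeded cell, as a percentage."""
    if cells_seeded <= 0:
        raise ValueError("cells_seeded must be positive")
    return 100.0 * colonies / cells_seeded


# Values taken from the text: ~20 colonies from 30,000 starting cells.
efficiency = reprogramming_efficiency(20, 30_000)
print(f"{efficiency:.3f}%")  # about 0.067%
```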
To investigate the contribution of the main components of iCD4 to SALL4-induced iPSCs generation, we performed dropout experiments and measured the effect of removing each indicated component. The results revealed that all the components were required for successful SALL4-iPSCs induction. Within the iCD4 medium, Vc, Chir99021, SGC0946, RepSox, and the cytokine bFGF emerged as particularly crucial; the absence of bFGF led to low cytoactivity in MEFs (Fig. 1k and Supplementary Fig. 1g). Moreover, to test whether the compounds that enhance SALL4-driven reprogramming also affect OKS reprogramming, we performed OKS-induced reprogramming in iCD4 medium lacking RepSox, supplemented with each of the eight molecules individually. RepSox slightly inhibited OKS reprogramming, whereas the other compounds showed no significant effect (Supplementary Fig. 1h, i). Notably, an inhibitory effect on reprogramming was observed using OKS + SALL4 under iCD4 conditions (Supplementary Fig. 1j, k). This suggests that these compounds and genes play different roles in different reprogramming methods.
SALL4 possesses both the DNA-binding domain and NuRD recruitment domain, which may be critical for its functions. To assess the significance of SALL4’s DNA-binding ability in reprogramming, we created three distinct mutants of SALL4, namely ΔZFC1 (deletion of zinc finger domains cluster 1), ΔZFC2 (deletion of ZFC2), and ΔZFC3 (deletion of ZFC3) (Supplementary Fig. 2a, b). Functional experiments conducted with these mutants revealed their inability to generate iPSCs colonies (Supplementary Fig. 2c). These findings suggest that SALL4’s DNA-binding ability may play a crucial role in mediating the reprogramming process. In addition, earlier studies have indicated that SALL4 recruits a transcriptional repressor, the NuRD complex, to facilitate JGES (Jdp2, Glis1, Esrrb, and Sall4)-mediated reprogramming by targeting specific somatic loci28. However, whether this function is also pertinent in SALL4-driven reprogramming alone remains unknown. To explore this role within the process, we disrupted the NuRD recruitment function by deleting the N-terminal NuRD recruitment domain(ΔN12) of SALL4 (Supplementary Fig. 2a, b). The IP-MS experiment confirmed the defect in the NuRD recruitment ability of the SALL4-ΔN12 mutant (Supplementary Fig. 2d–f). In the reprogramming experiment, there was an acceleration in the emergence of OCT4-GFP-positive cells during SALL4-ΔN12-driven reprogramming (Supplementary Fig. 2c). However, our further experiments revealed defects in the ability to generate a stable iPSCs cell line with these OCT4-GFP-positive cells, as most of the picked GFP-positive cells failed to grow and passage (Supplementary Fig. 2g–j). These results suggest that the NuRD recruitment function of SALL4 may be important for iPSCs formation.
Our subsequent aim was to identify the reprogramming intermediates during SALL4-driven reprogramming. For this, we conducted a time-course FACS analysis utilizing previously reported cell surface markers (Thy1 and Epcam) associated with OKSM-reprogramming intermediates29. The results unveiled a gradual rise in a cluster of THY1–/EPCAM+ cells during reprogramming, with nearly all OCT4-GFP+ cells being EPCAM positive (Supplementary Fig. 3a, b). Subsequently, we isolated these cells at day 7 and induced them using iCD4 medium while keeping unsorted cells as the control group. After a 4-day induction, we observed OCT4-GFP positive cells in both the control group and the THY1–/EPCAM+ cluster. Notably, the THY1–/EPCAM+ cluster demonstrates a relatively higher efficiency in inducing OCT4-GFP+ cells compared to other clusters, although these OCT4-GFP+ cells exhibit limited proliferative capacity in KSR-2iLIF medium (Supplementary Fig. 3c, d). This suggests that the THY1–/EPCAM+ cluster represents the reprogramming intermediates in SALL4-driven reprogramming.
The transcriptome dynamics for SALL4-induced reprogramming
To further investigate the molecular mechanism underlying SALL4-induced reprogramming, we conducted RNA-seq analysis at four time points (Day 0, Day 4, Day 7, Day 10) during the reprogramming process mediated by SALL4 (referred to as the SALL4 system) or DsRed (referred to as the DsRed system). We included RNA-seq data from ESCs, MEFs, and SALL4-iPSCs as controls (Supplementary Fig. 4a). The PCA plot shows that the reprogramming path in the DsRed system diverged from ESCs, whereas the SALL4 system gradually approached ESCs (Supplementary Fig. 4b). To identify the genes regulated by SALL4 during iPSCs induction, we analyzed differentially expressed genes in the SALL4 system. Using the DsRed system as a reference, we categorized gene changes into two major groups: genes specifically upregulated by SALL4 (C1-C3) and genes specifically downregulated by SALL4 (C4-C6) (Supplementary Fig. 4c). The gradual convergence of gene expression levels in the SALL4 system towards those of ESCs suggests that these genes may play a role in promoting SALL4-mediated reprogramming (Supplementary Fig. 4c). We then conducted GO analysis for the C1 and C6 subgroups to gain insights into the biological processes involving these genes. The C1 subgroup, notably upregulated by SALL4 in comparison to the DsRed system, is linked with biological processes crucial in reprogramming. These processes include epithelial cell morphogenesis, maintenance of stem cell populations, and specification of embryonic patterns (Supplementary Fig. 4d). In contrast, the C6 subgroup is associated with processes related to organ differentiation, including the inflammatory response, nervous system development, lung development, and heart development (Supplementary Fig. 4e). The proper regulation of these biological processes is likely crucial for the transition to pluripotency during SALL4-induced iPSCs generation. 
We further conducted GO analysis on subgroups C2, C3, C4, and C5, revealing their enrichment in processes related to the cell cycle and immune system development (Supplementary Fig. 4f, g).
The chromatin binding dynamics of SALL4 during SALL4-mediated reprogramming
SALL4 functions as a nuclear transcription factor that interacts with enhancers and promoters, to regulate transcriptional changes during early embryonic development13,30,31,32. Our mutant experiments demonstrated the importance of SALL4’s zinc finger domain in inducing iPSCs reprogramming (Supplementary Fig. 2c). This suggests that the DNA-binding ability of SALL4 may have effects on reprogramming. Therefore, we aimed to investigate how SALL4 regulates reprogramming by binding to specific genomic loci. To obtain DNA binding data of exogenous SALL4, we performed Cut&Tag using the Flag-tagged SALL4 or SALL4-mutants overexpressed cells (overexpressed by retroviral infection) during the iPSCs induction process, respectively (Fig. 2a and Supplementary Fig. 5a, b). We initially analyzed the genomic distribution of SALL4 binding peaks and observed that only 25% of these peaks were located in promoter regions, while the majority were found in distal intergenic regions and introns (Fig. 2b). This suggests that SALL4 may regulate gene expression not only by binding to gene promoter regions but also by binding to enhancers or silencers in distal intergenic regions and introns. Subsequently, we conducted a Gene Ontology (GO) analysis for these peaks. The outcomes revealed that the genes bound by SALL4 encompass reprogramming-related biological processes, including chromatin remodeling, epithelial cell proliferation, and the maintenance of stem cell populations (Fig. 2c). In addition, we performed a comparison of the binding peaks between SALL4-WT and the mutants, defining the alterations in binding peaks caused by the SALL4 mutants (Supplementary Fig. 5b, c). To identify the genes regulated by SALL4, we compared the genes annotated by SALL4 binding peaks with the genes specifically up / down-regulated in C1 and C6 subgroups. 
The Venn diagram revealed that 1485 genes were both occupied by SALL4 and exhibited changes in transcription levels, with 507 genes upregulated and 978 genes downregulated (Fig. 2d). GO analysis of these genes showed that upregulated genes were involved in functions such as stem cell population maintenance and epithelial cell development (Fig. 2d). Downregulated genes, on the other hand, were associated with angiogenesis and synapse organization (Fig. 2d).
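The comparison between SALL4-bound genes and the SALL4-regulated subgroups amounts to set intersections. The sketch below is a minimal illustration, assuming gene lists are available as plain collections of gene symbols; the function name and the toy gene assignments are hypothetical and do not reproduce the actual data.

```python
def bound_and_regulated(bound_genes, upregulated, downregulated):
    """Intersect genes annotated to binding peaks with up/downregulated gene sets."""
    bound = set(bound_genes)
    bound_up = bound & set(upregulated)
    bound_down = bound & set(downregulated)
    return bound_up, bound_down


# Toy input: the group each symbol is placed in is illustrative only.
peak_genes = ["Esrrb", "Tfap2c", "Cdh1", "Jun", "Fosl1", "Actb"]
c1_up = ["Esrrb", "Tfap2c", "Cdh1", "Lin28a"]
c6_down = ["Jun", "Fosl1", "Elk3"]

bound_up, bound_down = bound_and_regulated(peak_genes, c1_up, c6_down)
print(sorted(bound_up))    # ['Cdh1', 'Esrrb', 'Tfap2c']
print(sorted(bound_down))  # ['Fosl1', 'Jun']

# Consistency check on the counts reported in the text.
assert 507 + 978 == 1485
```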
To further characterize the binding site of SALL4, we performed motif enrichment analysis, revealing that the top seven enriched motifs shared a common TGACTCA sequence (Fig. 2e). These putative SALL4 binding sites were similar to those recognized by transcription factors such as FOS, BATF, FRA1, AP-1, JUNB, ATF3, and FRA2, suggesting potential shared downstream target genes between these factors and SALL4. De novo motif analysis also showed enrichment of the TGACTCA sequence in this process (Supplementary Fig. 5d). To understand the role of these motif-related transcription factors (TFs), we initially examined the RNA-seq data during SALL4-driven iPSCs induction. Our findings revealed that Batf, Fos, and Atf3 showed a slight upregulation during this process, although their overall expression levels remained relatively low. In contrast, Junb, Jun, Fosl1 (Fra1), and Fosl2 (Fra2) exhibited high expression levels and were subsequently downregulated by SALL4 during the early stages (D0-D7) of iPSCs induction (Supplementary Fig. 5e). Based on these findings, we hypothesized that the inhibition of these transcription factors (TFs) by SALL4 might promote reprogramming. To test this hypothesis, we conducted overexpression experiments by retroviral infection to counteract the downregulated expression induced by SALL4. The results demonstrated that the overexpression of Junb, Jun, Fosl1/2, Atf3, and Fos suppressed iPSCs generation, aligning with our expectations (Supplementary Fig. 5f). Interestingly, the overexpression of Batf alongside Sall4 improved reprogramming (Supplementary Fig. 5f, g). These results suggest a potential interaction between SALL4 and BATF in facilitating reprogramming. Despite the RNA-seq data indicating the up-regulation of Batf, we were unable to detect this protein during SALL4-driven reprogramming using western blot analysis (Supplementary Fig. 5h). 
Based on these findings, we hypothesize that SALL4 binds to the BATF-related loci and regulates these genes to influence reprogramming efficiency. To explore this, we introduced the BATF-DNA binding region into WT SALL4 (SALL4-BATF-B) and created a variant with zinc finger domain deletions (SALL4 mut1/2/3-BATF-B) to enhance the binding affinity to the BATF motif region (Supplementary Fig. 5i). Reprogramming experiments revealed improved induction efficiency using the SALL4-BATF-B fusion protein. Furthermore, the deficiencies resulting from the deletion of SALL4 ZFC1 or ZFC2 were restored by the addition of the BATF-DNA binding region (Supplementary Fig. 5j, k). These results suggest that SALL4 may bind to BATF motif-related genes to promote reprogramming.
SALL4 binds and regulates chromatin accessibility dynamics through direct and indirect effects to promote iPSCs induction
Chromatin remodeling is an essential event during reprogramming. To explore the chromatin accessibility dynamics (CADs) during SALL4-driven reprogramming, we collected ATAC-seq data from the aforementioned four time points of the DsRed and SALL4 systems (Fig. 3a and Supplementary Fig. 4a). We compared peaks at each locus between MEFs and ESCs, categorizing the peaks into three main groups: closed in MEFs but open in ESCs (CO), open in MEFs but closed in ESCs (OC), and open in both MEFs and ESCs (PO). Following this classification, the CO and OC peaks were further segmented into distinct subgroups (OC1-OC5 and CO1-CO5) based on the timing of transition, illustrating the progression of chromatin opening and closing (Fig. 3a). We found that the numbers of peaks in the OC1, OC3, OC4, OC5, and CO5 subgroups differed between the SALL4 and DsRed systems (Fig. 3b). The higher number of OC1-4 and CO1-4 peaks and the lower number of OC5 and CO5 peaks in the SALL4 system compared to the DsRed system suggest that the addition of SALL4 increases the number of transitions at ESCs-CADs-related peaks during reprogramming (Fig. 3b and Supplementary Fig. 6a). Moreover, the Venn diagram analysis of CO1-4, OC1-4, and PO peaks between the DsRed and SALL4 systems revealed 39,959 specific OC peaks and 5028 specific CO peaks induced by SALL4 (Fig. 3c). We conducted statistical analysis on the distribution of peaks in genomic loci and carried out a Gene Ontology (GO) analysis for each CO and OC subgroup. Remarkably, the OC subgroups displayed enrichment in somatic-related processes, such as synapse organization (Supplementary Fig. 6b–d). This finding suggests that SALL4 primarily regulates the dynamics of open-to-close chromatin accessibility to promote reprogramming. To comprehend the regulatory landscape within the SALL4 system, we conducted motif enrichment analysis for the peaks influenced by SALL4. 
Notably, peaks also observed in the DsRed system were excluded from the SALL4 peak set for this analysis. The results showed a significant enrichment of motifs for key reprogramming factors in the SALL4 system, including ESRRB, TFAP2C, SOX2, and NKX6.1. Conversely, motifs associated with somatic cell characteristics, such as P53, HLF, MEF2D, and HRE, were progressively depleted during the reprogramming process (Fig. 3d). These observations suggest that these factors may play a crucial role in the generation of iPSCs induced by SALL4. Taken together, these findings imply that SALL4 orchestrates the transition of chromatin accessibility dynamics (CADs) toward an embryonic stem cell (ESC) state during the induction of iPSCs.
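The CO/OC/PO grouping described above can be sketched as a rule applied to each peak's accessibility across ordered stages. This is a minimal illustration under stated assumptions: the stage order (hypothetically MEF, D0, D4, D7, D10, ESC), the boolean open/closed calls, and the rule that the subgroup number is the first stage from which a peak stays in its ESC state are all assumptions made for this sketch, since the text does not spell out the exact subgroup definitions.

```python
from typing import Sequence

# Hypothetical stage order; the first entry is the MEF reference, the last is ESC.
STAGES = ["MEF", "D0", "D4", "D7", "D10", "ESC"]


def classify_peak(states: Sequence[bool]) -> str:
    """Classify one peak from per-stage accessibility (True = open)."""
    start, end = states[0], states[-1]
    if start and end:
        return "PO"  # open in both MEFs and ESCs
    if not start and not end:
        return "closed"  # open in neither reference; outside the three groups
    prefix = "OC" if start else "CO"
    # Assumed rule: subgroup index = first stage from which the peak
    # stays in its ESC state through to the end of the series.
    for i in range(1, len(states)):
        if all(s == end for s in states[i:]):
            return f"{prefix}{i}"
    raise AssertionError("unreachable: the last stage always matches itself")


print(classify_peak([False, True, True, True, True, True]))     # CO1
print(classify_peak([True, True, True, False, False, False]))   # OC3
print(classify_peak([False, False, False, False, False, True])) # CO5
print(classify_peak([True] * 6))                                # PO
```

Under this assumed rule, subgroup 5 collects peaks that only reach their ESC state after the last sampled day, which matches the reading that fewer OC5 and CO5 peaks means more transitions completed during reprogramming.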
For a deeper exploration of the relationship between SALL4 binding and chromatin accessibility dynamics, we compared the SALL4 Cut&Tag peaks and ATAC-seq peaks on day 0 of reprogramming. Approximately 35,779 Cut&Tag peaks were found to overlap with ATAC-seq peaks (C2), with some of these regions (belonging to the OC2-4 subgroups) closing during reprogramming (Supplementary Fig. 7a, b). The ATAC-seq peaks that did not overlap with Cut&Tag peaks (C1) were predominantly distributed in OC1, OC2, and CO1 (Supplementary Fig. 7c). We also conducted a Gene Ontology (GO) analysis to elucidate the function of these peaks (Supplementary Fig. 7d). In addition, we examined the direct or indirect effects of SALL4 on the chromatin regions at various time points. Our findings indicate that the majority of SALL4 binding is concentrated within the open-to-close (OC) subclusters. In OC2-4, these chromatin regions are initially occupied by SALL4 and eventually close, suggesting a correlation between CADs and SALL4’s direct binding. Conversely, the OC1 subcluster shows a low level of SALL4 binding, indicating indirect effects of SALL4 regulation. Regarding close-to-open dynamics, the chromatin regions in CO1 exhibit relatively higher direct SALL4 binding and become open at later stages (Supplementary Fig. 7e). These results highlight both the direct and indirect effects of SALL4 in reprogramming.
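The comparison of Cut&Tag and ATAC-seq peaks reduces to an interval-overlap test. The following is a minimal sketch, assuming peaks are given as (chromosome, start, end) tuples with half-open coordinates; the function name and the toy coordinates are hypothetical, and a real analysis would typically rely on a dedicated genomic-interval tool rather than this quadratic scan.

```python
from collections import defaultdict
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open interval


def split_by_overlap(query: List[Peak], reference: List[Peak]):
    """Split query peaks into those overlapping any reference peak and those that do not."""
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits, misses = [], []
    for chrom, start, end in query:
        overlaps = any(start < r_end and r_start < end
                       for r_start, r_end in by_chrom[chrom])
        (hits if overlaps else misses).append((chrom, start, end))
    return hits, misses


# Toy coordinates, for illustration only.
cut_tag = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
atac = [("chr1", 150, 300), ("chr2", 200, 260)]

# Cut&Tag peaks that fall in accessible chromatin (analogous to C2 above).
bound_open, bound_closed = split_by_overlap(cut_tag, atac)
# ATAC-seq peaks without a Cut&Tag peak (analogous to C1 above).
atac_bound, atac_unbound = split_by_overlap(atac, cut_tag)

print(bound_open)    # [('chr1', 100, 200)]
print(atac_unbound)  # [('chr2', 200, 260)]
```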
Combining the analysis of these omics data, we observed that reprogramming-promoting genes such as Oct4, Esrrb, Tfap2c, Rsk1, Lin28a, Tbx3, Kdm2b, Cdh1, Cldn7 and Rcor2 (refs. 21,33,34,35,36,37) were opened in the SALL4 system (Fig. 3e and Supplementary Fig. 8a). Conversely, genes associated with somatic cell characteristics and known reprogramming suppressors, including Elk3, Cdkn2a, Cdkn2b, Fosl1 and Jun (refs. 38,39,40), were closed in the SALL4 system (Fig. 3e and Supplementary Fig. 8b). Corresponding to these chromatin accessibility changes, the RNA expression levels of the reprogramming-promoting genes were upregulated, while those of the somatic cell-related and reprogramming suppressor genes were downregulated (Supplementary Fig. 8c, d). To further investigate the function of SALL4 downstream target genes (Rsk1, Esrrb, Tfap2c) and the ATAC-motif-related genes (Nkx6.1, Esrrb, Tfap2c) in reprogramming (Fig. 3d, e), we performed overexpression and knockdown experiments by retroviral infection. When Rsk1, Esrrb or Nkx6.1 was co-overexpressed with Sall4, a significant improvement in reprogramming efficiency was observed, whereas knockdown of Tfap2c or Esrrb impaired reprogramming (Fig. 3f and Supplementary Fig. 8e). Knockdown of Rsk1 also slightly inhibited reprogramming (Fig. 3f and Supplementary Fig. 8e). These results suggest that the presence of RSK1, ESRRB, TFAP2C and NKX6.1 in SALL4-driven reprogramming could facilitate iPSCs generation. In summary, SALL4 regulates gene expression through direct binding to target genes and through indirect regulation, thereby promoting reprogramming.
SALL4 cooperating with OCT4 improves the iPSCs induction efficiency
OCT4 alone has been identified as a mediator for reprogramming9,10,41, and the activation of endogenous OCT4 is vital for achieving pluripotency42,43,44. Thus, OCT4 plays a critical role in reprogramming. In our study, we investigated the capacity of OCT4 alone and in combination with SALL4 to induce iPSCs under iCD4 medium (Fig. 4a). Our results showed that OCT4 alone could generate iPSCs in iCD4 medium (Fig. 4b, c). Epithelialization was observed on the fourth day of induction in all three conditions (overexpression of Sall4, Oct4, or Sall4 + Oct4 through retrovirus infection). OCT4-GFP+ cells were generated on Day 6 in the Sall4 + Oct4 group, and Sall4 or Oct4 group induced the appearance of OCT4-GFP+ cells on Day7 or Day9, respectively (Fig. 4b). On Day 10, we counted the number of OCT4-GFP+ colonies and found that the efficiency of OCT4-induced reprogramming was lower than that of SALL4 under iCD4 conditions. However, co-overexpression of SALL4 and OCT4 significantly increased the efficiency of somatic cell reprogramming (Fig. 4c). Furthermore, iPSCs obtained from the three methods exhibited stable passaging with pluripotency characteristics (Fig. 4d and Supplementary Fig. 9a–e). Exogenous genes were silenced in OCT4-GFP+ cells, and the transgene genomic integration was confirmed by PCR during iPSCs induction (Supplementary Fig. 9f–h). In addition, using mouse tail fibroblasts as the starting cells, O + S and OCT4 could induce them into OCT4-GFP+ cells (Supplementary Fig. 9i). In summary, our findings indicate that iCD4 can also induce OCT4–mediated reprogramming despite the low efficiency. The co-overexpression of SALL4 and OCT4 significantly improves the efficiency of reprogramming. This suggests that SALL4 can synergistically promote reprogramming with OCT4.
To investigate the molecular mechanism of iPSCs reprogramming mediated by OCT4 and SALL4, we collected transcriptomic data for the SALL4, OCT4 and O + S systems, using the DsRed system as a control (Supplementary Fig. 10a). We first performed PCA, which showed that each system displayed a unique reprogramming pathway, with the O + S system positioned between the OCT4 and SALL4 systems (Supplementary Fig. 10b). Next, we performed gene expression clustering analysis for these three systems. The findings indicated that the activation and silencing of genes during O + S-driven iPSCs generation were more similar to the expression profile of ESCs than those of the other systems. The SALL4 system showed relatively higher similarity to the O + S system than to the OCT4 system (Fig. 4e, f). We also conducted a GO analysis on the upregulated and downregulated genes in each system. Notably, genes in the UC7 cluster, which were upregulated in both the SALL4 and O + S systems, were found to be involved in biological processes such as epithelial cell morphogenesis, spermatogenesis, and stem cell proliferation. The genes within the UC10 cluster, commonly upregulated in both the OCT4 and O + S systems, are associated with biological processes related to neuronal differentiation and development. Conversely, genes in the UC19 cluster, specifically upregulated in the O + S system, are linked to biological processes such as DNA methylation involved in gamete generation, spermatogenesis, and maintenance of stem cell populations (Supplementary Fig. 10c). On the other hand, genes in the DC9 cluster, commonly downregulated in both the SALL4 and O + S systems, were found to be involved in biological processes such as positive regulation of neuronal projection development and nervous system development. 
The genes within the DC12 cluster, commonly downregulated in both the OCT4 and O + S systems, are associated with biological processes related to the negative regulation of neuron apoptotic processes and inflammatory responses. Genes in the DC21 cluster, specifically downregulated in the O + S system, are linked to biological processes such as raft assembly, endocytosis, and the integrin-mediated signaling pathway (Supplementary Fig. 10d). In addition, the genes within the UC15 cluster, specifically upregulated in the OCT4 system, are associated with biological processes related to synapse organization. These genes might impede the reprogramming process and can be suppressed by the addition of SALL4 (Fig. 4e and Supplementary Fig. 10e). Overall, our GO analysis suggests that SALL4 could promote epithelial cell formation and inhibit neuron development-related genes in the O + S system to facilitate iPSCs induction.
Mapping the cell fate transition during OCT4 + SALL4-induced reprogramming by Single-Cell RNA Sequencing
To obtain a more comprehensive understanding of the molecular roadmap associated with SALL4 and SALL4 + OCT4-mediated reprogramming, we performed single-cell RNA sequencing at various time points throughout the SALL4 and SALL4 + OCT4 reprogramming process, specifically collecting samples on days 0, 4, 7, and 10.
We used UMAP plots to visualize cell fate transitions in both reprogramming systems, revealing significant changes from day 0 to day 4 (Supplementary Fig. 11a, b). The lower number of Nanog-positive cells in the SALL4 system compared to the O + S system on day 10 further substantiated the cooperative effect of SALL4 and OCT4 during reprogramming. Notably, iPSCs clustered more closely with ESCs than the day 10 Nanog-positive cells did in both systems (Supplementary Fig. 11a–d). This suggests that although most Nanog-positive cells that emerge at day 10 may not yet be fully mature iPSCs, they can mature when cultured in ESCs maintenance medium. Furthermore, we observed the upregulation or downregulation of SALL4-regulated reprogramming-promoting and barrier genes in distinct cell subpopulations during both reprogramming processes (Supplementary Fig. 11c, d).
To further elucidate the trajectory of cellular differentiation during reprogramming, we performed monocle trajectory analysis on days 0, 4, 7, 10 and iPSCs in both systems (Supplementary Fig. 11e, f). The results revealed the emergence of two distinct developmental branches during the reprogramming process, which were not readily discernible in UMAP plotting. We characterized one branch as likely to achieve pluripotency potential (pluripotency branch), based on its alignment with iPSCs-reprogramming directions (Supplementary Fig. 11e, f). Importantly, cells within the pluripotency branch in the O + S system exhibited a more uniform distribution compared to those in the SALL4 system, suggesting that SALL4 and OCT4 collaboratively induce a state of cellular plasticity conducive to acquiring pluripotency more efficiently than SALL4 alone.
In addition, to distinguish differences in reprogramming intermediates across the various systems, we conducted a comparative analysis of differential gene expression for THY1–/EPCAM+ cells within the SALL4, O + S, and OKS systems. This analysis revealed variations in transcriptional regulation across the different reprogramming processes, as indicated by both the number of differentially expressed genes and the functions annotated through GO analysis (Supplementary Fig. 12a–g).
SALL4 activates Esrrb, Rsk1, and Tfap2c in O + S-mediated iPSCs reprogramming to facilitate induction efficiency
To gain insights into the changes in chromatin accessibility mediated by OCT4 and SALL4 during reprogramming, we compared the ATAC-seq data of the SALL4 system, OCT4 system, and SALL4 + OCT4 system (O + S system) at Day 0, 4, 7, and 10 (Supplementary Fig. 10a). Using CO-OC analysis, we categorized all peaks of the O + S system into 11 groups based on the transition from open to closed or from closed to open. These groups included persistent open (PO), open to closed (OC1-5), and closed to open (CO1-5) (Fig. 5a). Comparing these systems, we found that the O + S system is more effective than the SALL4 and OCT4 systems in regulating correct chromatin opening and closing, with the SALL4 system outperforming the OCT4 system (Fig. 5a). This is consistent with the comparison of their induction efficiencies. Subsequently, we conducted motif enrichment analysis in the three reprogramming systems. The results revealed an increase in the abundance of binding motifs for transcription factors known to promote reprogramming, such as OCT4, SOX2, NANOG, TFAP2C, and ESRRB, in the O + S system. Conversely, motifs associated with inhibiting reprogramming, such as CLOCK and AP1, showed a decrease in abundance (Fig. 5b).
Combining the analyses of the SALL4, OCT4, and O + S systems, we proposed six patterns in which OCT4 and SALL4 cooperatively regulate CADs in the O + S system and defined the peak numbers of each pattern (Supplementary Fig. 13a). These patterns are reflected in chromatin accessibility as follows: (1) Common open in O + S system and SALL4 system (O + S/S4-C-O), (2) Common close in O + S system and SALL4 system (O + S/S4-C-C), (3) Common open in O + S system and OCT4 system (O + S/O4-C-O), (4) Common close in O + S system and OCT4 system (O + S/O4-C-C), (5) only open in O + S system (O + S-S-O), and (6) only close in O + S system (O + S-S-C) (Supplementary Fig. 13a). We further categorized the genes associated with these CADs patterns based on transcription levels using RNA-seq data and revealed numerous changes in gene expression that fit these types of synergistic modes (Fig. 5c, d and Supplementary Fig. 13b–e). We hypothesize that OCT4 and SALL4 improve iPSCs induction efficiency through their regulation of genes within these patterns, where patterns 1, 3, and 5 likely contain genes that promote reprogramming in the O + S system, while patterns 2, 4, and 6 may contain genes that hinder reprogramming. To support this hypothesis, we selected genes based on expression levels, peak enrichment, and functional relevance. Representative potential reprogramming-promoting genes that have been identified include Esrrb, Tfap2c, Rsk1, and Sox2 for their roles in facilitating the induction of iPSCs in the SALL4 and OKS systems. Conversely, genes such as Mndal, Mogat2, and Sbsn were identified as potential barriers to reprogramming due to their somatic-related functions and high levels of expression and peak enrichment (Fig. 5c, d and Supplementary Fig. 13b–e).
We next explored and compared the effects of these representative genes on reprogramming during iPSCs induction. Overexpression of Esrrb and Rsk1 through retroviral infection significantly promoted the generation efficiency of OCT4-GFP+ colonies in the OCT4, SALL4, and O + S systems. Conversely, Mogat2 and Sbsn impaired iPSCs induction in these three systems, while Mndal exerted an inhibitory effect in the O + S system (Fig. 5e and Supplementary Fig. 13f). Overexpression of Tfap2c or Sox2 by retroviral infection promoted the generation efficiency of OCT4-GFP+ colonies in the OCT4 and O + S systems but had an inhibitory effect on the induction efficiency of the SALL4 system (Fig. 5e).
In summary, OCT4 and SALL4 collaboratively activate genes like Esrrb, Rsk1, Tfap2c, and Sox2 to enhance iPSCs induction and synergistically repress genes such as Mndal, Mogat2, and Sbsn that act as reprogramming barriers within the O + S system. This dual regulation of promotive and inhibitory genes by OCT4 and SALL4 effectively boosts reprogramming efficiency in the O + S system (Fig. 6a).
The chromatin binding dynamics for OCT4 and SALL4 during O + S-induced reprogramming
To explore the cooperative role of SALL4 and OCT4, we conducted Cut&Tag assays for OCT4 during OCT4/O + S-driven reprogramming and for SALL4 during O + S-driven reprogramming on day 0 (Supplementary Fig. 14a). We initially analyzed the genomic distribution of these Cut&Tag peaks and subsequently performed a Gene Ontology (GO) analysis for genes associated with these peaks. The results demonstrated diverse biological processes regulated by SALL4 and OCT4 (Supplementary Fig. 14b, c). We proceeded to compare the SALL4-binding peaks between the SALL4 system and the O + S system, revealing 19,820 new peaks (C2) and 13,466 disappeared peaks (C1) in the O + S system (Supplementary Fig. 14d). Gene Ontology (GO) analysis identified functional roles of genes within these clusters (Supplementary Fig. 14e). To further study the change of the binding pattern in O + S system, we conducted a comparison of the SALL4-binding peaks and OCT4-binding peaks between the SALL4 system and OCT4 system, revealing approximately 6180 peaks that were commonly bound by both SALL4 and OCT4 (C3) (Supplementary Fig. 14f). SALL4 and OCT4 are theoretically capable of commonly binding to these predicted sites to regulate these regions. In actuality, the number of common binding peaks between SALL4 and OCT4 in O + S-driven reprogramming increased to 9769 peaks (C4). When comparing the predicted sites (C3) with the actual binding sites (C4), the results revealed approximately 2418 predicted peaks disappeared (C5), while 6053 peaks emerged (C6) in O + S-driven reprogramming (Supplementary Fig. 14f). Further Gene Ontology (GO) analysis identified the functional roles of genes annotated by these clusters(C5 and C6) (Supplementary Fig. 14g). 
In addition, the analysis revealed that the proportion of sites near the promoter (≤1 kb) that increased (C6, 32.62%) and decreased (C5, 41.89%) in the O + S system was notably higher than that observed in the single-factor systems (SALL4 system, 21.43%; OCT4 system, 25.06%) (Fig. 2b and Supplementary Fig. 14b, h). This suggests that the synergistic effect of SALL4 and OCT4 in the O + S system is more inclined towards the regulation of promoter regions.
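The promoter-proximal proportions quoted above are percentages of peaks lying within 1 kb of a transcription start site (TSS). A minimal sketch of that calculation follows, assuming each peak has already been annotated with a signed distance to its nearest TSS in base pairs; the function name and the toy distances are hypothetical.

```python
from typing import Sequence


def promoter_proximal_percent(distances_to_tss: Sequence[int], window: int = 1000) -> float:
    """Percentage of peaks whose absolute distance to the nearest TSS is <= window (bp)."""
    if not distances_to_tss:
        raise ValueError("no peaks supplied")
    near = sum(1 for d in distances_to_tss if abs(d) <= window)
    return 100.0 * near / len(distances_to_tss)


# Toy distances (bp, negative = upstream), for illustration only.
toy_distances = [-200, 950, 1000, 1500, -30000, 12000, 400, 80000]
percent_near = promoter_proximal_percent(toy_distances)
print(f"{percent_near:.2f}%")  # 50.00%
```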
In addition, to investigate the relationship between factor binding and chromatin accessibility dynamics, we compared the SALL4 and OCT4 Cut&Tag peaks with ATAC-seq peaks in the O + S system at day 0 (Supplementary Fig. 14i). The results show that most of the SALL4 and OCT4 Cut&Tag peaks overlapped with ATAC-seq peaks in the O + S system, suggesting an important role for this DNA binding. The broader set of regions bound when SALL4 and OCT4 cooperate may also regulate more chromatin accessibility dynamics to promote reprogramming (Supplementary Fig. 14i).
To further investigate the regulatory dynamics of SALL4 and OCT4, we analyzed their binding patterns within the O + S system by comparing the genes associated with binding peaks, SALL4 (or OCT4)-ATAC-CO peaks-related genes, and genes specifically upregulated by SALL4 (or OCT4). We identified gene sets that are specifically elevated and exhibit open chromatin due to SALL4 (C1, comprising 26 genes) or OCT4 (C2, comprising 56 genes) influence (Supplementary Fig. 15a, c). Subsequently, we compared these gene sets with the genes associated with binding peaks in the SALL4, OCT4, and O + S systems, respectively. The results illustrate a reduction in the number of SALL4-binding genes within the C1 gene set in the O + S system compared to the SALL4 system alone, suggesting that OCT4’s addition may alter SALL4’s occupancy landscape within the O + S system (Supplementary Fig. 15a, b). In addition, the number of OCT4-binding genes in the C2 gene set also shows a reduction pattern in O + S systems, whereas the number of SALL4-binding genes was significantly larger, suggesting that SALL4’s binding might be related to the downregulation of these genes (Supplementary Fig. 15c, d). This analysis highlights the complex regulatory interplay between SALL4 and OCT4 in modulating gene expression and chromatin accessibility during cellular reprogramming.
- Source: https://www.nature.com/articles/s41467-024-54924-5