A stepwise mode of TGFβ-SMAD signaling and DNA methylation regulates naïve-to-primed pluripotency and differentiation

Distinct transcriptomes between S2/3DKO and S4KO

To investigate functional contributions of Smad2/3 and Smad4 in mESC, we generated S2/3DKO22 or S4KO using clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9. Clones were selected and verified by sequencing and western blot analysis (Fig. 1A and Supplementary Fig 1A). WT, S2/3DKO and S4KO formed morphologically indistinguishable colonies with comparable alkaline phosphatase (AP) staining, indicating that pluripotency remained unchanged among all of them (Supplementary Fig. 1B). Although mESCs depleted from Smad2/3 or Smad4 showed decreased growth rate in contrast to WT cells, loss of Smad2/3 or Smad4 in mESCs didn’t significantly affect the distribution of cell cycle phases (Supplementary Fig. 1C-E).

Fig. 1: Differential requirements for Smad2/3 and Smad4 during mESC differentiation.
figure 1

A Western blot analysis of Smad2/3 and Smad4 proteins in WT, S2/3DKO, and S4KO cell lines. Tubulin serves as a loading control. B PCA of RNA-seq data generated in WT, S2/3DKO and S4KO in mESC (D0) and EB differentiation 4 days (D4). Samples were prepared in duplicates. C MA plot of RNA-seq datasets of D4 EBs derived from WT, S2/3DKO or S4KO. Genes upregulated (red, log2 fold-change ≥ 1) or downregulated (blue, log2 fold-change ≤ −1) in S2/3DKO or S4KO compared to WT. Two biological replicates for each condition were analyzed. D The up- and down-regulated DEGs were indicated by red and blue bars, respectively. The number of DEGs were marked on the bars. E Heatmap presentation of all DEGs in WT, S2/3DKO or S4KO at D0 or D4 EBs. Genes specifically upregulated only in S2/3DKO (C2) and genes significantly decreased in S2/3DKO (C4) were highlighted. Two biological replicates were analyzed at each condition. F The left dot-line plots showed the average expression patterns of C2 and C4 in (E). Gene Ontology (GO) analysis of genes in C2 and C4 were shown on the right. The top enriched biological processes GO terms with P values were listed. The hypergeometric test (one-side) is used to determine the statistical significance. G Lineage gene expression patterns were analyzed with GSVA of RNA-seq data from D0 to D4 in WT, S2/3DKO and S4KO. ICM, inner cell mass; PrE, primitive endoderm; PreEpi, pre-epiblast; PostEpi, post-epiblast; PS, primitive streak; End, endoderm; Mes, mesoderm; Ect, ectoderm; VE, visceral endoderm. H Heatmap showed epiblast-associated genes in mESC (D0) and D4 EBs of WT, S2/3DKO and S4KO. The color bar represents z-score normalized gene expression. I Heatmap of PS or mesendoderm genes in mESC (D0) and D4 EBs of WT, S2/3DKO and S4KO. The color bar represents z-score normalized gene expression. J qRT-PCR analysis of lineage marker genes expression in mESC (D0), EB D3 and D4 of WT, S2/3DKO and S4KO. The gene expression level is normalized to D0 WT samples. This experiment was repeated 3 times, and representative data are shown here. Data are presented as mean values +/- SD.

Next, transcriptomic profiling of WT, S2/3DKO and S4KO either in mESC condition (day 0, D0) or at day 4 (D4) EBs were carried out using RNA-seq. Principal component analysis (PCA) revealed that the transcriptome of S2/3DKO exhibited striking differences compared to WT and S4KO (Fig. 1B), reminiscent of the differential requirements for Smad2/3 and Smad4 during TGFβ-dependent patterning of early mouse embryos5. 784 genes (469 down and 315 up) at D0 and 1939 genes (913 down and 1026 up) at D4 were identified with significantly changed expression levels (Fold change > 2, P < 0.01) in S2/3DKO compared to WT ESCs. In contrast, only 339 genes (143 down and 196 up) at D0 and 254 genes (75 down and 179 up) at D4 were changed in S4KO compared to WT (Fig. 1C, D and Supplementary Fig. 1F). In D4 EBs, there was more than a 7-fold difference in differentially expressed genes (DEGs) between S2/3DKO and S4KO, although both Smad2/3 and Smad4 are assumed to be equally important in TGFβ signaling transduction.

We wondered which types of genes were differentially affected by the loss of Smad2/3 or Smad4. As shown on the heatmap, these DEGs were divided into 7 clusters (C1-C7) that were further analyzed based on their average expression patterns and by Gene Ontology (GO) analysis (Fig. 1E, F, Supplementary Fig. 1G-H). In C2, genes related to extra-embryonic development, such as placenta development, were upregulated in Smad2/3-deficient EBs compared to WT and S4KO, which was consistent with previous report21 (Fig. 1F, upper panel). As a co-Smad, Smad4 is not only required for TGFβ-Smad2/3 transduction, but also contributing to BMP signaling23. Bmp target genes Id1/2/3 were upregulated in S2/3DKO21, but not in WT and S4KO (Supplementary Fig. 1I). As a result, we postulated that the differential BMP signaling is the reason for the varying activation levels of extra-embryonic genes in S2/3DKO, WT and S4KO. Genes in C4, which are involved in embryonic patterning and development, still increased from D0 to D4 in Smad4-null EBs, albeit to a lesser extent than in WT. But they were almost completely blocked in S2/3DKO EBs (Fig. 1F, lower panel). These divergent phenotypes in C4 raised the possibility that Smad2/3 might regulate TGFβ targets genes independently of Smad4.

Differential requirements for Smad2/3 and Smad4 during mESC differentiation

To identify the specific lineage that was differentially regulated by Smad2/3 or Smad4 during EB differentiation, we conducted a gene set variation analysis (GSVA) to compare our RNA-seq data with published single-cell transcriptome data of EBs24,25 (Fig. 1G). According to the GSVA score, S2/3DKO and S4KO exhibited the most significant differences in the determination of primitive endoderm (PrE, also referred to as hypoblast) and post-implantation epiblast (PostEpi) (Fig. 1G). In line with an earlier study21 and our findings, S2/3DKO promoted the formation of PrE, a predominantly extra-embryonic tissue responsible for giving rise to the yolk sac, which may be associated with the upregulated BMP signaling in S2/3DKO EBs (Fig. 1G and Supplementary Fig. 1I-J). Importantly, genes associated with PostEpi were significantly repressed in S2/3DKO EBs. Conversely, the expression levels of these genes were either unaffected or mildly altered in S4KO compared to WT (Fig. 1H). Arising within the epiblast at the posterior of the embryo, the primitive streak (PS) is established at the onset of gastrulation and will later give rise to the mesendoderm20. In contrast to the differential effect of Smad2/3 and Smad4 in epiblast genes, PS/mesendoderm marker genes were abrogated in both S2/3DKO and S4KO (Fig. 1I).

This observation was further verified by quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) (Fig. 1J). Pluripotent genes like Esrrb and Oct4, decreased across all samples upon LIF removal (Fig. 1J). The expression of mesendoderm genes, such as Eomes and Wnt3, were completely inhibited in both S2/3DKO and S4KO EBs in contrast to WT (Fig. 1J). This result confirmed the requirement for both Smad2/3 and Smad4 for the formation of PS and mesendoderm, which follows the primed epiblast stage20. Remarkably, the well-known regulators of PostEpi, such as Fgf5, Brachyury (T), Dnmt3a/3b, and Sox3/426,27,28,29,30, increased upon differentiation in WT and S4KO (Fig. 1J). However, these genes were repressed in S2/3DKO EBs, indicating that Smad2/3 is crucial and sufficient for primed epiblast formation, even when Smad4 is absent.

Epiblast forms in the absence of Smad4

The transition from naïve to primed pluripotency can also be achieved by generating EpiLC from mESC31,32. Therefore, mESCs of WT, S2/3DKO and S4KO were transitioned into EpiLCs using the established system (Fig. 2A). When cultured in serum-free N2B27 media supplemented with bFGF and Activin A, WT and S4KO mESCs gradually underwent typical morphological transformation of EpiLCs, including flattening and formation of cellular protrusions. As determined by qRT-PCR, naïve pluripotent genes (Nanog and Oct4) decreased and primed epiblast genes (Fgf5, Dnmt3a, Dnmt3b and T) were induced in WT and S4KO EpiLCs (Fig. 2B). However, S2/3DKO mESC had difficulties in the transition to EpiLC, most of which kept the domed clone forms of mESC and almost didn’t grow (Fig. 2A). At D5 of the transition, very few S2/3DKO cells could emerge from EpiLC-specific bottleneck and survived, and the primed epiblast genes were silenced (Fig. 2A, B). Overexpression of Smad2/3 in S2/3DKO could fully rescue the phenotype and successfully formed EpiLCs with activation of these key epiblast genes (Fig. 2A, B and Supplementary Fig. 2A). This confirms the dominance of Smad2/3 in epiblast formation, aligning with in vivo evidence that the specific inactivation of Smad4 using the Cre-LoxP system in the epiblast has a limited impact on early mouse development5.

Fig. 2: Epiblast occurs in the absence of Smad4.
figure 2

A The scheme showed the procedure of EpiLC transition from mESC (top). Briefly, mESCs turned into EpiLCs after mESCs were shifted from 2i+LIF medium to N2B27 medium with Activin A and bFGF for 6 days. This experiment was repeated 5 times. Representative images showing cell morphology changes during the transition in WT, S4KO, S2/3DKO, and S2/3DKO rescued with Smad2/3 (bottom) are presented here. Scale bar: 100 μm. B mRNA levels of pluripotent genes (Nanog, Oct4) and post-implantation epiblast genes (Fgf5, Dnmt3a/3b, T) during the induction of EpiLCs were analyzed by qRT-PCR. Data are presented as mean values +/- SD. n = 3 independent experiments. P values were calculated by two-way ANOVA test with Geisser-Greenhouse correction. C Venn diagram indicated the overlapping peaks among Smad4 ChIP-seq in WT, Smad2/3 ChIP-seq in WT and Smad2/3 ChIP-seq in S4KO. Genes adjacent to 7926 peaks were annotated, in which primed epiblast genes were marked below. D Heatmap of ChIP-seq tag densities for Smad2/3 and Smad4 were within −1.5/ + 1.5 kb genomic regions surrounding the centers of 11,080 high-confidence Smad2/3 binding sites in S4KO D3 EBs with AC treatment. SB, SB431542, a selective inhibitor of TGFβR; AC, Activin A, an available ligand for Nodal/Activin receptors. E Gene track view of the Fgf5, Dnmt3a, Dnmt3b and T loci. Smad2/3 and Smad4 ChIP-seq were performed in SB- or AC-treated D3 EBs and in D0 non-treated mESCs. Tag densities were normalized to reads per kilobase per million mapped reads (RPKM). The gene structures from RefSeq are schematically represented at the bottom.

Genome-wide binding profile of Smad2/3 without Smad4

Given the observation that Smad2/3 could stimulate epiblast genes even in the absence of Smad4, we sought to investigate the underlying mechanism. Previously, we conducted chromatin immunoprecipitation sequencing (ChIP-seq) of Smad2/3 and Smad4 in ESCs or D3 EBs22. Addition of Activin A (AC) or SB431542 (SB) to EBs either acutely activates or blocks the TGFβ pathway during differentiation process, respectively20,22. Using the same method, we performed Smad2/3 ChIP-seq in S4KO and compared the results with those in WT22 (Fig. 2C–E). In comparison to the Smad2/3 binding peaks observed in WT EBs (32,717 peaks), the overall peak signals of Smad2/3 in Smad4KO EBs (11,080 peaks) were reduced (Fig. 2C). Remarkably, these remaining Smad2/3 binding sites showed a high concordance in S4KO and WT. The binding of Smad2/3 to these loci increased significantly in differentiating EBs stimulated with AC compared to pluripotent ESCs or EBs treated with SB (Fig. 2D). Although the binding pattern is very similar, the overall binding peaks of Smad4 ChIP-seq (13,313 peaks) were not as pronounced as those for Smad2/3, possibly due to differential antibody affinities (Fig. 2C, D).

The disappearance of 4474 Smad2/3 peaks upon Smad4 loss, which were co-bound with Smad4 in WT EBs, strongly suggests a specific requirement for Smad4 at these loci (Fig. 2C). Intriguingly, 7926 Smad2/3 binding peaks remained in the absence of Smad4. Among these, 3787 Smad2/3 peaks were generally co-occupied by both Smad2/3 and Smad4 (Fig. 2C). Transcription factor motif analysis revealed enrichment of binding elements for Smad2, Smad3, and Smad4 in these 7926 peaks (Supplementary Fig. 2B). We annotated genes adjacent to these Smad4-independent peaks and performed GO analysis, which indicated significant enrichment of genes related to embryonic organ development and cell fate commitment. This included epiblast marker genes such as Fgf5, Dnmt3a/3b, T, Sox3/4, Pou3f1 and many others (Fig. 2C and Supplementary Fig. 2C).

The Gene track view showed that in ESCs or SB-treated EBs, there was little to no signal of Smad2/3 or Smad4 on Fgf5, Dnmt3a, Dnmt3b, and T genes. In response to the addition of AC to EBs, these typical epiblast genes were co-occupied by Smad2/3 and Smad4 at promoter and distal enhancer regions (Fig. 2D, E). If Smad4 was removed, Smad2/3 still retained its binding to these epiblast genes (Fig. 2E). Interestingly, the density of most Smad2/3 binding peaks in S4KO is nearly comparable to that in WT. This observation provides an explanation for the largely unchanged expression levels of these epiblast genes in Smad4-null EBs.

Both Smad2/3 and Smad4 are required for mesendoderm differentiation

Since Smad4 is not essential for the transition of mESCs to EpiLCs, we were curious about its importance in epiblast-derived differentiation. A two-step mesendoderm-specific in vitro differentiation model was induced from mESC-derived EpiLCs14 (Supplementary Fig. 2D). Consistent with our findings in the EB models, the loss of either Smad2/3 or Smad4 inhibits the induction of mesendoderm genes such as Wnt3, Eomes and Gsc (Supplementary Fig. 2D). At this stage of mesendoderm differentiation, we conducted cleavage under targets and tagmentation (CUT&Tag) for Smad2/3 in WT and S4KO cells. The binding of Smad2/3 to Wnt3, Eomes, and Gsc drastically decreased with the loss of Smad4 (Supplementary Fig. 2E), indicating the crucial role of the canonical Smad2/3-Smad4 complex in promoting the differentiation of PS from EpiLCs. This is also in line with the known requirement of Smad2/3 and Smad4 during gastrulation33,34. Thus, we postulate that there are two distinct waves of Smads binding and activity. The first wave underscores the key role of Smad2/3 in facilitating the transition from the naïve mESCs to the primed EpiLCs. This is followed by a secondary role of Smad4, as part of the canonical Smad2/3-Smad4 complex conducting TGFβ signaling, in inducing mesendodermal regulators.

Dnmt3b interacts with Smad2/3

To explore the potential mechanism by which Smad2/3 independently regulates epiblast genes, we conducted Smad2/3 co-immunoprecipitation with mass spectrometry (IP-MS) in D3 EBs derived from WT and S4KO, with or without SB or AC treatment. The identified proteins were compared and analyzed in three groups (S4KO vs. WT, S4KO + AC vs. S4KO + SB, S4KO vs. S4KO + SB) (Fig. 3A). Among the 16 proteins enriched in all three comparisons, Dnmt3b gained our attention for several reasons (Fig. 3A and Supplementary Fig. 3A). First, Dnmt3b was ranked on the top five most abundant proteins that was pulled down by Smad2/3 in S4KO and S4KO + AC conditions, but was hardly detectable in WT or S4KO + SB. This suggests that the interaction of Smad2/3 and Dnmt3b is Smad4-independent and TGFβ-dependent. Second, it is well known that Dnmt3b is upregulated to establish DNA methylation during epiblast formation and maturation12,13,14.

Fig. 3: Dnmt3b interacts with Smad2/3.
figure 3

A Venn diagram of Smad2/3 co-IP and Mass spectrometry (IP-MS) analysis. Smad2/3 IP-MS were performed in D3 EBs from WT and S4KO, including non-treated, treated with SB or AC. 16 proteins were identified in all three comparisons (S4KO vs. WT, S4KO + AC vs. S4KO + SB, S4KO vs. S4KO + SB) including Dnmt3b. B Smad2/3 immunoprecipitation in WT and S4KO D3 EBs under SB or AC addition were analyzed by western blot using Dnmt3b, pSmad2/3, Smad2/3 and Smad4 antibodies. This experiment was repeated 3 times. C Western blot analysis of Dnmt3b, pSmad2/3, Smad2/3 and Smad4 for the anti-HA immunoprecipitation in Smad2/3/4TKO cells in which HA tagged Smad2/3 (HA-S2/3) or/and Smad4 (HA-S4) were overexpressed as indicated in the figure. Cells were collected after SB or AC treatment. This experiment was repeated 3 times. D Smad2/3 immunoprecipitation of chromatin-bound proteins in D3 EBs of WT and S4KO with AC treatment were analyzed by western blot using Dnmt3b, Smad2/3, Smad4 and Otx2 antibodies. This experiment was repeated 3 times. E Representative confocal images of PLA experiments on WT, S4KO and S2/3DKO D3 EBs. This experiment was repeated 3 times. Green or gray punctate, positive PLA signal; blue, DAPI. Scale bars, 20 μm. F Overexpression of HA tagged Smad2 (HA-S2) or Smad3 (HA-S3), or/and Flag tagged Dnmt3b (Flag-3b) in 293 T cells were shown in the figure. Western blot of Flag and HA were detected after anti-Flag immunoprecipitation. This experiment was repeated 3 times. G GST pull-down assays using bacterially expressed GST-Dnmt3b or GST-GFP, and 6xHistidine-tagged Smad2 (His-S2, left) or Smad3 (His-S3, right). Elution proteins were analyzed using anti-GST and anti-His antibodies. This experiment was repeated 3 times. H His pull-down assays using bacterially expressed GST-Dnmt3b or GST, and 6xHistidine-tagged Smad2 (His-S2, left) or Smad3 (His-S3, right). Elution proteins were analyzed using anti-GST and anti-His antibodies. This experiment was repeated 3 times.

Next, we repeated the co-IP experiment using WT and S4KO EBs (Fig. 3B). When AC was added to WT EBs, Smad2/3 was phosphorylated (pSmad2/3) and immunoprecipitated with Smad4, but barely with Dnmt3b. In contrast, Dnmt3b was pulled down with endogenous Smad2/3 in S4KO EBs treated with AC, and to a lower degree in S4KO EBs treated with SB. Meanwhile, co-IP was also conducted using the EBs in which HA tagged Smad2/3 (HA-S2/3) or/and Smad4 (HA-S4) were overexpressed in the triple-knockout of Smad2/3 and Smad4 (S2/3/4TKO) (Fig. 3C). When HA-S2/3 or HA-S4 alone was induced in S2/3/4TKO, Dnmt3b was significantly immunoprecipitated with pSmad2/3 when AC is added, but not with Smad4. If both HA-S2/3 and HA-S4 were expressed in S2/3/4TKO, Smad4 competed with Dnmt3b for binding Smad2/3 especially upon AC stimulation (Fig. 3B, C). An alternative explanation for this competing interaction is that EBs without Smad4 are stuck in the pre-differentiation epiblast stage compared to WT, which facilitates the interaction of highly expressed Dnmt3b and pSmad2/3 in nucleus.

To explore whether Smad2/3-Dnmt3b interaction occurs at chromatin, both WT and S4KO EBs with AC addition, were fractionated into cytosol, nucleoplasm, and chromatin (Supplementary Fig. 3B). As a control, Smad2/3 immunoprecipitated Smad4 in WT. Dnmt3b was successfully pulled down with Smad2/3 from chromatin fraction of S4KO (Fig. 3D). Otx2 is known to be highly expressed in the nuclei of epiblast or EpiSC35,36 (Supplementary Fig. 3C). In contrast, Otx2 was not immunoprecipitated with Smad2/3 (Fig. 3D). Meanwhile, the specific Otx2 antibody pulled down itself but failed to precipitate Dnmt3b, Smad2/3 or Smad4 from all the chromatin samples (Supplementary Fig. 3D). Thus, Smad2/3 specifically interacts with Dnmt3b at chromatin.

Moreover, we observed an interaction of Dnmt3b and Smad2/3 in WT, but more prominently in S4KO EBs, utilizing the in situ proximity ligation assay (PLA) (Fig. 3E and Supplementary Fig. 3E). This robust technique allows the detection of two proteins in close vicinity (distance < 40 nm)37. As a control, no positive signal was observed in S2/3DKO EBs with AC addition (Fig. 3E and Supplementary Fig. 3E).

Although Smad2 and Smad3 share a considerable amino acid sequence similarity, they could exhibit distinct functionalities in various biological contexts2,22,38,39. Notably, in mESCs and early mouse embryos, Smad2 surpasses Smad3 in abundance22,40. Consistently, both Smad2 and its phosphorylated form, pSmad2, were more effectively captured by Smad2/3 antibody than their Smad3 counterparts (Fig. 3B, upper bands vs. lower bands). To study the individual role of Smad2 and Smad3, a specific Smad2 or Smad3 antibody was used for co-IP experiment in EBs stimulated with AC (Supplementary Fig. 3F). Smad2 and Smad3 interact with each other. Smad4 was pulled down successfully by Smad2 or Smad3 in WT, but Dnmt3b exhibited a much higher efficiency of immunoprecipitation when bound to Smad2, especially in the S4KO EBs (Supplementary Fig. 3F). In addition, co-IP experiments conducted in Smad2 KO (S2KO) or Smad3 KO (S3KO) cells22 using an anti-Smad2/3 antibody showed that individual Smad2 or Smad3 could pull down Dnmt3b. Their IP efficiencies were also related to their expression levels (Supplementary Fig. 3G). Considering the variance in Smad2 and Smad3 abundance in mESC, we sought to standardize their expression levels. Therefore, we overexpressed Smad2 and Smad3 to comparable levels in S2/3/4TKO cells. Remarkably, under these conditions, both HA-S2 and HA-S3 demonstrated efficient interaction with endogenous Dnmt3b under AC stimulation (Supplementary Fig. 3H). Furthermore, human embryonic kidney 293 T (HEK293T) cells were utilized, overexpressing HA-S2, HA-S3, or/and Flag-tagged Dnmt3b (Flag-Dnmt3b). Co-IP assays revealed that Flag-Dnmt3b efficiently precipitated either HA-S2 or HA-S3 individually, as well as their co-expression (Fig. 3F). Finally, Smad2 and Smad3 were individually expressed as 6XHistidine fusion proteins (His-S2 and His-S3), while GST, GST-tagged GFP (GST-GFP) or Dnmt3b (GST-3b) were induced in E. coli. These proteins were subsequently employed in pull-down assays. GST-3b, but not GST or GST-GFP, was found to interact with His-S2 or His-S3 (Fig. 3G-H). In summary, in mESC, Smad2 is more abundant and interacts more strongly with Dnmt3b. But Smad2 and Smad3 could potentially interact with Dnmt3b at similar levels if their expression levels are similar.

Upregulation of Dnmt3b by Smad2/3 is important for epiblast formation

Next, we checked whether Dnmt3b was required for epiblast gene expression during mESC priming. Indeed, both RNA and protein levels of Dnmt3b gene were significantly upregulated during the epiblast maturation stage in WT or S4KO, encompassing D3-D4 EBs and EpiLCs (Figs. 1J, 2B, 4A, B and Supplementary Fig. 4A). Then, Dnmt3b expression sharply decreased over the course of mesendoderm differentiation from EpiLCs, indicating its reduced involvement in the subsequent differentiation process (Fig. 4B). Notably, Dnmt3b was partially downregulated in S2/3DKO EBs but was mostly blocked in S2/3/4TKO (Figs. 1J, 2B, 4A, and Supplementary Fig. 4A). This phenomenon was more pronounced in the mESC-EpiLC-mesendoderm induction models (Fig. 4B). These observations indicate that both Smad2/3 and Smad4 could regulate Dnmt3b expression, but Smad2/3 is the more dominant one. Dnmt3a and Dnmt3b are closely related in amino acid sequence, exhibiting both redundant and distinctive activities. However, the protein level of Dnmt3a remains constant from EpiLC to mesendoderm stage (Fig. 4B). In addition, Dnmt3a appears to be less affected by Smad2/3, especially in EB differentiation models (Fig. 4A-B and Supplementary Fig. 4A).

Fig. 4: Upregulation of Dnmt3b by Smad2/3 is important for epiblast formation.
figure 4

A Western blot analysis of Dnmt3a, Dnmt3b, Eomes, Smad2/3 and Smad4 protein levels during EB differentiation (D0, D3 and D4) in WT, S4KO, S2/3DKO, and S2/3/4TKO cells. Tubulin serves as a loading control. This experiment was repeated 3 times. B The two-step induction scheme of mESC-EpiLC-ME was presented (top). In brief, after changing into N2B27 medium with Activin A and bFGF for 6 days, EpiLCs were further cultured in N2B27 medium plus CHIR99021 for 24 h to form ME. Western blot showed the expression of Dnmt3a, Dnmt3b, Eomes, Smad4 and Smad2/3 in WT, S4KO and S2/3DKO. Tubulin serves as a loading control. This experiment was repeated 3 times. C qRT-PCR analysis of primed epiblast genes (Dnmt3b, Fgf5) and pluripotent gene Oct4 during EB differentiation (D0, D3) and EpiLC induction (D0, D4, D6) derived from WT, S2/3DKO, S4KO, and S4KO with Dnmt3b shRNA 1 or 2 cells. Data are presented as mean values +/- SD. n = 3 independent experiments. P values were calculated by two-way paired t-test. D Representative confocal microscope images of 5-mC (Red) and DAPI (blue) staining in WT, S4KO and S2/3DKO EBs. Scale bars, 20 μM. This experiment was repeated 3 times. E IGV view showed the WGBS signal profiles of Sox2 locus at ES and EpiLC stages from WT and Dnmt3b knockout (3bKO). The CpG island (purple) and differentially methylated regions (DMRs) (red) were labeled as well. MeDIP-qPCR analysis of DNA methylation levels on Sox2 DMR1 and DMR3 were carried out in D3 EBs derived from WT, S2/3DKO and S4KO. Data are presented as mean values +/- SD. n = 4 independent experiments. P values were calculated by two-sided Mann-Whitney test. F The mRNA level of Sox2 in WT, S2/3DKO, and S4KO cells before and after EB differentiation (D0, D3, D4) were measured by qRT-PCR. Data are presented as mean values +/- SD. n = 4 independent experiments. P values were calculated by two-way ANOVA test.

In contrast, Eomes, a key mesendoderm marker, was highly upregulated in WT during EB and EpiLC-mesendoderm differentiation processes. Its expression was blocked in S4KO and S2/3DKO, suggesting the important role of both Smad2/3 and Smad4 in regulating mesendoderm genes like Eomes (Fig. 4B).

The downregulation of Dnmt3b in S2/3DKO EBs could be one of the reasons why S2/3DKO faced challenges in the primed epiblast transition. Indeed, we observed that overexpression of pLVXI-Dnmt3b in S2/3DKO could upregulate FGF5 and T expression compared to S2/3DKO with empty vector (Supplementary Fig. 4B). Then we knocked down Dnmt3b in WT using two different shRNAs or a combination of both (WT+Dnmt3b-sh1, 2 or 1 + 2). When Dnmt3b was repressed in WT, the epiblast gene Fgf5 was mildly affected in the more heterogeneous EB models but had a greater effect in the more homogeneous EpiLCs (Supplementary Fig. 4C-D). As a control, the pluripotency gene Oct4 was not changed (Supplementary Fig. 4C-D). Interestingly, when Dnmt3b was interfered with the same shRNAs in S4KO EBs or EpiLCs, Fgf5 was significantly downregulated compared to WT or S4KO (Fig. 4C).

All the results suggest that Smad2/3 stimulates Dnmt3b expression during mESC priming process, which in turn promotes the epiblast formation. The need for Dnmt3b is particularly amplified in the context of S4KO. When Dnmt3b is downregulated, resulting in reduced involvement during epiblast-to-mesendoderm differentiation, Smad4 plays an auxiliary role alongside Smad2/3.

Catalytically active Dnmt3b is required for epiblast priming

Dnmt3b is involved in de novo DNA methylation. On one hand, DNA methylation exerts suppressive effects on gene transcription, especially in CpG islands. On the other hand, DNA methylation on gene bodies and enhancers can boost transcription and maintain correct transcription elongation41,42,43,44.To investigate how Dnmt3b mediates Smad2/3 activity in epiblast homeostasis upon Smad4 loss, S4KO+Dnmt3b-sh2 cells were infected with a lentivirus vector expressing a wild-type version of Dnmt3b (Dnmt3bWT), a catalytically inactive Dnmt3b losing DNA methylation ability (Dnmt3bMet-mut) or a PWWP domain mutation of Dnmt3b unable to bind to H3K36me2/3 (Dnmt3bPWWP-mut), respectively (Supplementary Fig. 4E). Dnmt3bWT, but not Dnmt3bMet-mut and Dnmt3bPWWP-mut, efficiently rescued the expression of Fgf5 and T in S4KO+Dnmt3b-sh2 EBs, which is comparable to their expression levels in S4KO EBs with scrambled shRNA (S4KO+Scr-sh) (Supplementary Fig. 4E). As a control, Oct4 expression pattern was not changed under these conditions (Supplementary Fig. 4E). This indicates that both DNA methylation and H3K36me2/3-association abilities of Dnmt3b are required for activating primed epiblast genes in S4KO.

The DNA methylation status of WT, S2/3DKO and S4KO were initially compared. Immunofluorescence imaging analysis via confocal microscopy showed strong 5-mC expression in D3 EBs of WT and S4KO as Dnmt3b levels increased with differentiation. In contrast, S2/3DKO failed to express a high level of 5-mC since Dnmt3b was downregulated (Figs. 4A and 4D). Previous research demonstrated that Sox2 serves as a direct target gene of Dnmt3b in EpiLCs14. Specifically, differentially methylated regions (DMR1-4) located on the Sox2 super-enhancer play a crucial role in guiding EpiLCs towards mesendoderm differentiation14. Published whole genome bisulfite sequencing (WGBS) data showed that these DMRs acquire DNA methylation at EpiLC stage, but become hypomethylated when Dnmt3b is removed (Fig. 4E, upper panel)14. Analyzing DMR1 and DMR3 of Sox2 enhancer at CpG islands during lineage priming, we performed Methylated DNA Immunoprecipitation followed by qPCR (MeDIP-qPCR) and observed a significant reduction in methylation at both loci in S2/3DKO EBs compared to WT and S4KO (Fig. 4E, lower panel). This resulted in Sox2 upregulation in S2/3DKO upon EB differentiation, whereas Sox2 expression decreased as WT and S4KO EBs exited the pluripotent ESC stage (Fig. 4F). Therefore, the timely downregulation of pluripotent factor Sox2, regulated by high level of Dnmt3b, is important for epiblast maturation in WT and S4KO EBs.

Dnmt3b mediates Smad2/3 binding to epiblast genes independently of Smad4

Next, we studied whether Dnmt3b mediates Smad2/3 activity in epiblast formation, particularly when Smad4 is lost. Integrated profiles of DNA methylation levels (WGBS14), Smad2/3 ChIP-seq and histone modifications including ChIP-seq of H3K4me1, H3K4me3 and H3K27ac in EpiLCs14, as well as H3K36me3 ChIP-seq in embryonic day (E) 6.5 epiblast45, were analyzed on epiblast key regulator Fgf5 gene (Fig. 5A). As revealed by WGBS, elevated DNA methylation levels were observed at the gene body (GB) and enhancer regions (+30, +40 and +56 kb) of Fgf5 in mESCs. As Fgf5 expression increased in EpiLCs, Fgf5 GB retained high levels of methylation, and hypomethylated regions were identified at the central area of the proximal promoter (PP) distinguished by H3K4me3 and at active enhancer centers marked by H3K27ac and H3K4me1 (Fig. 5A). It’s worth noting that H3K36me3 exhibited a high degree of enrichment within Fgf5 GB in epiblast, and to a lesser extent, it was present at Fgf5 enhancers, which were further confirmed by ChIP-qPCR (Fig. 5A and Supplementary Fig. 6B). Subsequently, Smad2/3 specifically localized to these valleys of DNA methylation at PP and enhancers, then initiated transcription of Fgf5 during the process of epiblast maturation in both WT and S4KO EBs (Fig. 5A).

Fig. 5: Dnmt3b mediates Smad2/3 binding to epiblast genes independently of Smad4.
figure 5

A IGV view of WGBS, ChIP-seq and CUT&Tag profiles surrounding Fgf5 gene locus. WGBS as well as ChIP-seq of H3K27ac, H3K4me1 and H3K4me3 were performed at ESC or EpiLC stages14, H3K36me3 ChIP-seq was performed in E6.5 epiblast of mice45. Smad2/3 ChIP-seq were done in EBs of WT20 and S4KO (this study), Dnmt3b CUT&Tag of EBs were carried out in WT, S4KO, WT+Dnmt3b-shRNA2 (WT+3b-sh2) and S4KO+Dnmt3b-shRNA2 (S4KO + 3b-sh2), Flag CUT&Tag experiments were performed in Dnmt3a-Flag or Dnmt3b-Flag cells; Dnmt3a or Dnmt3b ChIP-seq of WT and S4KO EBs were shown at the bottom. B Heatmap of tag densities for Flag CUT&Tag in Dnmt3b-Flag, and Smad2/3 ChIP-seq in WT or S4KO, were shown within −1.5/ + 1.5 kb genomic regions surrounding the centers of 68,204 Dnmt3b binding sites in Dnmt3b-Flag EBs, 32,717 or 11,080 Smad2/3 binding sites in WT or S4KO EBs, respectively. C Gene track view of the Fgf5, Dnmt3a, and Dnmt3b loci from Smad2/3 CUT&Tag in EBs of S4KO and S4KO + 3b-sh2. D Heatmap of Smad2/3 CUT&Tag signals in S4KO and S4KO+3bsh2 surrounding 7926 Smad4-independent Smad2/3 peaks. E Peak score plot for the most significant Dnmt3b-dependent Smad2/3 binding sites (fold change > 2.0, P < 0.01). Epiblast marker genes adjacent to the peaks are indicated. Peak score plot for the most significant Dnmt3b-dependent Smad2/3 binding sites (fold change > 2.0, P < 0.01). Epiblast marker genes adjacent to the peaks are indicated. P values were generated by MACS2 callpeak strategy using Poisson test (one-sided) with Benjamini-Hochberg correction. (F)ChIP-qPCR analysis of Smad2/3 and Dnmt3b binding to Fgf5 promoter (Fgf5_PP), gene body (Fgf5_GB) and enhancers (Fgf5_+30k, Fgf5_+40k, and Fgf5_+56k) of D3 EBs treated with SB or AC in WT, S2/3KO, S4KO+ Scr-shRNA, S4KO + 3b-Sh1 or S4KO + 3b-sh2 cell lines. ChIP with anti-IgG serves as a negative control. This experiment was repeated 3 times, and representative data are shown here. Data are presented as mean values +/- SD.

To investigate the mechanistic basis for the role of Dnmt3b in facilitating Smad2/3 function, we tested the possibility that Dnmt3b interacts directly with Smad2/3 on epiblast genes. To explore this, we conducted a series of Dnmt3b CUT&Tag and ChIP-seq experiments during EB differentiation (Fig. 5A). For comparison, we also performed these experiments for Dnmt3a. Besides using a specific Dnmt3a or Dnmt3b antibody, we also employed an anti-Flag antibody to detect the Dnmt3a/3b binding pattern in Dnmt3a- or 3b-Flag cells, in which a 3XFlag epitope tag was inserted into the C-terminus of endogenous Dnmt3a (3a-Flag) or Dnmt3b (3b-Flag) genes by CRISPR/Cas9 technology, respectively (Supplementary Fig. 5A). These clones were sequenced and validated by both western blotting and immunofluorescence (Supplementary Fig. 5A-C). Furthermore, the cells formed EBs correctly and showed upregulation of primed and differentiation genes upon LIF removal, as demonstrated by qRT-PCR (Supplementary Fig. 5D). They also exhibited high levels of 5-mC, comparable to WT EBs (Fig. 4D and Supplementary Fig. 5E).

We first compared Dnmt3b binding profiles across the four assays: CUT&Tag-1, CUT&Tag-2, and ChIP-seq, all were performed with an anti-Dnmt3b antibody, and CUT&Tag-3 used an anti-Flag antibody in Dnmt3b-Flag cells. Notably, CUT&Tag-2 included an E. coli genome spike-in for calibration, which minimized errors related to low input and sequencing biases. In the three CUT&Tag assays, we identified 37,845, 57,886, and 68,204 Dnmt3b peaks (fold change ≥ 4, Q ≤ 0.01), with 57%–67% of annotated peaks overlapping, demonstrating high reproducibility among them (Supplementary Fig. 5F-G). In contrast, the ChIP-seq assay, due to its lower peak-calling performance, identified 41,702 Dnmt3b peaks even with a more lenient fold change cutoff of 3 and a P-value of 0.01 (Supplementary Fig. 5F). Additionally, only about 20% of annotated peaks overlapped between the ChIP-seq and CUT&Tag assays (Supplementary Fig. 5G).

Furthermore, we divided the genome into 1 kb bins and calculated the Pearson correlation (R) of signal intensity between each pair of assays. The R values among the three CUT&Tag datasets ranged from 0.89 to 0.93 (Supplementary Fig. 5H), consistent with a published report46. In contrast, the R values between Dnmt3b CUT&Tag and ChIP-seq assays were slightly lower, ranging from 0.75 to 0.80 (Supplementary Fig. 5H). To exclude potential ATAC-seq-like artifacts from the Dnmt3b CUT&Tag assays, we generated corresponding ATAC-seq data in WT EBs and identified 74,410 peaks using the same criteria as in CUT&Tag (fold change ≥ 4, Q ≤ 0.01, Supplementary Fig. 5F). The correlations between CUT&Tag and ATAC-seq signals, with R values ranging from 0.57 to 0.65, were lower than those observed among Dnmt3b CUT&Tag biological replicates (R = 0.89–0.93) and between CUT&Tag and ChIP-seq (R = 0.75–0.8) (Supplementary Fig. 5H and 5I). Additionally, we analyzed signal distributions across 14,461, 16,684, and 17,170 target genes from three Dnmt3b CUT&Tag assays, 14,751 target genes from the Dnmt3b ChIP-seq assay, and 18,120 target genes from the ATAC-seq assay (Supplementary Fig. 5J). In contrast to the ATAC-seq assay, all Dnmt3b binding profiles of CUT&Tag or ChIP-seq showed greater enrichment in gene body regions, consistent with published data47.

Furthermore, Dnmt3b CUT&Tag and ChIP-seq analyzes showed that Dnmt3b bound to the Fgf5 PP and enhancers, and the binding pattern was highly concordant with Smad2/3 binding genome-wide (Fig. 5A-B and Supplementary Fig. 6A). These binding peaks of Dnmt3b, especially at enhancer and GB regions, were strongly disrupted by Dnmt3b-shRNA (Fig. 5A and Supplementary Fig. 6A). In addition, Dnmt3b binding at the GB region, marked by H3K36me3, of Fgf5 was also confirmed by CUT&Tag and ChIP-qPCR (Fig. 5A and Supplementary Fig. 6B). Although Dnmt3b binding to enhancer and GB is consistent with previous reports44,47, its binding to PP was unexpected (Fig. 5A and Supplementary Fig. 5J), since earlier studies confirmed the absence of Dnmt3b around transcription start sites (TSS)47. At present, we cannot rule out the possibility that this could be potentially caused by biased binding towards nucleosome-depleted euchromatin generated by the CUT&Tag46. In comparison to the one using anti-immunoglobulin G (IgG) as a negative control, ChIP-qPCR showed that the binding of Dnmt3b to Fgf5 was slightly reduced in S2/3DKO compared to its presence in WT and S4KO EBs, probably related to the reduced expression of Dnmt3b in S2/3DKO (Fig. 5F). Addition of SB or AC didn’t affect the binding of Dnmt3b to Fgf5 in all the samples. Remarkably, the bindings of Smad2/3 to Fgf5 depended on AC stimulation in WT and S4KO EBs (Fig. 5F). Moreover, Smad2/3 exhibited a high concordance with Dnmt3b on the PP and enhancer regions of Fgf5, but not in GB (Fig. 5F). As a control, no Smad2/3 signal was detected if ChIP-qPCR was done with anti-IgG in WT or with anti-Smad2/3 in S2/3DKO EBs.

Although both Dnmt3a and Dnmt3b colocalized on Fgf5, the binding signals in GB disclosed with anti-Flag in Dnmt3b-Flag cells became less detectable in Dnmt3a-Flag cells (Fig. 5A). This discrepancy in binding preference was also evident in the ChIP-seq data for Dnmt3b and Dnmt3a (Fig. 5A). Furthermore, as shown on the heatmap, all the binding sites of Dnmt3a-Flag and Dnmt3b-Flag were analyzed and divided into four clusters (C1-C4) (Supplementary Fig. 6C). Epiblast genes were highly enriched in C1 and C2, while in C3 and C4, Dnmt3a and Dnmt3b showed different binding abilities (Supplementary Fig. 6C). Collectively, these findings indicate the overlapping yet distinct functions of Dnmt3a and Dnmt3b.

Since we found that Dnmt3b and Smad2/3 converge on cis-regulatory elements of epiblast specification genes, we wondered whether Dnmt3b enables Smad2/3 binding to these genes. Therefore, Smad2/3 CUT&Tag was performed in S4KO EBs with either Scr-sh or 3b-sh2 (Fig. 5C). The bindings of Smad2/3 to sites in Fgf5, Dnmt3b, and Dnmt3a in S4KO EBs were consistent with the observations in Smad2/3 ChIP-seq (Figs. 2E and 5C). However, Dnmt3b depletion inhibited Smad2/3 binding to a subset of these sites in the epiblast genes (Fig. 5C). Moreover, 7926 Smad2/3 binding peaks were strongly diminished in S4KO + 3b-sh2 compared to those in S4KO+Scr-sh (Figs. 2C and 5D). These Dnmt3b-dependent Smad2/3 binding peaks were analyzed, revealing a high enrichment of epiblast specification genes (Fig. 5E). Consistently, ChIP-qPCR confirmed that the binding of Dnmt3b to Fgf5 was reduced in S4KO + 3b-sh1 or 2 compared to S4KO+Scr-sh EBs, which subsequently led to fewer Smad2/3 molecules being able to access the Fgf5 gene (Fig. 5F). As a control, Dnmt3b shRNA had a mild effect on Smad2/3 binding to Fgf5 in WT EBs (Supplementary Fig. 6D). Taken together, Dnmt3b potentially facilitates Smad2/3 binding to epiblast genes including Fgf5, and this requirement is particularly amplified in the context of S4KO.