Search
Close this search box.

Quiescence enables unrestricted cell fate in naive embryonic stem cells – Nature Communications

Ethical regulation compliance

The 3–8-week-old female and 2–6-month-old male mice were used for all experiments. Mice were generated and maintained on the 129/S1 background on a 12 h light/dark cycle in the Transgenic Animal Model Core, University of Michigan. All animal procedures were conducted according to the protocols approved by the University Committee on Use and Care of Animals at the University of Michigan (protocol #PRO00004007 and PRO00006455).

Cell lines and cell culture conditions

Mouse ES cell lines (ESCs) used in this study include: E14tg2a (E14) (ATCC, #CRL-1821TM), Oct4-GiP ESCs22, male V6.5 ESCs (Novus Biologicals, #NBP1-41162) and iCdx2 Elf5-2A-mCherry reporter ESCs34. All ESCs were routinely cultured on tissue culture plates coated with Geltrex (Thermo Fisher Scientific, #A1413202, 1:100 dilution from stock) at 37 oC and 5% CO2. Naive ESCs were maintained under 2i/LIF culture condition, which contains N2B27 basal medium supplemented with PD0325901 (SIGMA ALDRICH, #PZ0162, 1 µM), CHIR99021 (SIGMA ALDRICH, #SML1046, 3 µM) and LIF (Millipore, #ESG1106, 1000 U/ml). N2B27 consists of a 1:1 mix of DMEM/F12 (Thermo Fisher Scientific, #21331-046) and Neurobasal (Thermo Fisher Scientific, #21103-049), N2 (Thermo Fisher Scientific, #17502001, 1:100), B27 (Thermo Fisher Scientific, #17504-044, 1:100), 2 mM Glutamine (Thermo Fisher Scientific, #25030024), Penicillin-Streptomycin (Thermo Fisher Scientific, #15140122, 1:100) and 0.055 mM 2-mercaptoethanol (Thermo Fisher Scientific, #21985023). Cells were routinely plated at the density of 1 × 105 cells per well on the six-well tissue culture plates. Cells were passaged every 2 days by dissociation with StemPro Accutase (Gibco, #A1110501). Where indicated, ESCs were grown in serum/LIF culture condition containing the KnockOut™ DMEM medium (Thermo Fisher Scientific, #10829018) supplemented with 15% fetal bovine serum (FBS) (SIGMA ALDRICH, #F0926), 2 mM Glutamine, 1X Non-essential amino acids (Thermo Fisher Scientific, #11140050), 0.055 mM 2-mercaptoethanol, Penicillin-Streptomycin (1:100) and LIF (1000 U/ml).

Mouse embryonic fibroblasts (MEFs) were isolated from E13.5 embryos of FVB triple transgenic mice at the University of Michigan Transgenic Animal Model Core. Mitomycin-C-treated MEFs were maintained in KnockOut™ DMEM medium containing 15% FBS, 2 mM Glutamine, 1X Non-essential amino acids, Penicillin-Streptomycin (1:100) and 0.055 mM 2-mercaptoethanol. MEFs were prepared at the appropriate density (see below) and used within one week after plating.

TSCs were derived directly from Ẻ.3.5 blastocyst as previously described56. TSCs were routinely maintained on top of MEFs (30,000 cells per cm2) at 37 oC and 5% CO2 in the TSC medium containing 70% MEF-conditioned media (R&D systems, #AR005), 30% base medium (RPMI 1640, Thermo Fisher Scientific, #11875093; sodium pyruvate, Thermo Fisher Scientific, #11360070, 1:100; Penicillin-Streptomycin, 1:100; 0.1 mM 2-mercaptoethanol and 20% FBS), 25 ng/ml FGF4 (Reprotech, #100-31) and 1 µg/ml Heparin (SIGMA ALDRICH, #H3149-10KU). Medium change was performed daily, and cells were passaged by dissociation with Trypsin-EDTA 0.05% (Thermo Fisher Scientific, #25300054).

Derivation of Eed-null naive ESCs

Mouse strain JF1 carries a conditional mutation of Eed gene in which its exon 7 was flanked by loxP sequences. The Eedfl/fl mice were generated and maintained on the 129/S1 background as described47. E3.5 embryos were obtained from a cross between Eed heterozygous females and Eed heterozygous males carrying Protamine CRE recombinase. Each E3.5 blastocyst was placed on top of MEF feeder in 2i/LIF culture condition containing the KnockOut™ DMEM base medium, 15% KnockOut™ Serum Replacement (Thermo Fisher #10828028), 5% FBS, 2 mM Glutamine, 1X Non-essential amino acids, 0.055 mM 2-mercaptoethanol and Penicillin-Streptomycin (1:100), PD0325901 (1 µM), CHIR99021 (3 µM) and LIF. At 4–5 days post plating, each inner cell mass was picked, dissociated into single cells and expanded on MEF feeders in the 2i/LIF ESC medium.

Derivation of ESC lines from single low ΔΨm qESC

Each single naive ESC with bottom 5% TMRM fluorescence intensity was FACS-sorted directly onto each well of 96-well tissue culture plate. After 5–7 days, each colony was dissociated into single cell by StemPro Accutase, resuspended into 2i/LIF medium and transferred to 48-well tissue culture plates. Cell lines were routinely cultured in 2i/LIF medium.

Generation of MERVL-2C::EGFP naive ESC lines

About 3 × 104 E14 naive ESCs were seeded on 24-well plates 24 h before transfection. Cells were transfected with the 2C-EGFP plasmid (addgene, #69071)23 for 2 h by LipofectamineTM 2000 (Thermo Fisher Scientific, #11668027). After transfection, fresh medium was added overnight. Transfected cells were dissociated and replated at clonal density in 6-well tissue culture plates. Selection was performed with 250 µg/ml G418 (Thermo Fisher Scientific, #10131035) for one week before individual colonies were manually picked and expanded in 2i/LIF condition.

Generation of mito-Grx1-roGFP2 naive ESC lines and live-cell imaging

The pPBmitoCOX8-Grx1-roGFP2 was generated by PCR amplification of mitoCOX8-Grx1-roGFP2 cassette using the Forward-GGAAGATCTATGTCCGTCCTGACGCCGC; Reverse-CGACGCGTTAACGTTCGAGGTCGACTC primers. PCR products were cloned into BglII/MluI sites in pPBhCMV1cHApA. E14 naive ESCs (2i/LIF) were transfected with 0.5 µg of pPBCAG-rtTAM2-IN, 0.5 µg of pPBmitoCOX8-Grx1-roGFP2 and 1 µg pCAG-hyPBase transposase by LipofectamineTM 2000. Individual colonies were manually picked and expanded under 2i/LIF condition after one-week selection with 250 µg/ml G418. Expression of mitoCOX8-Grx1-roGFP2 in the mitochondria (i.e., mito-Grx1-roGFP2 ESCs) was induced by Dox (1 µg/ml) and analyzed on a Nikon X1 Yokogawa Spinning Disk Confocal (Nikon Instruments Inc.). The 60× objective (water immersion) was used for live-cell imaging. The mito-Grx1-roGFP2 sensor was excited by the 405 nm and 488 nm laser line and detected the single emission (~520) through 500–554 nm bandpass filter.

Generation of induced TSC lines and in vitro differentiation

The naive ESC line with the iCdx2 Elf5-2A-mCherry reporter34 was used to generate induced TSC lines (iTSC). About 400 FACS-sorted cells were plated directly onto MEF feeder (10,000 cells per cm2) in TSC medium. At day 7, mCherry+ cells were FACS-sorted and plated at clonal density for 7 days before individual TSC-like colonies were picked and expanded. The iTSC lines were routinely cultured at high density on MEF feeders (30,000 cells per cm2). To induce differentiation, the iTSCs (after MEF removal) were cultured on gelatin-coated dishes in the absence of FGF4 and Heparin for 6 days before analyses.

CCCP and phenformin treatment of naive ESCs

To inhibit MMP, carbonyl cyanide 3-chlorophenylhydrazone (CCCP, SIGMA ALDRICH, #C2759, 3 µM) was added 24 h after cell plating. All analyses were performed 9 h after CCCP treatment. To inhibit complex I electron transport chain, naive ESCs were treated with phenformin (Selleck Chemical, #S2542, 1 mM) for 2 days. For S-(5′-Adenosyl)-L-methionine chloride dihydrochloride (SIGMA ALDRICH, #A7007, 0.5 mM) supplement, naive ESCs were plated on Lab-Tek II Chamber Coverglass (Thermo Fisher Scientific, #155382). At 24 h post plating, cells were pre-treated with 0.5 mM of SAM or vehicle for 2 h before treatment with CCCP for 6 h. SAM at a final concetration of 25 or 50 µM was used for culturing the spontaneous qESCs for 24 h.

Induction of quiescence

Naive ESCs were induced to achieve a stable quiescent state by inhibition FAO or MYC (FAOi or MYCi). To inhibit FAO, Etomoxir (SIGMA ALDRICH, #E1905, 100 µM) was added at the time of plating for 4 days5. 10058-F4 (SIGMA ALDRICH, #475956) was used at a final concentration of 64 µM for 3 days to inhibit MYC6. As a control, naive ESCs were treated with mTOR inhibitor (mTORi) INK128 (Medchem Express, HY13328, 200 nM) for 3 days as described7. H2O and DMSO were added at the same final volume as FAOi, MYCi or mTORi. The induced qESCs and control cells were then harvested for RT-qPCR, metabolomics analysis and TSC differentiation.

Mitochondrial membrane potential (ΔΨm) staining and FACS sorting

Fractionation of ESCs with high and low ΔΨm levels was performed as previously described5,15. 25 nM of TMRM (Thermo Fisher Scientific, #T668), was used for staining. To visualize ΔΨm profile, naive ESCs were stained with 25 nM of TMRM directly on the tissue culture plate at 37 oC, 5% CO2 for 15 min. Nuclei was stained with Hoechst 33342 (Thermo Fisher Scientific, #H3570, 1:2000). Live-cell imaging was performed using a Nikon X1 Yokogawa Spinning Disk Confocal (Nikon Instruments Inc.) (see below).

Embryoid body (EB) differentiation

The EB medium consists of the KnockOut™ DMEM medium, 15% FBS, 2 mM Glutamine, 1X Non-essential amino acids, 0.055 mM 2-mercaptoethanol and Penicillin-Streptomycin (1:100). 2000 FACS-sorted ESCs with high and low ΔΨm were resuspended into 200 µl EB medium and plated onto each well of the NunclonTM SpheraTM 96U plate (day 0). EBs were cultured in suspension at 37 oC and 5% CO2. Half of medium were replaced on day 1, followed by full medium change every other day from day 2. EBs in suspension at day 3 and 6 were collected for RT-qPCR analysis. At day 7, EBs were transferred to 24-well tissue culture plates coated with Gelatin for additional 7 days. Differentiating cells were then fixed by 4% PFA/PBS for immunostaining to detect ectoderm (TUJ1), mesoderm (SMA) and endoderm (GATA6) markers as described below.

Mathematical simulation modeling for mitochondrial activity

Flow cytometry data of TMRM staining for bulk naive ESCs were used for curve fitting. The distribution of TMRM fluorescence intensity exhibited multimodal features that are captured by a combination of n gaussian distributions using the R package mixtools57. The probability of high ΔΨm and low ΔΨm ESCs were obtained using the Gaussian model. To simulate the dynamics of mitochondrial activity through cell divisions, we constructed a mathematical model based on following assumptions: 1) the mitochondrial activity for the parental cells was generated from a normal distribution with the same mean and standard deviation from the bulk ESCs; 2) the range of mitochondrial activities for the progenies was calculated using the following formula, where a is the activity for each daughter: a = (max(1, a*runif(1,m,1)), a, a*runif(1,1,n)); 3) the activity for each daughter cell had three possible outcomes (I, J, K):

  • I: the activity was decreased by multiplying a by a value between m and 1 and the lowest value was set to be 1; runif(1, m, 1) generated a value from a uniform distribution between m and 1;

  • J: the activity remained at a;

  • K: the activity was increased by multiplying a by a value between 1 and n; runif(1, 1, n) generated a value from a uniform distribution between 1 and n.

Using this model, the experimental data were applied to estimate the activity fluctuation coefficient, t. Most low ΔΨm daughter cells had increased (Pk = 0.8) activity and these cells regained the mitochondrial heterogeneity within 2–3 cell divisions. By comparing the difference in absolute value between the mean activity of daughter and parental cell populations, the activity fluctuation coefficient was estimated as 1.25, which means daughter cells regain similar parental ΔΨm distribution within 3 cell divisions if mitochondrial activity alters ~25% per division.

Gene knockdown by siRNAs

To ensure the complete knockdown, we used the Dharmacon ON-TARGETplus siRNA SMARTPool that contains a mixture of four different siRNAs to target all transcript/splice variants of Mat2a. siRNA transfections were performed with the DharmaFECT 1 Transfection Reagent (Dharmacon) using reverse transfection protocol according to manufacturer’s instructions. Briefly, 2 × 104 naive ESCs per 24-well culture area were transfected with either Dharmacon ON-TARGETplus Mouse Mat2a siRNA SMARTPool or ON-TARGETplus Non-targeting Control Pool at a final concentration of 40 nM. 18 h post transfection, medium change was performed, and cells were maintained for another 2 days prior to sample collection for subsequent analyses.

Cell proliferation and cell cycle analysis

About 5000 FACS-sorted ESCs were replated onto Geltrex-coated 24-well tissue culture plates. Cells were dissociated into single cell by StemPro Accutase, stained with Trypan Blue 0.4% (Thermo Fisher Scientific, #15250061) and counted by hemocytometer every day for 3 consecutive days. For cell cycle analysis, cells were stained with 5 µg of Hoeschst 33342 (Thermo Fisher Scientific, #62249) and 1 µg of Pyronin Y (SIGMA ALDRICH, #213519) as previously described5. Stained cells were centrifuged, resuspended in ice-cold staining solution and subjected to flow cytometry analysis using the Propel Bigfoot Cell Sorter. Data were further analyzed by the FlowJo 10.8.1 (FlowJo LLC).

Elf5 promoter methylation analysis

Genomic DNA was extracted and 0.5 µg of genomic DNA per sample was subjected to bisulfite conversion using the Methylation-GoldTM Kit (ZYMO RESEARCH, #D5005). About 4 µl cDNA was PCR amplified with primers35: Forward-TTTGTAGTTTGAGTATTTTGGTG; Reverse-ACCTTTCCACTCTAAACACCCAAA. Next-generation sequencing was performed at Center for Computational and Integrative Biology (CCIB) DNA Core at Massachusetts General Hospital. The raw paired-end bisulfite sequencing reads were processed using Trim Galore. The refined reads were aligned to the mouse genome (mm10) using Bismark58 with default settings. To determine the methylation levels of individual CpG sites within the regions of interest, the “bismark_methylation_extractor” function was employed to calculate the percentage of methylated CpG over the total calls for each specific site. A robust coverage exceeding 100× was attained for each examined CpG sites. The methylation levels were visualized using a violin plot in the ggplot2 package and a nonparametric Mann-Whitney test was employed for statistical analysis.

Immunofluorescence

Cells were fixed with 4% paraformaldehyde (PFA, Ted Pella Inc. #18505)/PBS for 15 min at room temperature for histone antibodies or 30 min at 4 oC for other antibodies. Cells were then washed three times with PBS, permeabilized in 0.25% Triton X-100/PBS (PBST) for 30 min at 4 oC and blocked in blocking solution (10% Goat serum (Cell Signaling Technology, #5425) or 10% Donkey serum (SIGMA ALDRICH, #D9663), 0.1% BSA and 0.01% Tween20 in PBS) for 1 h at 4 oC. Primary antibody incubation was performed at 4 oC overnight. For detection of 5 mC, fixed cells were permeabilized with 0.4% TritonX-100/PBS for 30 min at 4 oC, denatured with 2 N HCl for 30 min at room temperature, and neutralized with 100 mM Tris-HCl (pH 8.0) for 10 min at room temperature. After incubation with primary antibodies, cells were washed with 0.1% PBST three times prior to secondary antibody incubation at room temperature for 1 h, followed by 0.1% PBST wash for three times. Nuclei were stained with DAPI (Thermo Fisher Scientific, #62248, 1:1000) diluted in 0.1% PBST at room temperature for 10 min. Images were captured by the Leica Stellaris 5 Inverted confocal microscope (Leica Microsystems) with a 40× dry, 60× oil or 100× oil objectives across 0.5 µm stacks. Fluorescence was excited with a 405 nm UV laser (DAPI), a 488 nm laser (Alexa Fluor488) and a 561 nm laser (Alexa Fluor555). Images were also captured by the IX73 microscope system (Olympus). Image data were further processed by either Leica software or ImageJ software59.

Antibodies for IF

The following primary antibodies were used: Mouse anti-OCT3/4 (Santa Cruz Biotechnology, Clone C-10, #sc-5279, 1:50), Rabbit anti-CDX2 (BioGenex, Clone EP25, #NU777-5UC, 1:100), Mouse anti-CDX2 (BioGenex, Clone CDX2-88, #MU392A-5UC, 1:100), Rabbit anti-EOMES (Abcam, #ab23345, 1:100), Mouse anti-SMA (SIGMA ALDRICH, Clone 1A4, #A2547, 1:200), Mouse anti-TUJ1 (Biolegend, Clone TUJ1, #801201, 1:200), Goat anti-GATA6 (R&D Systems, #AF1700, 1:100), Mouse anti-5mC (Millipore, Clone 33D3, #MABE146, 1:200), Rabbit anti-H3K27me3 (Millipore, #07-449, 1:100) and Rabbit anti-H3K9me3 (Abcam, #ab8898, 1:100). The secondary antibodies used were: Goat anti-Mouse IgG FITC-conjugated (Thermo Fisher Scientific, #62-6511, 1:50), Goat anti-Mouse IgG Alexa FluorR 555 (Abcam, #ab150114, 1:1000), Donkey anti-Goat IgG Alexa FluorTM 488 (Thermo Fisher Scientific, #A11055, 1:1000), Donkey anti-Mouse IgG Alexa FluorTM 555 (Thermo Fisher Scientific, #A31570, 1:1000), Donkey anti-Rabbit IgG Alexa FluorTM 555 (Thermo Fisher Scientific, #A31572, 1:1000), and Donkey anti-Rabbit IgG Alexa FluorTM 488 (Thermo Fisher Scientific, #A21206, 1:1000).

Quantitative reverse transcription PCR (RT-qPCR) analysis

Total RNAs were extracted by the RNeasy mini kit. cDNA was synthesized using the SuperScriptTM III First-Strand Synthesis System with oligoT primer (Thermo Fisher Scientific, #18080051). qPCR was performed with the Radiant Green Lo-ROX qPCR Kit (Alkali Scientific Inc., #QS1020) on the 7500 Real Time PCR System (Applied Biosystems). The comparative cycling threshold (ΔΔCt) method was used to analyze the qPCR data. List of primers was shown in the Supplementary Data 8.

Total RNA sequencing

Total RNAs were extracted by the RNeasy mini kit (QIAGEN). Genomic DNA was removed on-column by RNA-free DNase I for 15 min at room temperature. Total RNA amounts were determined by NANODROP 2000/2000c (Thermo Fisher Scientific) and assessed for quality using the TapeStation (Agilent, #5067-5576). Samples were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (NEB, #E7760L), NEBNext rRNA Depletion Dit (Human/Mouse/Rat) (NEB, #E6310X), NEBNext Multiplex Oligos for Illumina (NEB, #E6440L) and ERCC RNA Spike-In control Mix (Thermo Fisher Scientific, #4456740). About 400 ng of total RNA was ribosomal depleted and fragmented and reverse transcribed. Samples underwent end repair and dA-Tailing step followed by ligation of NEBNext adapters. The libraries were pooled and sequenced on the Illumina NovaSeq 6000 with paired-end 150 bp at the University of Michigan core facility or AZENTA.

Untargeted metabolomics analysis

The untargeted metabolomics analyses were performed as previously described60 with modifications. All samples were subjected to FACS sorting to isolate 5 × 105 live cells. They were lysed using 1 ml of 80% cold methanol. Since the qESCs have lower biomass than cycling ESCs5,6, cell numbers were used to normalize metabolite fractions across samples before metabolite extraction. The extracts were incubated at −80 °C for 10 min and centrifugated at 14,000 rpm for 10 min at 4 °C. The supernatants were speed dried and re-suspended in 35 μl 50:50 MeOH:H2O mixture for LC-MS analysis. LC-MS analyses were performed using an Agilent Technologies Triple Quad 6470 LC-MS/MS system with a 1290 Infinity II LC Flexible Pump (Quaternary Pump), 1290 Infinity II Multisampler, 1290 Infinity II Multicolumn Thermostat with 6 port valve and 6470 triple quad mass spectrometers in both positive and negative acquisition mode as previously described5,60. The Agilent MassHunter Workstation Software was used for data acquisition. Agilent MassHunter Quantitative Analysis for QQQ (Version B.10.1.733.0) was used for initial raw data extraction. Results were then exported as CVS file. Metabolite counts were normalized by the total intensity of all metabolites to scale equal sample loading5. Finally, abundance of each metabolite in the sample was normalized divided by the median abundance levels across all samples for comparisons, statistical analyses and visualizations among metabolites. The statistical analysis was performed by the two-tailed Student’s t test with a significant threshold level of 0.05. Pathway enrichment analysis was performed using the MetaboAnalyst 5.0 (https://www.metaboanalyst.ca)61.

Cleavage under targets and release using nuclease (CUT&RUN)

CUT&RUN analysis was performed with approximately 2 × 105 naive ESCs for each library. Bead-bound cells were incubated overnight at 4 oC with Rabbit anti-H3K4me3 (Millipore, #07-473) or Rabbit anti-H3K27me3 (Millipore, #07-449) antibody diluted in the antibody buffer (EDTA/Digitonin/wash buffer, EMD Millipore, #300410) (1 µg/500 µl). The beads were washed three times and incubated with 200 µl pA-MNase (0.7 ng/µl) for 1 h at 4 oC. 2 mM of CaCl2 was added to activate MNase for 30 min on ice. The reaction was quenched by 2× stop buffer containing 340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 0.02% Digitonin, 50 µg/ml RNase A (QIAGEN, #19101), 50 µg/ml glycogen (Roche, #10901393001) and 2 pg/ml Drosophila spike-in DNA at 37 oC for 10 min. The soluble chromatins were converted to libraries using the NEBNext UltraTM II DNA library Prep kit (NEB, #E7645L). Libraries were subject to pair-ed sequencing on the Illumina NovaSeq 6000 instrument at the University of Michigan core facility.

Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq)

Approximately 2 × 105 naive ESCs were used for ATAC-seq analysis. Briefly, FACS-sorted cells were washed in cold D-PBS and resuspended in 50 µl of cold lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, and 0.1% IGEPAL CA-630). Samples were spun at 500 g for 10 min, 4 oC. Each cell pellet was resuspended in 50 µl of transposition reaction mix containing 25 µl of 2X TD buffer (Illumina, #FC-121-1030), 2.5 µl Tn5 transposase (Illumina, #FC-121-1030) and 22.5 µl nuclear free H2O. Tagmentation by Tn5 was performed at 37 oC, 30 min. Libraries were sequenced on the Illumina NovaSeq 6000 instrument with paired end reads at the University of Michigan core facility.

Bioinformatic analysis

RNA-seq. To perform data cleaning, the cutadapt (https://cutadapt.readthedocs.org/en/stable/) and FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) tools wrapped in Trim Galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) were used to trim low quality bases (Q < 20) and adapter for raw sequences. RNA-seq data were mapped to the mm10 (GRCm38) genome of mouse and ERCC reference using Tophat2 (2.1.1)62. Duplicated reads for pair-end data were removed by SAMtools (v1.9)63. The bigwig files for visualization in Integrative Genomics Viewer (IGV)64 were generated from BAM files by using “bamCoverage” from deepTools3 (v3.2.1)65 with parameters “–outFileFormat bigwig –ignoreForNormalization chrM chrX chrY –skipNAs –bam input.accepted_hits.bam –outFileName output.bigwig –normalizeUsing RPKM –minMappingQuality 30 –binSize 20 –smoothLength 60 –ignoreDuplicates”.

Annotation of mRNA (GRCm38 v99) was downloaded from Ensembl database66. LncRNA annotation was downloaded from NONCODE database67 and repeat annotation was downloaded from Homer (v4.10)68. All three annotations were merged for gene annotation. Fragment (or template) count for each feature was performed by using featureCounts with parameters “-g ‘gene_id’ -p -M -O –fraction -C -F GTF -T 64 -s 0”69. Each reported alignment from a multi-mapping read (identified via “NH” tag) carried a fractional count of 1/x, instead of 1 (one), where x is the total number of alignments reported for the same read. Each overlapping feature received a fractional count of 1/y, where y is the total number of features overlapping with the read. Briefly, each alignment carried a fractional count of 1/(x*y). Fragment counts were transformed to fragments per kilobase of transcript per million (FPKM) by using edgeR70. Genes (or features) have more than three fragment counts in all samples were included. We employed RUVSeq71 to perform differential expression analysis by using ERCC as control genes with a likelihood ratio test from DESeq272. Genes (or features) with the adjusted P value < 0.05 were reported. The following public datasets were used: GSM2711863 (ESC_2i)41, GSM2711865 (ESC_2i_EedKO)41, GSM838738 (MERVL-tdtomato+ and MERVL-tdtomato)19, GSM1966767 (MERVL+_Zscan4+ and MERVL+_Zscan4+)27, GSM3384433 (CRISPRa_EV and CRISPRa_MERVL)29, GSM3110917 (NELFAhigh and NELFAlow)30, GSM1415501 (Zscan4high and Zscan4low)26, GSM2279983 (Dux-GFP+, Dux-GFP, Plus Dox-Dux, and Minus Dox-Dux)28, and GSE66582 (distinct developmental stages of preimplantation embryos)25.

To facilitate the comparisons between our and public RNA-seq data, we followed the “batch mean-centering” approach for batch effect removal73. Briefly, we have separately mean-centered the log2(FPKM + 1) value of each gene by subtracting the mean log2(FPKM + 1) across all samples. We used “1 – Spearman” correlation coefficient as the distance in the hierarchical clustering7. We computed the dissimilarity values with “as.dist” function in R, which were fed into hclust with “average” algorithm for agglomeration. The dendrogram were plotted by R.

CUT&RUN and ATAC-seq

Paired-end raw sequencing reads were trimmed with Trim Galore74 to remove adaptors and low-quality reads. Processed read (>20 bp) were mapped against mouse mm10 reference genome using bowtie2 (v2.4.2)75 with the following parameters “–local –very-sensitive-local –no-unal –no-mixed –no-discordant -q –phred33 -I 10 -X 700 –threads 12”. PCR duplicated reads were removed using samtools (v1.9)63. For H3K4me3, filtered BAM files for each sample were submitted to the callpeak function of MACS2 (v2.2.7.1)76 for peak regions calling with parameters “-p 1e-5 -g mm”. For H3K27me3 domains, peak calling was performed using SICER (v1.1)77 with default parameters for calling broad peaks. ATAC-seq paired-end raw sequencing reads were trimmed with Trim Galore74, then aligned to reference genome (mouse mm10) using Bowtie2 (v2.4.2)75 with parameters: “–local –very-sensitive-local –no-unal –no-mixed –no-discordant -q –phred33 -I 10 -X 2000 –threads 12”. The PCR duplicates from each dataset were removed using the rmdup function of samtools (v1.9)63. Peak calling was done using MACS2 (v2.2.7.1)76 with the parameters: “-B –nomodel –shif 100 –extsize 200”.

Differential binding analyses were performed using the edgeR package DiffBind (v3.4.11)70,78. Peaks derived from individual samples were loaded into dba.count function with minOverlap=1 and summits= FALSE.. The annotatePeak.pl script from Homer (v4.11.1)68 with default settings was employed for annotation. BAM files were merged for the biological replicates and converted to bigwig files for visualization in Integrative Genomics Viewer (IGV)64 by using the command “bamCoververage” of deeptools (v3.5.1)79 with parameters “-ignoreDuplicates -normalUsing RPKM -skipNonCoveredRegions -binSize 20”. Heatmaps were plotted using “computeMatrix” and “plotHeatmap” in deepTools (v3.5.1)79. Violin plots were performed by R package ggplot2.

De novo motifs were identified by the EPIGRAM80 pipeline. For H3K27me3, peaks with higher signals in high ΔΨm group were extracted as the target set and all the other H3K27me3 peaks were treated as the background set. Motifs were identified with following command line “perl epiplay-P.pl h3k27me3.target.bed h3k27me3.backgroup.bed 4 mm10 both 4G both”. TOMTOM (v5.4.1) (https://meme-suite.org/meme/tools/tomtom), an online motif comparison tool from the MEME suite81 was employed to identify the most matched motifs.

To determine the epigenome profiles near MERVL, deepTools computeMatrix was executed in the scale-region mode to calculate the signal density at each MERVL element using following parameters: “computeMatrix scale-regions -m 5000 -a 5000 -b 5000 -S bigwig.files -R bed.files”. For 2C genes, the signal density was also computed with computeMatrix in the scale-region mode in a normalized region 25 kb plus the 5 kb upstream to 5 kb downstream. DeepTools profiler was employed to generate the line plot to display the average values across the region. The violin plot was generated to show the intensity distribution by R package ggplot2, and a nonparametric Mann-Whitney test was used.

Statistical analysis

Statistical analyses were performed by the two-tailed Student’s t test using GraphPad Prism 7.0 and 9.0 software. Data were presented as standard error of the mean (SEM). P values of less than 0.05 were considered statistically significant. GO term enrichment analysis for RNA-seq data was conducted using DAVID (https://david.ncifcrf.gov)82. GO terms with fold enrichment > 2 and the adjusted P value < 0.05 (Benjamini-Hochberg method) were considered as significant enrichment. GO analysis for CUT&RUN peaks was conducted using Panther Classification System Version 17.0 (http://pantherdb.org/). Statistical overrepresentation test was performed with “default settings”. Fisher’s Exact with false discovery rate (FDR < 0.05) multiple test correction was employed. GO terms (FDR < 0.05) for ATAC-seq used GREAT software with the Mouse Phenotype Single KO annotation83. Rules for the region-gene association included 5 kb upstream and 1 kb downstream of each gene.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.