
The PTM profiling of CTCF reveals the regulation of 3D chromatin structure by O-GlcNAcylation – Nature Communications

Cell culture

The ESC line R1 (ref. 74) was used in this study and cultured in gelatin-coated dishes with ESC medium, which consisted of DMEM (Hyclone, #SH30022.01), 15% (v/v) fetal bovine serum (FBS, Lonsera, #S712-012S), 0.1 mM β-mercaptoethanol (Sigma, #M6250), 2 mM L-glutamine (Thermo Fisher, #35050061), 0.1 mM nonessential amino acids (Thermo Fisher, #11140050), 1% (v/v) nucleoside mix (Sigma), and 1000 U/mL recombinant LIF (Millipore).

Mouse CTCF-EGFP-AID ESCs were a gift from the Bruneau lab (ref. 14) and were cultured in gelatin-coated dishes with ESC medium. To deplete CTCF, the cells were cultured in ESC medium with 0.5 mM auxin (Sigma, #I5148-2g) for 2 days. To express the T668A glycosylation mutant CTCF (MUT_CTCF) or WT CTCF (WT_CTCF) in CTCF-EGFP-AID ESCs at a level similar to that of endogenous CTCF, the MUT_CTCF and WT_CTCF fragments were subcloned into pJD105 (ref. 75) and transfected with Fugene HD (Promega, #E2311), and the cells were selected with Hygromycin B (Thermo Fisher, #10687010). Cell clones were picked and then cultured in ESC medium with 0.5 mM auxin for 2 days to degrade endogenous CTCF. The clone in which exogenous CTCF was expressed at a level similar to that of endogenous CTCF was identified with an anti-CTCF antibody and used in the functional experiments.

To knock down Ogt, siRNA was transfected with Lipofectamine™ 2000. Gene expression was measured by RT-qPCR after 72 h. The cells with the highest knockdown efficiency were used for downstream experiments.

NPC differentiation

The cell line was cultured in N2B27 medium, which consisted of a 1:1 mixture of DMEM supplemented with 1× N2 (Gibco, #17502048), NEAA, 1 mM L-glutamine, and 0.1 mM β-mercaptoethanol, and Neurobasal (Thermo Fisher, #21103049) supplemented with B27 (Gibco, #17504044). After 7 days, the cells were harvested and the total mRNA was extracted for RT-qPCR.


Antibodies

Antibodies used in this study were anti-CTCF (Active Motif, #61311), anti-CTCF (Abclonal, #A19588), anti-RL2 (Abcam, #ab2739), anti-OGT (GeneTex, #GTX109939), anti-Mono-Methyl Arginine (MMA) antibody (CST, #8015), anti-Symmetric Di-Methyl Arginine (SDMA) antibody (PTMBIO, #PTM-617RM), anti-Asymmetric Di-Methyl Arginine (ADMA) antibody (PTMBIO, #PTM-605RM), anti-Acetyllysine antibody (PTMBIO, #PTM-105RM), anti-GAPDH (Abclonal, #AC001), anti-OCT4 (Santa Cruz, #sc-5279), anti-SOX2 (Santa Cruz, #sc-36582-3), anti-NANOG (Bethyl, #A300-397A), and anti-H3 (Santa Cruz, #sc-17576). Goat anti-rabbit-IgG (H + L)-HRP (CST, #7074s) and goat anti-mouse-IgG (H + L)-HRP (CST, #7076s) were used as secondary antibodies for western blotting.

The purification of CTCF

A total of 50 million cells were harvested by trypsinization, washed with cold PBS, and frozen in liquid nitrogen. Whole-cell pellets were lysed on ice in lysis buffer (50 mM HEPES pH = 7.6, 250 mM NaCl, 0.1% NP-40, 0.2 mM EDTA, 0.2 mM PMSF, 1× protease inhibitor cocktail). The lysates were cleared by centrifugation at 16,000 × g for 10 min at 4 °C, and the supernatant was collected as the protein extract. Antibody-based purification was performed to detect the PTMs of endogenous CTCF. Briefly, CTCF antibody was conjugated to Protein G agarose (Roche, #11243233001) by incubation in IP DNP buffer (20 mM HEPES pH = 7.6, 0.2 mM EDTA, 1.5 mM MgCl2, 100 mM KCl, 20% glycerol, 0.02% NP-40, 1× protease inhibitor cocktail) overnight at 4 °C, and the agarose was washed twice. The protein extract was then added to the antibody-conjugated Protein G agarose and rotated at 4 °C. After 12 h, the supernatant was removed and the agarose was washed twice with IP DNP buffer. 2× SDS loading buffer was added to the agarose and the protein was eluted by boiling for 5 min at 95 °C. SDS-PAGE was used to assess purification efficiency, and the remaining protein was stored at −80 °C for mass spectrometry and western blot analysis.

Mass spectrometric analysis of CTCF PTMs

A Coomassie-stained SDS-PAGE gel band was excised and incubated with 200 μL of 100 mM ammonium bicarbonate with 25% ACN at 37 °C. After 30 min, the supernatant was removed and the gel slices were incubated with 10 mM DTT solution at 60 °C for 30 min. The DTT solution was removed, 100 mM IAA solution was added, and the sample was incubated at 37 °C for 15 min in the dark with shaking. The IAA solution was then removed and 100 mM ammonium bicarbonate/25% ACN was added to rinse the gel slices. For in-gel digestion, 500 μL of ACN was added to shrink the gel pieces and was removed after 15 min. 100 mM ammonium bicarbonate was added to rehydrate the gel, and 1 mg/mL trypsin stock solution was added to the sample for overnight digestion at 37 °C. The gel pieces were extracted three times by adding 50 μL of 25% ACN/0.1% TFA solution and incubating at 37 °C for 5–15 min. The supernatant was transferred to a new tube and the peptides were completely dried under vacuum. The peptides were desalted using reverse-phase solid-phase extraction cartridges (Sep-Pak C-18), completely dried under vacuum, and resuspended in 0.1% formic acid before LC–MS/MS analysis. The mass spectrometry experiment was performed twice, and the protein coverage was 68.26% and 65.33%, respectively. Only modifications identified in both mass spectrometry experiments were considered modifications of CTCF.
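
The reproducibility rule stated above (a modification is kept only if both runs report it) amounts to a set intersection. A minimal Python sketch, assuming each run has been reduced to (site, modification type) pairs; the function name and the placeholder site labels are hypothetical and are not results from this study:

```python
def reproducible_modifications(run_1, run_2):
    """Return the (site, modification) pairs reported by both runs, sorted."""
    return sorted(set(run_1) & set(run_2))


# Placeholder calls only; these are not sites reported in the study.
run_1 = [("site_A", "O-GlcNAc"), ("site_B", "methyl"), ("site_C", "acetyl")]
run_2 = [("site_A", "O-GlcNAc"), ("site_C", "acetyl"), ("site_D", "phospho")]

shared = reproducible_modifications(run_1, run_2)
print(shared)
```

Pairs seen in only one run (site_B and site_D in this toy input) are dropped.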

Enrichment of O-GlcNAcylated, methylated, and acetylated proteins

For each enrichment of O-GlcNAcylated, methylated, or acetylated proteins, 25 million cells were used. Protein was extracted as above and denatured by boiling for 10 min at 95 °C. For enrichment of O-GlcNAcylated proteins, WGA and RL2 purifications were carried out as described elsewhere (ref. 76). Denatured proteins were incubated overnight at 4 °C with agarose-bound WGA resin (Vector Laboratories, #AL-1023S) or with RL2-conjugated Protein G agarose. The agarose was then washed twice with lysis buffer and the eluted proteins were analyzed by western blot. For enrichment of methylated proteins, anti-Mono-Methyl Arginine (MMA), anti-Symmetric Di-Methyl Arginine (SDMA), and anti-Asymmetric Di-Methyl Arginine (ADMA) antibodies were each used as above. For enrichment of acetylated proteins, anti-Acetyllysine antibody was used as above. Agarose without WGA or IgG-conjugated agarose was used as a control.

Identification of phosphorylation modification

2 × 10⁷ cells were collected for protein extraction and proteins were digested into peptides. Phosphorylated peptides were enriched and identified by mass spectrometry according to ref. 77.

The treatment of β-N-Acetyl-hexosaminidase (β-hex), and periodate oxidized adenosine (AdOx)

The treatment of β-N-Acetyl-hexosaminidase (β-hex, NEB, #P0721S) was performed according to the manufacturer’s instructions, and 4 μg purified CTCF was used. For the treatment of periodate oxidized adenosine (AdOx, TargetMol, #T22231), mESCs were treated with 30 μM AdOx and DMSO for 36 h and harvested for the purification of CTCF.


Co-immunoprecipitation

10 million cells were collected and nuclear extracts were prepared from mESCs as described (ref. 78). Endogenous CTCF was immunoprecipitated with 5 µg of CTCF antibody pre-bound to Protein G agarose, and co-immunoprecipitated OGT was detected by western blot with an anti-OGT antibody. OGT was immunoprecipitated in the same way.

Tissue protein extraction

Tissue protein used in Fig. 2e was generously provided by Dr. Ma from Peking University. The tissue was dissected on ice in PBS/10% FBS. Each tissue was separated and placed into a separate tube using fine-pointed forceps. The dissected tissues were weighed and then washed with PBS on ice. T-PER Reagent (Thermo, #78510) was added to the tissue sample at a ratio of ~20 mL per 1 g of tissue. After homogenization, the tissues were centrifuged for 5 min at 10,000 × g. The supernatant was collected and the protein was quantified with a BCA Protein Assay Kit. For the purification of CTCF, 200 mg of each tissue sample was used for protein extraction and 500 µg of protein was used as total input to purify CTCF.

Western blot

Samples were electrophoresed on SDS-PAGE gels and transferred to PVDF membranes (BIO-RAD, #1620177), and the membranes were blocked in 5% BSA at room temperature. After 1 h, the membranes were incubated with the primary antibody overnight at 4 °C, washed three times, and incubated with peroxidase-labeled secondary antibody for 1 h at room temperature. After three washes with TBST, bands were visualized with ECL substrate (BIO-RAD, #1705061) and imaged with a CCD camera. All uncropped and unprocessed scans of blots are provided in the Source Data file.


Immunofluorescence

Cells were grown on gelatin-coated glass for 24 h and then fixed in 4% paraformaldehyde (PFA, Solarbio, #P1110) for 15 min. After washing with PBS, cells were permeabilized with 0.25% Triton X-100 (AMRESCO, #0694-1L) and blocked with 10% bovine serum albumin. After 1 h, cells were incubated with CTCF antibodies in 3% BSA overnight at 2–8 °C. After three washes with PBS, cells were incubated with secondary antibodies for 1 h. The fixed cells were imaged by confocal microscopy.


ChIP-seq

Cells were crosslinked in 1% formaldehyde (Sigma, #F8775) for 10 min at room temperature and quenched with 125 mM glycine (Sigma, #G7126) for 5 min. The cells were then collected and incubated in lysis buffer I (50 mM HEPES-KOH, pH = 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100, protease inhibitors). After 10 min, the cells were collected, resuspended in lysis buffer II (10 mM Tris-HCl, pH = 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, protease inhibitors), and rotated for 10 min. For sonication, the cells were collected and resuspended in sonication buffer (20 mM Tris-HCl pH = 8.0, 150 mM NaCl, 2 mM EDTA pH = 8.0, 0.1% SDS, 1% Triton X-100, protease inhibitors). Sonicated lysates were cleared once by centrifugation at 16,000 × g for 10 min at 4 °C and the supernatant was transferred to a 15 mL conical tube. Spike-in Drosophila chromatin (Active Motif, #53083) was added to the supernatant and 50 μL of the mixture was saved as input. 10 mg of anti-CTCF (Active Motif, #61311) together with 4 mg of spike-in antibody (Active Motif, #61686) was added along with the beads. The remainder of the mixture was incubated with the antibody-bound magnetic beads overnight at 4 °C to enrich for DNA fragments. The next day, the beads were washed with wash buffer (50 mM HEPES-KOH pH = 7.5, 500 mM LiCl, 1 mM EDTA pH = 8.0, 0.7% Na-Deoxycholate, 1% NP-40) followed by TE buffer (10 mM Tris-HCl pH = 8.0, 1 mM EDTA, 50 mM NaCl). Chromatin was eluted from the beads by incubation at 65 °C for 30 min in elution buffer (50 mM Tris-HCl pH = 8.0, 10 mM EDTA, 1% SDS), and the supernatant was reverse-crosslinked overnight at 65 °C. To purify the eluted DNA, 200 μL of TE was added to dilute the SDS, and 8 μL of 10 mg/mL RNase A (Thermo Fisher, #EN0531) was added to degrade RNA. After 2 h, protein was degraded by adding 4 μL of 20 mg/mL proteinase K (Thermo Fisher, #25530049) and incubating at 55 °C for 2 h. Phenol:chloroform:isoamyl alcohol extraction (G-CLONE, #EX0128) was performed, followed by ethanol precipitation. The DNA pellet was then resuspended in 50 μL of TE. Libraries were prepared with the NEBNext Ultra II DNA library kit (NEB, #E7645). Two biological replicates were performed for each cell line.


ATAC-seq

Cells were collected, washed with PBS, and incubated in lysis buffer (10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 0.5% NP-40) for 10 min at 4 °C. After 5 min of centrifugation, the TruePrep™ DNA Library Prep Kit V2 for Illumina® (Vazyme, #TD501) was used to fragment the DNA. After 30 min, 100 μL of VAHTS DNA Clean Beads (Vazyme, #N411) were added to the sample. The DNA was then collected with a magnet and washed with 80% ethanol. H2O was added to elute the DNA, and library preparation was performed with the TruePrep™ Index Kit V2 for Illumina® (Vazyme, #TD202). Libraries were amplified for 12–15 cycles and size-selected with VAHTS DNA Clean Beads (Vazyme, #N411). Two biological replicates were performed for each cell line.

In situ Hi-C

In situ Hi-C was performed as in Rao et al. (ref. 3) with some modifications. Cells were crosslinked as above, incubated in lysis buffer (10 mM Tris-HCl pH = 8.0, 10 mM NaCl, 0.2% Igepal CA630, protease inhibitor cocktail) for 15 min at 4 °C, and washed twice. Cells were collected and incubated in 50 μL of 0.5% SDS at 65 °C for 8 min. Then 25 μL of 10% Triton X-100 and 145 μL of H2O were added to quench the SDS. To digest chromatin, 25 μL of 10× NEBuffer2 and 20 μL of MboI (New England Biolabs, #R0147) were added and incubated overnight at 37 °C. The next day, restriction fragment ends were biotinylated by supplementing the reaction with 37.5 μL of biotin-14-dATP (Life Technologies, #19524016), 1.5 μL of 10 mM dCTP (Invitrogen, #18253013), 1.5 μL of 10 mM dGTP (Invitrogen, #18254011), 1.5 μL of 10 mM dTTP (Invitrogen, #18255018), and 8 μL of DNA polymerase I, large (Klenow) fragment (New England Biolabs, #M0210), followed by incubation at 37 °C for 4 h. The end-repaired chromatin was added to 663 μL of H2O, 120 μL of NEB T4 ligase buffer, 100 μL of 10% Triton X-100, 12 μL of 10 mg/mL BSA, and 5 μL of T4 DNA ligase (New England Biolabs, #M0202) and incubated for 4 h at room temperature. To reverse crosslinks, 50 μL of 20 mg/mL proteinase K, 120 μL of 10% SDS, and 130 μL of 5 M NaCl were added in that order and incubated at 68 °C overnight. The DNA was precipitated with ethanol and resuspended in 130 μL of Tris buffer (10 mM Tris-HCl, pH = 8.0) for sonication. The volume was then brought to 300 μL. To isolate biotin-labeled ligation junctions, 150 μL of 10 mg/mL Dynabeads MyOne Streptavidin T1 beads (Life Technologies, #65602) were washed with 400 μL of 1× Tween washing buffer (5 mM Tris-HCl pH = 7.5, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween 20), collected with a magnet, resuspended in 300 μL of 2× binding buffer (10 mM Tris-HCl pH = 7.5, 1 mM EDTA, 2 M NaCl), and added to the sample. After 45 min, biotinylated DNA was bound to the beads. To repair ends and remove biotin from unligated ends, 88 μL of 1× NEB T4 DNA ligase buffer, 2 μL of dNTPs, 5 μL of T4 PNK (New England Biolabs, #M0201), 4 μL of T4 DNA polymerase I (New England Biolabs, #M0203), and 1 μL of DNA polymerase I, large (Klenow) fragment were added and incubated for 30 min at room temperature. The beads were washed twice in 1× Tween washing buffer for 2 min at 55 °C. A-tailing was performed by incubation in 90 μL of 1× NEBuffer2, 5 μL of dATP, and 5 μL of Klenow exo at 37 °C for 30 min. The adapter was ligated by incubation in a mixture of 50 μL of 1× Quick Ligation Buffer, 2 μL of Quick Ligase (New England Biolabs, #M2200), and 3 μL of Illumina indexed adapter (New England Biolabs, #E7337) for 15 min at room temperature, followed by addition of 2.5 μL of USER Enzyme. Libraries were prepared with a NEBNext DNA Library Prep Kit, amplified for 10–12 cycles, and size-selected with AMPure XP beads (Beckman Coulter, #A63881). Three biological replicates were performed for each cell line.

RNA isolation and RNA-seq

Cell pellets were homogenized in RNAzol reagent (MRC, #RN190-500) and processed according to the manufacturer’s instructions. Then the total mRNA was extracted for sequencing. Two biological replicates were performed for each cell line.

Flow cytometry analysis

Single-cell suspensions were prepared and fixed. The Foxp3/Transcription Factor Staining Buffer Set (eBioscience, #00-5523-00) was used according to the manufacturer’s instructions to detect the expression of CTCF. For cell cycle analysis, two days after auxin treatment, 1000 WT_CTCF and MUT_CTCF mESCs were seeded into individual wells of a six-well plate. After 4 days, the cells were collected, fixed in 75% ethanol, and washed three times. DAPI (Sigma, #10236276001) was used for staining and cells were analyzed by flow cytometry. Experiments were conducted in three independent triplicates.

Cell viability assay

The cell counting kit 8 (CCK8) (DOJINDO, #CK04) was used according to the manufacturer’s instructions. Experiments were conducted in three independent triplicates.

Colony formation assay

Two days after auxin treatment, 1000 WT_CTCF and MUT_CTCF mESCs were seeded into individual wells of a six-well plate. After 5 days, the colonies were stained for alkaline phosphatase (AP, Yeasen, #40749ES60). Colonies of undifferentiated cells (UD), partially differentiated cells (PD), and differentiated cells (D) in each well were counted. Experiments were conducted in three independent triplicates.

EB differentiation assay

10⁵ WT_CTCF and MUT_CTCF cells were cultured in standard ESC medium without LIF. Images were taken on days 2, 4, 8, and 14.

Prediction of protein–protein interaction

Predicted protein structures of CTCF (Q61164) and OGT (Q8CGY8) were obtained from the AlphaFold Protein Structure Database. Protein–protein interactions were predicted with the ZDOCK Server (ref. 49) and visualized with PyMOL.

Quality control of sequencing reads

All Illumina sequencing reads used in the study were first quality-controlled with Trim Galore. Specifically, we removed bases with quality below 20 and adapter sequences from the 3′ end, and discarded reads shorter than 50 nt.
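
The two thresholds above (quality 20, minimum length 50 nt) can be illustrated with a short Python sketch. It is not Trim Galore itself and it omits adapter removal. The running-sum rule follows the 3′ quality-trimming heuristic documented for cutadapt, the trimmer that Trim Galore wraps; the function names are hypothetical, and the real tool may differ in details:

```python
def quality_trim_3prime(quals, cutoff=20):
    """Return how many 5' bases to keep after trimming low-quality 3' bases.

    Walk from the 3' end, summing (quality - cutoff); cut where the running
    sum is lowest, and stop as soon as the sum turns positive.
    """
    running = 0
    lowest = 0
    keep = len(quals)
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break
        if running < lowest:
            lowest = running
            keep = i
    return keep


def qc_read(seq, quals, cutoff=20, min_len=50):
    """Trim a read; return None if the trimmed read is shorter than min_len."""
    keep = quality_trim_3prime(quals, cutoff)
    return seq[:keep] if keep >= min_len else None


# Toy read: 55 good bases followed by 5 poor ones.
trimmed = qc_read("A" * 60, [30] * 55 + [2] * 5)
print(len(trimmed) if trimmed else "discarded")
```

A read whose good-quality prefix is shorter than 50 nt is discarded rather than kept in truncated form.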

RNA-seq data analysis

RNA-seq reads were aligned to the mm9 reference genome using STAR (ref. 79) with default parameters. Uniquely mapped reads were counted with HTSeq-count. Differentially expressed genes were detected using edgeR (ref. 80). Genes were considered differentially expressed when the p-value was below 0.05 and the fold change was above 2.0. GSEA was performed with GSEA_Linux_4.0.3 (ref. 81).
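
The differential-expression criterion can be written as a simple filter. The Python sketch below assumes that "fold change above 2.0" applies in both directions (|log2 fold change| > 1), which the text does not state explicitly. The function name and the example rows are hypothetical:

```python
import math


def is_differential(p_value, log2_fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """Apply the p-value and fold-change thresholds to one gene."""
    return p_value < p_cutoff and abs(log2_fold_change) > math.log2(fc_cutoff)


# Hypothetical rows: (gene, p-value, log2 fold change).
results = [
    ("gene_1", 0.001, 1.8),   # passes both thresholds
    ("gene_2", 0.030, -1.2),  # passes (down-regulated)
    ("gene_3", 0.004, 0.4),   # fold change too small
    ("gene_4", 0.200, 2.5),   # p-value too large
]
hits = [gene for gene, p, lfc in results if is_differential(p, lfc)]
print(hits)
```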

ChIP-seq and ATAC-seq data analysis

ChIP-seq and ATAC-seq reads were aligned to the mm9 reference genome using Bowtie2 (ref. 82) with default parameters, followed by removal of multi-mapped reads and PCR duplicates with SAMtools. BigWig alignment tracks and ChIP-seq and ATAC-seq profile plots were generated with deepTools (ref. 83).

For spike-in normalization, ChIP-seq data were also aligned to the BDGP6 reference genome, followed by removal of multi-mapped reads. Spike-in reads were counted and scale factors between samples were determined for downstream analyses. For differential binding analysis, we first called CTCF peaks with the macs2 peak-calling pipeline, multiplying by the scale factors to generate bedGraph files. We then concatenated the WT_CTCF and MUT_CTCF peaks to obtain the full set of CTCF peaks and used bedtools multicov to count reads on all CTCF peaks. Read counts were then normalized by the scale factors, and the R package DESeq2 was used for differential analysis.
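
The text does not give the formula for the scale factors. The Python sketch below assumes one common convention: each sample is scaled by the smallest spike-in read count among the samples divided by its own spike-in read count, so that every sample is brought to the same effective amount of Drosophila signal. The function names and read counts are hypothetical:

```python
def spike_in_scale_factors(spike_in_reads):
    """Map each sample to (smallest spike-in count) / (its spike-in count)."""
    reference = min(spike_in_reads.values())
    return {sample: reference / reads for sample, reads in spike_in_reads.items()}


def scale_counts(peak_counts, factor):
    """Multiply per-peak read counts by a sample's scale factor."""
    return [count * factor for count in peak_counts]


# Hypothetical numbers of reads mapped to the Drosophila genome.
spike_in_reads = {"WT_CTCF": 2_000_000, "MUT_CTCF": 1_000_000}
factors = spike_in_scale_factors(spike_in_reads)
print(factors)
print(scale_counts([100, 40], factors["WT_CTCF"]))
```

Under this convention the sample with the fewest spike-in reads keeps a factor of 1 and all other samples are scaled down.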

In situ Hi-C data analysis

Hi-C reads were processed using the HiC-Pro (ref. 84) pipeline: reads were aligned to the mm9 reference genome using Bowtie2, reads with mapping quality >10 were assigned to MboI restriction fragments, and interaction pairs were reconstructed. Singleton and multi-hit pairs were filtered out, followed by removal of failed ligation products (dangling-end pairs, re-ligation pairs, self-circle pairs) and pairs for which the ligation product could not be reconstructed. The remaining pairs were then de-duplicated and used to build contact matrices. Contact matrices were normalized by iterative correction and eigenvector decomposition (ICE).

Compartments were identified with CscoreTool (ref. 85) on 500 kb-resolution contact matrices. Loop domains were identified using the method proposed by Rao et al. (ref. 86). Loops were annotated with HiCCUPS using default parameters (-r 5000,10000,25000 -f 0.1). Differential loops were identified with diffloop (ref. 87). CTCF-related loops were defined as loops with CTCF ChIP-seq peaks located on both anchors. Aggregate peak analysis of loops was performed with GENOVA (ref. 88). The correlations of contact matrices were analyzed with HiCRep (ref. 89).
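
The definition of CTCF-related loops (a CTCF peak on both anchors) reduces to an interval-overlap test. The Python sketch below assumes half-open coordinates and treats any overlap of at least 1 bp as a hit; the text does not specify either choice. The function names and coordinates are hypothetical:

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def anchor_has_peak(anchor, peaks):
    """True if any peak on the same chromosome overlaps the anchor."""
    chrom, start, end = anchor
    return any(c == chrom and overlaps(start, end, s, e) for c, s, e in peaks)


def ctcf_related_loops(loops, peaks):
    """Keep loops whose two anchors each overlap at least one CTCF peak."""
    return [
        (left, right)
        for left, right in loops
        if anchor_has_peak(left, peaks) and anchor_has_peak(right, peaks)
    ]


# Hypothetical peaks and loops, for illustration only.
peaks = [("chr1", 10_200, 10_400), ("chr1", 95_100, 95_300), ("chr2", 5_000, 5_200)]
loops = [
    (("chr1", 10_000, 15_000), ("chr1", 95_000, 100_000)),   # peak on both anchors
    (("chr1", 10_000, 15_000), ("chr1", 200_000, 205_000)),  # peak on one anchor only
]
print(ctcf_related_loops(loops, peaks))
```

For genome-scale peak sets, an interval tree or a BEDTools intersection would replace the linear scan used here.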

Definition of regulatory regions

Enhancers were defined as H3K27ac peaks that did not overlap with a promoter. Insulators were defined according to ref. 53 as the subset of CTCF ChIP-seq peaks that overlapped SMC1 ChIA-PET anchors.

Genomic feature analysis

BEDTools (ref. 90) was used for genomic interval analyses, including intersecting, expanding, flanking, and randomly shuffling intervals.

Gene ontology analysis

Gene symbols were first converted to Entrez IDs with the R package biomaRt (version 2.42.0) (ref. 91). Gene ontology analysis was based on ref. 92.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
