Single-cell transcriptome analysis reveals CD34 as a marker of human sinoatrial node pacemaker cardiomyocytes

Single-cell transcriptomic analysis of the human fetal SAN

To assess the molecular heterogeneity of the cells in the developing human SAN, we dissected the SAN tissue of a fetal heart (gestation week 19) and performed snRNA-seq (Fig. 1a). Unsupervised clustering identified 19 cell clusters representing cell types expected to be present in the heart, including cardiomyocytes as well as non-cardiomyocytes such as fibroblasts, smooth muscle, endothelial, endocardial, epicardial cells, neurons, and macrophages9,19 (Fig. 1b, Supplementary Table 1). To obtain a better resolution of the cardiomyocyte subtypes, we performed subcluster analysis on the cardiomyocyte cell clusters. This resulted in eight cell clusters that expressed high levels of TNNT2, indicative of cardiomyocytes (Fig. 1b, c). To identify the cardiomyocytes representing SAN pacemaker cells, we analyzed the expression of established SAN pacemaker genes (TBX3, TBX18, SHOX2, ISL1, HCN4)2,3,4,5. We detected one cell cluster expressing high levels of all these genes that we annotated as Core SAN cells. We identified a second cell cluster expressing these SAN genes but with lower expression levels of TBX3 and ISL1. This expression pattern is consistent with a Sinus Venosus cardiomyocyte phenotype, a cardiomyocyte population developing together with the Core SAN from the TBX18+ mesenchyme at the venous pole of the heart2,20,21,22. Of note, most of the Core SAN and Sinus Venosus cells did not express the cardiac transcription factor NKX2–5, which is a well-established characteristic of SAN pacemaker cells and the Sinus Venosus2,3. In addition, three cell clusters representing Atrial cells were identified based on the expression of NKX2–5, the atrial marker NPPA, and SCN5A4,19. As expected, we also identified a cluster of cells expressing both atrial and pacemaker genes, representing the myocytes of the Transition Zone between the pacemaker and atrial tissues6,7,9. Finally, two clusters of proliferating MKI67-expressing cells were detected (Supplementary Fig. 1 a-c). Differential gene expression analysis showed that the Transition Zone cells were marked by the expression of LRRC4C (netrin-G1 ligand), MYO16 (myosin 16), and SLC24A3 (sodium/potassium/calcium exchanger 3), while Sinus Venosus cells were marked by the expression of the long non-coding RNA LINC01880, PDEA1 (phosphodiesterase 1A), and CDH13 (cadherin 13) (Fig. 1d, Supplementary Data 1). We next focused our analysis on the Core SAN cells to identify markers of this population. The Core SAN cells expressed the highest levels of the established pan-pacemaker marker TBX32. In addition, the expression of other well-known pacemaker genes (TBX18, SHOX2, ISL1, RGS6, VSNL1, CACNA1D, HCN1, HCN4) was significantly enriched in this cell cluster, confirming a pacemaker phenotype4,8,9,22. Gene ontology (GO) analysis for biological processes of the genes enriched in the Core SAN cells resulted in terms including regulation of heart rate, cardiac pacemaker cell differentiation, sinoatrial node development, and regulation of SAN cell action potential, further supporting the SAN pacemaker phenotype of these cells (Fig. 1e). Importantly, the top ten differentially expressed genes (DEGs) within the Core SAN cluster contained several genes not previously reported to be involved in SAN pacemaker cell function or development, including TENM2 (teneurin transmembrane protein 2), PPFIA2 (protein tyrosine phosphatase receptor type F polypeptide interacting protein alpha 2), CNTN4/5 (contactin 4/5) PRKG1 (protein kinase cGMP-dependent 1), SLIT2 (slit guidance ligand 2), DGKB (diacylglycerol kinase beta), and KCNIP4 (potassium voltage-gated channel interacting protein 4) (Fig. 1 d, f). Furthermore, the guanylate kinase-associated protein DLGAP1, recently described as a SAN-specific gene in the mouse, was also contained in our top ten gene list10.

Fig. 1: Single-nuclei RNA sequencing of fetal SAN tissue identifies markers of Core SAN pacemaker cells.
figure 1

a Schematic overview of SAN tissue dissection and processing for snRNA-seq on the 10x Genomics Chromium platform. b Uniform manifold approximation and projection (UMAP) of gestation week 19 fetal heart SAN tissue sample showing 19 cell clusters (left). Subclustering of the TNNT2+ cardiomyocytes showing 8 sub-clusters (right). c UMAPs of the subclustered cardiomyocytes showing the expression of the indicated genes. d Heatmap of the top 5 differentially expressed genes (DEGs) within the indicated cardiomyocyte subclusters. e Gene Ontology (GO) analysis (biological processes) of all genes enriched in the Core SAN cluster. f UMAPs showing the top 10 DEGs of the core SAN cluster. Schematics in (a) were generated using Biorender (https://biorender.com).

To get a better understanding of the potential functional differences between the three pacemaker cell clusters we compared their expression of key cardiac ion channels (Supplementary Fig. 1d and Supplementary Data 2, 3). For most of the ion channels, expression was comparable between Core SAN and Sinus Venosus cardiomyocytes except for the working cardiomyocyte genes SCN5A, CACNA1C, GJA5, ITPR1, and RYR2 that were higher expressed by Sinus Venosus cells. Comparison between Core SAN and Transition Zone myocytes showed lower expression of pacemaker ion channels such as HCN1, HCN4, and CACNA1D and higher expression working cardiomyocyte genes such as SCN5A, CACNA1C, GJA5, GJA1 in Transition Zone cells. This agrees with the notion that Transition Zone cells have a mixed phenotype between atrial and pacemaker cells.

SANLPCs transcriptionally resemble fetal SAN cells

We next wanted to compare the expression profiles of the fetal SAN cells to hPSC-derived SANLPCs generated using our previously reported protocol14 (Fig. 2a). This protocol generates a mixed population, containing both NKX2–5 SANLPCs (50%) and NKX2–5+ cardiomyocytes (30%), that we used for scRNA-seq analysis. As shown in Fig. 2b, 11 cell clusters were identified, including fibroblasts, epithelial, and epicardial cells, as well as cardiomyocytes. To specifically compare the expression between the cardiomyocyte subtypes, we subclustered the TNNT2+ cardiomyocytes. Analysis of the expression of established SAN pacemaker markers TBX3, TBX18, SHOX2, ISL1, and HCN42,3,4,5 identified a Core SAN cluster that was mostly negative for NKX2–5 expression (Fig. 2c). Like in the fetal SAN tissue, a second NKX2–5 cardiomyocyte cluster was identified that expressed all the pacemaker markers, but lower levels of TBX3, that we annotated as Sinus Venosus cells. Analysis of expression of the atrial marker NPPA together with working cardiomyocyte markers NKX2–5 and SCN5A identified a cluster containing Atrial cardiomyocytes. The hPSC cultures also contained Transition Zone cells that expressed both pacemaker and atrial genes. In addition, two clusters of proliferating MKI67-expressing cardiomyocytes were identified (Supplementary Fig. 2a–c). To further confirm the identity of the hPSC-derived cardiomyocyte clusters, we scored the cells in the hPSC-derived dataset with the DEG lists of the fetal tissue dataset23 (Fig. 2d). Applying the top 200 DEGs from the fetal Core SAN cluster identified the Core SAN cluster in the hPSC dataset. Using the same approach, the identities of the Sinus Venosus, Transition Zone, and Atrial cell clusters were confirmed. These findings indicate that the NKX2–5+ cells within the hPSC-derived population represent Atrial and Transition Zone cells while the NKX2–5 cells represent Core SAN and Sinus Venosus cells.

Fig. 2: Single-cell RNA sequencing of hPSC-derived SANLPCs reveals transcriptomic similarities to fetal SAN pacemaker cells.
figure 2

a Schematic overview of SANLPC differentiation protocol and sample processing for scRNA-seq on the 10x Genomics Chromium platform. b UMAP of day 25 HES2-derived SANLPCs showing 11 cell clusters (left). Subclustering of the TNNT2+ cardiomyocytes showing 6 sub-clusters (right). c UMAPs of the subclustered cardiomyocytes showing the expression of the indicated genes. d UMAPs showing signature score distribution for the DEGs of the indicated fetal SAN cell types. e Heatmap of the top 5 DEGs within the indicated cardiomyocyte subclusters. f GO analysis (biological processes) of all genes enriched in the Core SAN cluster. g UMAPs showing the top 10 DEGs of the core SAN cluster. h UMAPs showing Harmony integration of the fetal SAN snRNA-seq data and the hPSC scRNA-seq data labeled by source dataset (left), cluster number (center), and assigned cell types (right). i Stacked bar graph showing the frequency of each cell type from both fetal and hPSC datasets in the indicated cell clusters. j UMAPs of integrated fetal SAN and hPSC datasets showing the expression of the indicated genes. k Spearman correlation between selected clusters from the fetal SAN and hPSC datasets. p < 0.05 for all correlations (asymptotic t approximation). Schematics in (a) were generated using Biorender (https://biorender.com).

Differential gene expression analysis identified expression patterns that distinguish the different hPSC-derived cardiomyocyte subtypes. Transition Zone cells were marked by the expression of ANKRD1 (ankyrin repeat domain 1, cardiac muscle), CSRP3 (cysteine and glycine-rich protein 3, cardiac LIM protein), and MYL3 (myosin light chain 3), while Sinus Venosus cells were marked by the expression of TRH (thyrotropin-releasing hormone), CKM (creatine kinase, muscle) and DES (desmin) (Fig. 2e, Supplementary Data 4). The Core SAN cells were enriched in the expression of previously established conduction system and SAN pacemaker markers (TBX3, VSNL1, LBH, CPNE5)4,8,9,24. GO term analysis identified an enrichment of genes involved in regulation of heart contraction, cardiac conduction system development, and sinoatrial node development, further supporting the SAN pacemaker phenotype of the cells contained in the Core SAN cluster (Fig. 2f). Interestingly, the top ten DEGs of the Core SAN cells contained genes not previously reported to be specifically expressed in SAN pacemaker cardiomyocytes including ACTA2 (actin alpha 2, smooth muscle), ALDH1A1 (aldehyde dehydrogenase 1a1), HAMP (hepcidin antimicrobial peptide), MYLK (myosin light chain kinase, smooth muscle), and ALDOC (aldolase c, brain-type) (Fig. 2g). The list also contained the C-X-C motif chemokine ligand CXCL12, the smooth muscle myosin heavy chain isoform MYH11, the BMP signaling ligand BMP4, and the annexin family member ANXA2 which have been previously detected in transcriptomic analysis of mouse and human SAN cells4,8,12,22.

To further compare the hPSC-derived cells and the fetal SAN cells, we used Harmony integration and combined the two datasets (Fig. 2h–j, Supplementary Fig. 2d–g)25. This analysis showed that the hPSC-derived cardiomyocytes clustered together with the fetal cardiomyocytes. Within the integrated cardiomyocytes we identified clusters of Core SAN, Sinus Venosus, Transition Zone, and Atrial cardiomyocytes as described in the individual datasets. Importantly, the Core SAN cluster contained both hPSC-derived and fetal tissue-derived Core SAN cells (Fig. 2i, Supplementary Fig. 2e). The vast majority of the Core SAN cells from the hPSC and fetal datasets (74% and 94%) clustered together in this merged Core SAN cluster. The Sinus Venosus, Transition Zone, and Atrial clusters similarly contained both hPSC-derived and fetal-derived cells of the respective cardiomyocyte subtype. However, along with the matching subtypes, a more heterogenous contribution of other cardiomyocyte subtypes was detected in these clusters. Spearman correlation analysis further demonstrated that the hPSC-derived Core SAN cells are most similar to the cells of the fetal Core SAN (Fig. 2k, Supplementary Fig. 2g). Taken together, this molecular analysis indicates that the hPSC-derived Core SAN pacemaker cells closely resemble the Core SAN pacemaker cells found in the developing human heart.

Identification of shared Core SAN pacemaker markers

To identify a list of marker genes that can be used to identify SAN pacemaker cells in vitro and in vivo, we next compared the DEGs of the Core SAN cells from the hPSC and fetal datasets. This analysis identified 36 genes that were significantly enriched in both populations (Fig. 3a–d, Supplementary Fig. 3a–c). Signature scoring analysis showed that these 36 genes clearly identify the Core SAN cell clusters in the hPSC and fetal data. The gene list contains the established cardiac conduction system and SAN markers TBX3, TBX5, VSNL1, LBH, CPNE2,4,8,9,24 as well as GNAO, a subunit of the G-protein signal-transducing complex, that has been recently described to be expressed in hPSC-derived SAN-like cells22. In addition, this shared list contains a number of genes that have not been previously associated with human SAN function or development including; (i) contractile apparatus and cytoskeletal genes: MYH11 (myosin heavy chain 11), ACTB (actin beta), WIPF3 (WAS/WASL interacting protein family member 3); (ii) genes involved in BMP signaling: BMP4 (bone morphogenetic protein 4), BMP5 (bone morphogenetic protein 5), ID3 (inhibitor of DNA binding 3); (iii) genes encoding for calcium binding proteins involved in cellular signaling: ANXA2 (annexin 2), NECAB1 (N-terminal EF-hand calcium binding protein 1), SPOCK1 (proteoglycan 1); (iv) genes encoding for regulators of transmembrane calcium ion-currents: RRAD (Ras related glycolysis inhibitor and calcium channel regulator), PRKG1 (protein kinase cGMP dependent 1); (v) and genes classically known to be expressed in neurons and glia cells that play a role in axon growth and guidance: BASP1 (brain abundant membrane attached signal protein 1), PPFIA2 (protein tyrosine phosphatase receptor type F polypeptide interacting protein alpha 2), EFR3B (EFR3 homolog B), SLIT2 (slit guidance ligand 2), LYPD6 (LY6/PLAUR domain containing 6). Importantly, the list also contains five genes encoding membrane-spanning proteins that could serve as SAN pacemaker cell surface markers that have not been previously associated with SAN cells: CD34 (CD34 antigen), CADM1 (cell adhesion molecule 1), EFNB2 (ephrin B2), NTRK2 (neurotrophic receptor tyrosine kinase 2), and ELAPOR2 (endosome-lysosome associated apoptosis and autophagy regulator family member 2).

Fig. 3: Comparison of fetal and hPSC-derived SAN cells identifies shared Core SAN markers.
figure 3

a Venn diagram of the Core SAN markers identified in the fetal and hPSC datasets. b UMAPs of TNNT2+ cardiomyocytes showing the assigned cell types (top row) and the signature score distribution (bottom row) of the shared 36 Core SAN marker genes in the fetal (left) and hPSC datasets (right). c, d Dot plots showing the expression of the conserved core SAN markers in the fetal (c) and hPSC (d) datasets. *indicates genes encoding for membrane-spanning proteins. eh Immunofluorescent staining of day 25 hPSC-derived SANLPCs, atrial-like cardiomyocytes (ALCMs), and ventricular-like cardiomyocytes (VLCMs) for NKX2–5 and SAN pacemaker transcription factor SHOX2, ventricular contractile apparatus protein MLC2V and pacemaker ion channel HCN4, NKX2–5:GFP and atrial protein NPPA (e), Core SAN marker MYH11 (f), Core SAN marker BMP4 (g), and Core SAN marker CD34 (h). Cells were counterstained with cTNT to identify cardiomyocytes and Hoechst to visualize all cells (n = 3) independent differentiations. Images were denoised using nikons denoise.ai. Scale bars, 100 μm.

To assess whether the 36 Core SAN marker genes that we identified in fetal/hPSC-derived SAN pacemaker cells are maintained during maturation, we analyzed their expression in the recently published snRNA-seq data of the adult human SAN from Kanemaru et al.12. Subclustering of the SAN cell cluster identified Atrial and Transition Zone cardiomyocytes similar to our fetal dataset. (Supplementary Fig. 3d–f). In addition, two Core SAN cell clusters were identified, based on the expression of high levels of TBX3, SHOX2, HCN4 and very low levels of NKX2–5. These adult Core SAN cells expressed significantly higher levels of 22 of the 36 Core SAN genes compared to adult Atrial cardiomyocytes, suggesting that the majority of the Core SAN markers are maintained during maturation (Supplementary Fig 3g–i, and Supplementary Data 5).

To determine if the 36 Core SAN markers are conserved between species, we compared our list to the previously published scRNA-seq data of the embryonic (E16.5) mouse SAN from Goodyer et al.9,26 (Supplementary Fig. 4a-c). For the mouse dataset, we used the same cluster annotation of SAN cells, Transition zone cells, and Atrial cells as originally published by the authors. As expected, the established SAN markers TBX3, TBX5, VSNL1, LBH, and CPNE5 were conserved between human and mouse. However, only half of the human Core SAN genes from our list were specifically expressed in mouse SAN pacemaker cells. The other half was higher expressed in atrial cardiomyocytes of the mouse or expressed at very low levels. These observations prompted us to assess whether the SAN-specific genes identified in the mouse SAN scRNA-seq dataset are conserved in the human SAN9. Out of the 50 genes that we analyzed, the majority (56%) were either not detected in the human Core SAN cells or expressed at higher levels in human atrial cardiomyocytes. Notably, the list of genes also contained mouse SAN pacemaker cell surface markers26, and only 13 out of 25 were conserved in the human SAN. Taken together, this analysis shows that, while key markers are conserved between species, there are numerous markers that are specific to the human SAN. This highlights the importance of analyzing human cells and tissues.

MYH11, BMP4, and CD34 are SAN pacemaker markers

We selected three Core SAN genes identified in our study to further validate them as SAN markers. We chose the top two differentially expressed genes MYH11 and BMP4, which have previously been detected in transcriptomic studies of SAN pacemaker cells but have not been further validated on the protein level4,8,12,22. In addition, we chose the surface antigen CD34 due to its promising potential application as a SAN cell surface marker and the fact that its expression is maintained in the adult SAN (Supplementary Fig 3h). We first focused on hPSC-derived cardiomyocytes and used previously reported protocols to generate SANLPCs, atrial-like cardiomyocytes (ALCMs), and ventricular-like cardiomyocytes (VLCMs)14,27. RT-qPCR analysis and immunofluorescence staining confirmed the phenotype of the different cardiomyocyte subtypes. As expected, SANLPCs expressed SHOX2 and HCN4, ALCMs expressed NKX2–5 and NPPA, and VLCMs expressed NKX2–5 and MLC2V (Fig. 3e, Supplementary Fig. 5a, b). Importantly, immunostaining for MYH11, BMP4, and CD34 showed that these proteins are specifically expressed by SANLPCs but not ALCMs and VLCMs (Fig. 3f-h). Of note, a nuclear staining for BMP4 was detected in the VLCMs. This is most likely an unspecific signal as RT-qPCRs confirmed significantly lower BMP4 expression in VLCMs compared to SANLPCs (Supplementary Fig. 5c).

To validate the expression of these SAN markers in primary tissue, we dissected the SAN of fetal hearts (gestation week 17–20) and prepared longitudinal sections for immunostaining (Fig. 4a, Supplementary Fig. 6a)28. We used the expression of TBX3, SHOX2, and HCN4 to identify fetal Core SAN pacemaker cells, expression of NPPA to identify Atrial cardiomyocytes, and expression of cTNT to distinguish cardiomyocytes from non-cardiomyocytes (Fig. 4a). Using these markers we also detected TBX3SHOX2+HCN4+NPPA+ cardiomyocytes adjacent to the Core SAN representing Transition Zone cardiomyocytes (Supplementary Fig. 6a, cI & 7a, bI-II). Furthermore, a region of TBX3lowSHOX2+HCN4+NPPA cardiomyocytes was detected cranial to the SAN. These cardiomyocytes did not express NKX2–5 suggesting that they represent the Sinus Venosus cardiomyocytes. Immunostaining for MYH11 labeled the SAN pacemaker cardiomyocytes but did not label Atrial, Transition Zone, or Sinus Venosus cardiomyocytes (Fig. 4bI, Supplementary Fig. 6cII & 7bIII). As expected, MYH11 was also detected in the smooth muscle cells of blood vessels contained in the sections. BMP4 expression was detected in HCN4+cTNT+ SAN pacemaker cells and in some of the HCN4cTNT non-cardiomyocytes of the SAN (Fig. 4bII, Supplementary Fig. 6cIII). BMP4 expression was higher in the SAN cardiomyocytes than in the adjacent Atrial and Transition Zone cardiomyocytes. CD34 specifically labeled cTNT+ SAN cardiomyocytes but not Atrial, Transition Zone, or Sinus Venosus cardiomyocytes (Fig. 4bIII, c, Supplementary Fig. 7bIII, IV). In agreement with our transcriptional analysis most of the SAN cardiomyocytes were CD34+NKX2–5 (white arrowheads, Fig. 4cIV). In addition, we detected a few CD34+NKX2–5+ (green arrowheads) and CD34NKX2–5 (yellow arrowheads) pacemaker cardiomyocytes in the SAN. Of note, CD34 is also expressed by endothelial cells, which most likely explains the detection of some CD34+ cells in the atrial tissue. Co-staining for CD31 and cTNT confirmed that these CD34+ cells are indeed CD31+ endothelial cells and that only SAN cardiomyocytes but not atrial cardiomyocytes stain positive for CD34 (Supplementary Fig. 6b).

Fig. 4: MYH11, BMP4, and CD34 are specifically expressed in human SAN pacemaker cardiomyocytes.
figure 4

Immunofluorescent staining of gestation week 19 fetal human SAN for: (a) pacemaker transcription factor TBX3 and pacemaker ion channel HCN4 (aI), SAN pacemaker transcription factor SHOX2 and atrial protein NPPA (aII); (b) core SAN marker MYH11 and HCN4 (bI), Core SAN marker BMP4 and HCN4 (bII), Core SAN marker CD34 and HCN4 (bIII); (c) CD34 and SHOX2 (cI), CD34 and TBX3 (cII), CD34 and MYH11 (cIII), and CD34 and NKX2–5 (cIV). White dashed line outlines the SAN. Yellow and green boxes indicate location of high magnification insets shown on the right marked with *. White arrows in bI* indicate MYH11+ smooth muscle cells of a blood vessel. Arrowheads in cIV* indicate the following cell types: white, CD34+NKX2–5; green, CD34+NKX2–5+; yellow, CD34NKX2–5. Tissue sections were counterstained with cTNT to identify cardiomyocytes and DAPI to visualize all cells (n = 3 independent stains each of sections from 3 heart samples). Images were denoised using nikons denoise.ai. Scale bars, 500 μm (left) and 50 μm in the high magnification insets (right). RA right atrium.

Our computational analysis suggested that CD34 is specific to human SAN pacemaker cells, because its expression was not detected in mouse SAN pacemaker cells (Supplementary Fig. 4b)9. To further validate this, we performed immunostaining for CD34 on sections of mouse SAN tissue (postnatal days 0–3). These experiments confirmed that SHOX2+TBX3+HCN4+ mouse SAN cardiomyocytes do not express CD34 (Supplementary Fig. 8a, b).

To determine if expression of MYH11, BMP4, and CD34 is specific to the primary pacemaker, the SAN, or whether these proteins are also expressed by the secondary pacemaker, the atrioventricular node (AVN), we analyzed sections of the atrioventricular junctional region of fetal hearts (gestation week 17–20) (Supplementary Fig. 9a). AVN pacemaker cardiomyocytes were identified based on the expression of the AVN markers TBX3, MSX2, and the lack of expression of the SAN marker SHOX2 (Fig. 5aI, IV, Supplementary Fig. 9bI). NPPA and MLC2V expression identified adjacent atrial and ventricular tissues, respectively (Supplementary Fig. 9bI, II). Neither MYH11 nor BMP4 were expressed in AVN cardiomyocytes (Fig. 5aII, III). Similarly, no CD34+cTNT+ cells were found in AVN tissue (Fig. 5aIV, Supplementary Fig. 9bIII). The CD34+cTNT cells that were detected likely represent a CD34+ mesenchymal/fibroblast cell population that has previously been reported in multiple studies19,29,30. Taken together, these findings show that MYH11, BMP4, and CD34 are specifically expressed by the primary pacemaker cardiomyocytes of the SAN, but not by atrial, ventricular, nor AVN cardiomyocytes.

Fig. 5: MYH11, BMP4, and CD34 are not expressed in human AVN cardiomyocytes.
figure 5

Immunofluorescent staining of gestation week 17 fetal human AVN for: AVN transcription factor MSX2 and HCN4 (aI), Core SAN marker MYH11 and HCN4 (aII), Core SAN marker BMP4 and HCN4 (aIII), and Core SAN marker CD34 and TBX3 (aIV). Images represent consecutive sections of AVN tissue. White dashed line outlines the AVN. Yellow box indicates location of high magnification insets shown on the right marked with *. Tissue sections were counterstained with cTNT to identify cardiomyocytes and DAPI to visualize all cells (n = 3 independent stains each of sections from 1 heart sample). Images were denoised using nikons denoise.ai. Scale bars, 500 μm (left) and 50 μm in the high magnification insets (right). IVS interventricular septum; RA right atrium.

Isolation of SANLPCs based on CD34 expression

To determine if CD34 could be used for the isolation of SANLPCs from hPSC differentiation cultures, we first carried out flow cytometric analysis of CD34 expression. The HES3 NKX2–5egfp/w reporter line was used for these experiments which allowed us to identify NKX2–5:GFP cardiomyocytes as SANLPCs and NKX2–5+ cardiomyocytes as VLCMs or ALCMs14,31. Analysis within the fraction of SIRPA+CD90 cardiomyocytes32 at day 25 of differentiation showed that only SANLPCs are marked by CD34 while VLCMs and ALCMs do not express CD34 (Fig. 6a, b, Supplementary Fig. 10a). Within the SANLPC differentiation cultures, the majority of the NKX2–5 cells expressed CD34 (54 ± 4%). Notably, a small fraction of NKX2–5+ cells contained in the SANLPC cultures also expressed CD34 (17 ± 2%), an observation consistent with our findings in the fetal heart tissue sections. To identify the earliest timepoint of CD34 expression in SANLPC progenitors, we performed a time course analysis (Fig. 6c, d). The first SIPRA+CD90 cardiomyocytes detected at day 8 of differentiation did not express CD34. By day 10, a small population of NKX2–5 cardiomyocytes started to express CD34, and by day 16, most of the NKX2–5 cardiomyocytes expressed CD34. The proportion of CD34+NKX2–5 cardiomyocytes continued to increase until day 50 at which point 81 ± 4% of NKX2–5 cardiomyocytes expressed CD34 (Supplementary Fig. 10b, c). The first NKX2–5+CD34+ cardiomyocytes were detected at day 16. The proportion of NKX2–5+CD34+ cardiomyocytes also increased over time, but expression levels of CD34 remained lower than in the NKX2–5 cardiomyocytes at all time points analyzed. Accordingly, analysis at day 50 showed that 64 ± 6% of NKX2–5 cells and 20 ± 3% of NKX2–5+ cells expressed high levels of CD34 (CD34++), similar to the proportions of CD34+ cells observed at day 25. Importantly, day 50 VLCM and ALCM populations remained CD34 negative, confirming that CD34 is exclusively expressed on SAN pacemaker cardiomyocytes (Supplementary Fig. 10d, e).

Fig. 6: FACS and MACS sorting for CD34+ cells enriches for SANLPCs.
figure 6

a Flow cytometric analyses at day 25 of CD34 and NKX2–5:GFP expression in SIRPA+CD90 cardiomyocytes in VLCMs, ALCMs, and SANLPCs. b Bar graphs summarizing the expression of CD34 in myocytes as shown in (a) in the indicated differentiation cultures (left) and within the NKX2–5 and NKX2–5+ fraction of SANLPC differentiation cultures (right) (n = 6 independent differentiations). c Flow cytometric analyses of CD34 and NKX2–5:GFP expression in SIRPA+CD90 cardiomyocytes at indicated time points throughout the differentiation. d Bar graph summarizing the expression of CD34 in myocytes as shown in (c) (n = 6 independent differentiations). e Flow cytometric analyses of CD34 and NKX2–5:GFP expression in SIRPA+CD90 cardiomyocytes before and after FACS at day 25. Teal shading indicates CD34+ and blue shading indicates CD34 FACS sorting gates. f Bar graphs summarizing the proportion of NKX2–5 and NKX2–5+ cells in presort, CD34+, and CD34 FACS sorted samples (n = 7 independent differentiations). g RT-qPCR analysis of the expression of the indicated genes in presort, CD34+, and CD34 FACS sorted samples (n = 7 independent differentiations, n = 4 independent differentiations (genes BMP4)). Values represent expression relative to the housekeeping gene TBP. h Flow cytometric analyses of CD34 and NKX2–5:GFP expression in SIRPA+CD90 cardiomyocytes before and after MACS at day 25. Teal shading indicates CD34+ and blue shading indicates CD34 MACS sorting gates. i Bar graphs summarizing the proportion of NKX2–5 and NKX2–5+ cells in presort, CD34+, and CD34 MACS sorted samples (n = 8 independent differentiations). j RT-qPCR analysis of the expression of the indicated genes in presort, CD34+, and CD34 MACS sorted samples (n = 6 independent differentiations, n = 5 independent differentiation (MYH11, BMP4)). Values represent expression relative to the housekeeping gene TBP. Statistical analysis was performed using two-sided paired t-test when comparing two samples (b, f, i) and one-way ANOVA followed by Bonferroni’s post hoc test when comparing >2 samples (b, g, j). Datasets that failed the normality test ((g): SCN5A, (j): SCN5A) were analyzed using Friedman test with Dunn’s post hoc test: *p < 0.05, **p < 0.01, ***p < 0.001 vs indicated sample. Error bars represent SEM. Violin plot elements: center line, median; lower line, first quartile; upper line, third quartile. Source data are provided as a Source Data file.

To test if it is possible to enrich for NKX2–5 SANLPCs based on CD34 expression, we isolated SIRPA+CD90CD34+ and SIRPA+CD90CD34 cardiomyocytes by fluorescence-activated cell sorting (FACS) and analyzed the populations for NKX2–5:GFP expression (Fig. 6e, f). We performed these experiments at day 25 of differentiation because the majority of NKX2–5 cells expressed CD34 while the majority of NKX2–5+ cells were still CD34 negative at this timepoint. As shown in Fig. 6e and f, the CD34+ sorted population was significantly enriched for NKX2–5 cells and significantly depleted of NKX2–5+ cells compared to pre-sort. The opposite trend was observed for the CD34 sorted population. RT-qPCR analysis showed significantly enriched expression of SAN genes (TBX3, SHOX2, HCN4, MYH11, BMP4) and decreased expression of atrial/working cardiomyocyte genes (SCN5A, NPPA) in the CD34+ sorted cells compared to pre-sort and CD34 sorted cells (Fig. 6g). To determine if the CD34+NKX2–5 phenotype is stable over time, we cultured the cells for an additional 25 days. Flow cytometric analyses at days 1, 10, and 25 post-sort revealed that the cells maintained CD34 expression and did not upregulate NKX2–5 expression (Supplementary Fig. 10f, g). These experiments also showed that a sizeable proportion of CD34 cells upregulated CD34 expression by day 50. To further assess the fate of the NKX2–5CD34 cells specifically, we isolated this population on day 25 and analyzed their phenotype on day 50 (Supplementary Fig 11a, b). 35 ± 2% of these NKX2–5CD34 cells did upregulate CD34 expression by day 50 and showed higher expression of TBX3 compared to the day 25 and day 50 NKX2–5CD34 cells. Taken together, our data suggests that the day 25 NKX2–5CD34 cells represent a progenitor population and that the majority of NKX2–5 SANLPCs do express CD34 upon extended culture.

For disease modeling and cell therapy applications, large cell numbers of SAN pacemaker cells will be required, thus prompting us to test whether magnetic-activated cell sorting (MACS) for CD34+ cells can also enrich for SANLPCs. Interestingly, MACS for CD34+ cells enriched for SIRPA+CD90 cardiomyocytes without the need for additional antibodies (Fig. 6h-j, Supplementary Fig. 11c). Notably, the CD34+ sorted cells were also significantly enriched in NKX2–5 cardiomyocytes and depleted of NKX2–5+ cells compared to pre-sort. Accordingly, RT-qPCR analysis showed increased expression of SAN pacemaker genes in the MACS-sorted CD34+ cells and decreased expression of working cardiomyocyte genes compared to pre-sort and CD34 sorted cells (Fig. 6j). The CD34+ population contained a low frequency (<5%) of CD31+ endothelial cells that are known to contaminate hPSC-derived cardiomyocyte differentiations (Supplementary Fig. 11d, e). These cells were no longer detectable following 5 days of culture, likely because our culture conditions did not support their survival.

To confirm that this CD34-based isolation of NX2–5 SANLPCs can be applied to other hPSC lines, we repeated the experiments using the HES2 hESC line (Supplementary Fig. 12a–f). First, we confirmed that CD34 is specifically expressed by HES2-derived SANLPCs but not by ALCMs and VLCMs. Next, we showed that MACS sorting for CD34+ cells results in a significant enrichment of NKX2–5 cardiomyocytes and depletion of NKX2–5+ cardiomyocytes. In addition, CD34+ sorted cells showed higher expression of SAN pacemaker genes and reduced expression of working cardiomyocyte genes, thus confirming the enrichment for SANLPCs. Taken together, these experiments demonstrate that CD34 can be universally used to identify NKX2–5 SANLPCs and isolate them from cardiac hPSC-differentiation cultures.

Previously, CD166 has been reported to label SAN progenitors in the mouse33. We therefore analyzed the expression of CD166 during SANLPC and VLCM differentiations. We found that both SANLPCs and VLCMs do express CD166 between days 8–25, with higher proportions of CD166+ cells detected in VLCMs (Supplementary Fig. 13a–c). This is in agreement with previous reports showing that CD166 is expressed by hPSC-derived cardiomyocytes34. Furthermore, only a small fraction of CD166+ cardiomyocytes co-expressed CD34 (12 ± 2%) and the majority of CD34+ SANLPCs did not express CD166. Taken together this data suggests that CD166 does not specifically mark human SAN pacemaker cells.

CD34+NKX2–5+ and CD34NKX2–5 cells are SAN pacemaker cells

In addition to the CD34+NKX2–5 SAN pacemaker cells, we detected some CD34+NKX2–5+ and CD34NKX2–5 cardiomyocytes in the fetal SAN (Fig. 4cIV) and hPSC-derived SANLPC cultures (Fig. 6a, b). Furthermore, sorting for CD34+ cells also isolated a small proportion of CD34+NKX2–5+ cells (Fig. 6e–i). To characterize the identity of both of these cardiomyocyte populations in more detail, we performed Cellular Indexing of Transcriptomes and Epitopes by sequencing (CITE-seq) on day 25 of SANLPC differentiation using a barcoded CD34 antibody (Supplementary Fig. 14a, b)35. This enabled us to read out protein-level CD34 expression in parallel to the whole transcriptome of the cells. Unsupervised clustering identified 11 cell clusters with the majority representing cardiomyocytes. Small clusters of fibroblasts, epithelial cells, and epicardial cells were also detected with a similar distribution as in our original hPSC dataset (Fig. 7a). Subclustering of TNNT2+ cardiomyocytes identified the same cardiomyocyte subtypes as before, including NKX2–5TBX3high Core SAN cells, NKX2–5TBX3mid Sinus Venosus cells, NKX2–5+NPPA+ Atrial cells, and Transition Zone cells that express both pacemaker and atrial genes (Fig. 7a, b, Supplementary Data 6). Signature scoring analysis using the top 200 DEGs of the fetal SAN dataset as a reference confirmed the identity of these cardiomyocyte subclusters (Fig. 7c).

Fig. 7: CITE-seq of hPSC-derived SANLPCs shows that CD34+NKX2–5+ and CD34NKX2–5 cardiomyocytes are transcriptionally similar to SAN pacemaker cells.
figure 7

a UMAP of the day 25 HES2-derived SANLPC CITE-seq dataset showing 11 cell clusters (left). Subclustering of the TNNT2+ cardiomyocytes showing 7 sub-clusters (right). b UMAPs of the subclustered cardiomyocytes showing the expression of the indicated genes. c UMAPs showing signature score distribution for the DEGs of the indicated fetal SAN cell types. d UMAPs showing the expression of CD34 on the transcript (RNA) or protein level (antibody-derived tag (ADT)) (left) and co-expression with NKX2–5 transcript (right). e Stacked bar graph quantifying the number of cells expressing CD34 by transcript (RNA) or protein (ADT) grouped based on NKX2–5 expression by transcript. f UMAPs showing the distribution of the selected cell populations indicated above. g Box plots showing the expression of the indicated genes on the transcript level and CD34 on the protein level (ADT) in the selected cell populations indicated below (CD34+NKX2–5+ n = 78; CD34+NKX2–5 n = 1465; CD34NKX2–5 n = 2294; Atrial CD34NKX2–5+ n = _97 single cells). h Spearman correlation between the selected cell populations indicated on the axes. p < 0.05 for all correlations (asymptotic t approximation). Blue labels indicate the data that is showing protein level (ADT)-based CD34 expression. Box plot elements: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range. Source data are provided as a Source Data file.

We next analyzed the expression of CD34 at the transcript (RNA) and protein (antibody-derived tag, ADT) levels. A larger number of CD34+ cells was detected by ADT counts compared to transcript counts (1552 vs. 387) (Fig. 7d, e). Accordingly, we also identified a larger number of CD34+NKX2–5+ cells (85 vs. 24) when using the antibody-based detection of CD34+ cells. To assess the identity of the different cell populations, we isolated CD34+NKX2–5+, CD34NKX2–5, and CD34+NKX2–5 cells based on their expression of CD34 (ADT) and NKX2–5 (RNA) (Fig. 7f, Supplementary Fig. 14c, d). The majority of the CD34+NKX2–5 cells (62%) segregated into the Core SAN cell cluster. The distribution of NKX2–5+CD34+ and CD34NKX2–5 cells was more scattered with about half of these cells (47% and 46% respectively) segregating into the Core SAN cell cluster.

To further characterize the CD34+NKX2–5+ and CD34NKX2–5 cells we compared the expression of typical pacemaker and working myocyte markers in these cells with the CD34+NKX2–5 SAN pacemaker cells. We included CD34NKX2–5+ myocytes from the Atrial cluster as an additional expression reference in this analysis. Both the CD34+NKX2–5+ and CD34NKX2–5 cells expressed comparable levels of the SAN pacemaker genes TBX18, SHOX2, ISL1, HCN4, MYH11 and BMP4 like the CD34+NKX2–5 cells (Fig. 7g). Lower expression of these SAN genes was detected in the Atrial myocytes. Interestingly, expression of the pan-pacemaker marker TBX3 was high in the CD34+NKX2–5+ and CD34+NKX2–5cells, but lower in the CD34NKX2–5 cells. Expression of the atrial genes NPPA, SCN5A, GJA5, and KCNA5 was comparable between the CD34+NKX2–5+, CD34NKX2–5, and CD34+NKX2–5 cells and lower than in the Atrial myocytes. In addition, we performed Spearman correlation analysis using the DEGs of each of these cell populations, which showed that the CD34+NKX2–5+ and CD34NKX2–5 cells most closely correlate with the CD34+NKX2–5 SAN pacemaker cells but not with the CD34NKX2–5+ Atrial myocytes (Fig. 7h).

Collectively, these data indicate that the CD34+NKX2–5+ and CD34NKX2–5 cells are transcriptionally comparable to the CD34+NKX2–5 SAN pacemaker cells and have a SAN pacemaker phenotype. The lower expression of TBX3 in the CD34NKX2–5 cells suggests that they might be a SAN pacemaker progenitor population. This is consistent with the observation that the CD34 sorted NKX2–5 cardiomyocytes upregulate CD34 and TBX3 expression upon extended culture (Supplementary Fig. 10f, g and Supplementary Fig. 11a, b).

CD34+ cardiomyocytes have a functional pacemaker phenotype

To assess if the hPSC-derived CD34+ cardiomyocytes also show the electrophysiological characteristics of pacemaker cells we recorded optical action potentials (Fig. 8a). SIRPA+CD90NKX2–5 FACS sorted SANLPCs, which we have previously shown to functionally represent pacemaker cells, were included as positive control14. In addition, ALCMs and VLCMs were included as reference cell types. We recorded fast spontaneous beating rates in MACS-sorted CD34+ cells and SANLPC controls that were significantly faster than beating rates of ALCMs and VLCMs14,17 (Fig. 8b). As expected for pacemaker cells, CD34+ cells and SANLPC controls displayed action potentials with significantly slower maximum upstroke velocities than ALCMs and VLCMs14,36. Additional parameters that distinguish pacemaker cells from working cardiomyocytes are their action potential durations14,27,36. Both CD34+ cells and SANLPC controls showed significantly shorter action potential durations at 30% (APD30) and 90% (APD90) repolarization compared to VLCMs. The APD30/90 ratios for CD34+ cells and SANLPC controls were significantly larger compared to ALCMs which clearly distinguishes them from ALCMs. Finally, we analyzed the diastolic depolarization during phase 4 of the action potential, which is driven by funny currents specifically present in pacemaker cells36,37. Accordingly, we observed pronounced diastolic depolarization in CD34+ cells and SANLPC controls that was significantly faster than in ALCMs and VLCMs. Of note, both CD34+ and CD34 cells, that were isolated from the SANLPC cultures, displayed pacemaker-like action potentials. This could be due to the fact that the samples were MACS sorted on day 25 when the CD34 population still contained a sizable fraction of NKX2–5 cardiomyocytes (41 ± 3%, Fig. 6h, i), in addition to the NKX2–5+ Atrial and Transition Zone cells. This is consistent with the idea that the CD34NKX2–5 cardiomyocytes represent an immature pacemaker cell population that upregulates CD34 expression upon extended culture.

Fig. 8: CD34+ sorted cells have a functional pacemaker phenotype.
figure 8

a Representative optical action potential traces of day 25 MACS sorted CD34+ and CD34 cells, FACS sorted SIRPA+CD90NKX2–5 SANLPCs, as well as ALCMs and VLCMs generated from the HES3 NKX2–5eGFP/w hPSC line. b Optical action potential parameters analyzed in the indicated cell types (MACS pre-sort, n = 46; MACS CD34+, n = 47; MACS CD34, n = 43; SANLPC, n = 57; ALCM, n = 79; VLCM, n = 51 cell aggregates from three biological replicates (independent differentiations) for each cell type). Statistical analysis was performed using Kruskal-Wallis test followed by Dunn’s post hoc test: *p < 0.05, **p < 0.01, ***p < 0.001 vs. indicated sample. c Schematic overview of the experimental design used to assess the pacemaker capacity of the CD34+ cardiomyocytes in vitro. d Representative bright field images (left) and color-coded activation maps (right) of a VLCM monolayer before and ~4 weeks after the addition of a VLCM control aggregate (n = 12 independent experiments from three biological replicates (independent differentiations)). e Representative bright field images (left) and color-coded activation maps (right) of a VLCM monolayer before and ~4 weeks after the addition of a MACS-sorted CD34+ cardiomyocyte (CM) aggregate (n = 10 independent experiments from three biological replicates (independent differentiations)). Scale bars, 1000 μm. Violin plot elements: center line, median; lower line, first quartile; upper line, third quartile. APD30/90, Action potential duration at 30%/90% repolarization. Source data are provided as a Source Data file. Schematics in (c) were generated using Biorender (https://biorender.com).

To further confirm the functional pacemaker phenotype of the CD34+ cells, we tested their capacity to pace other cardiomyocytes. For these experiments, monolayers of VLCMs were prepared (Fig. 8c). Once the VLCMs had formed an electrical syncytium, aggregates of CD34+ MACS sorted cardiomyocytes or control VLCMs were added. Using optical mapping, we observed that the CD34+ cell aggregate initiated the electrical activation and thereby paced the monolayers in 10 out of 14 of these co-cultures (Fig. 8d, e). In contrast, only 1 out of 13 VLCM aggregates was the source of electrical activation (P = 0.0013 Fisher’s exact test, CD34+ cardiomyocyte aggregates vs VLCM aggregates). In summary, our functional analysis clearly distinguished CD34+ cardiomyocytes from ALCMs and VLCMs and showed that they have a functional pacemaker phenotype comparable to NKX2–5 SANLPCs.

CD34 does not impact SANLPC differentiation and function

To assess whether CD34 is just a marker or whether it plays a role in SAN pacemaker differentiation and function, we used CRISPR-Cas9 to generate two pools of CD34 knockout (CD34 KO) hPSCs from the HES3 NKX2–5egfp/w reporter cell line (Supplementary Fig. 15a–d). To control for any potential gene editing-related effects, we generated control HES3 NKX2–5egfp/w hPSCs with edits in the AAVS1 safe harbor locus. CD34 knockout efficiency was assessed to be 96% (CD34 KO 1) and 24% (CD34 KO 2) using endothelial cell differentiations (Supplementary Fig. 15e). When we applied the SANLPC differentiation protocol, we observed a significant reduction in CD34+ myocytes for CD34 KO 1 but not CD34 KO 2, most likely due to the lower knockout efficiency of CD34 KO 2 (Fig. 9a, b, Supplementary Fig. 15f). Importantly, the proportion of NKX2–5 and NKX2–5+ cardiomyocytes was unchanged in both CD34 KO pools compared to AAVS1 Control-derived SANLPC differentiations. Further experiments were performed using the CD34 KO 1 hPSCs due to their superior knockout efficiency. RT-qPCR analysis on SIRPA+CD90NKX2–5 sorted cardiomyocytes showed that the expression of key pacemaker genes (TBX3, SHOX2, HCN4, MYH11, BMP4) and atrial genes (SCN5A, NPPA) was unchanged in CD34 KO 1 cells (Fig. 9c). Functional assessment using optical action potential recordings showed a small decrease in action potential durations in CD34 KO 1 cells but no significant changes in key pacemaker characteristics including beating rates, maximum upstroke velocity, and diastolic depolarization compared to AAVS1 Control cells (Fig. 9d). The reduced action potential durations are most likely caused by the small, but not significant, increase in beating rates in the CD34 KO 1 cells and the fact that a faster beating rate results in shorting of the action potential38. Taken together, these data support the notion that CD34 marks SAN pacemaker cells but does not play a role in SAN pacemaker lineage specification and function.

Fig. 9: CD34 knockout does not impact SAN pacemaker differentiation and function.
figure 9

a Flow cytometric analyses at day 25 of CD34 and NKX2–5:GFP expression in SIRPA+CD90 cardiomyocytes in AAVS1 Control, CD34 KO 1, and CD34 KO 2 SANLPC differentiations. b Bar graphs summarizing the percentage of NKX2–5 myocytes (left) and NKX2–5+ myocytes (right) in the indicated differentiation cultures (n = 3 independent differentiations). c RT-qPCR analysis of the expression of the indicated genes in SIRPA+CD90NKX2–5 FACS sorted cells from day 25 AAVS1 Control and CD34 KO 1 differentiations (n = 3 independent differentiations). Statistical analysis was performed using two-sided unpaired t-test. Datasets that failed the normality test (MYH11) were analyzed using two-sided Mann-Whitney test: *p < 0.05 vs indicated sample. d Optical action potential parameters recorded in day 25 FACS sorted SIRPA+CD90NKX25 cardiomyocytes from AAVS1 Control and CD34 KO 1 differentiations (AAVS1 Control, n = 48 (n1 = 16, n2 = 16, n3 = 16) and CD34 KO 1, n = 38 (n1 = 12, n2 = 13, n3 = 13) cell aggregates from three biological replicates (independent differentiations)). Statistical analysis was performed using two-sided unpaired t-test. Datasets that failed the normality test (APD90) were analyzed using two-sided Mann-Whitney test: *p < 0.05, **p < 0.01 vs indicated sample. Error bars represent SEM. Violin plot elements: center line, median; lower line, first quartile; upper line, third quartile. APD30/90, Action potential duration at 30%/90% repolarization, KO knockout. Source data are provided as a Source Data file.