Single-cell RNA sequencing reveals key regulators and differentiation trajectory of iPSC-derived cardiomyocytes

Single-cell RNA sequencing analysis of cardiac directed differentiation

To further understand the genetic regulation of cardiovascular development, we reanalyzed the scRNA-seq data at four time points during the differentiation of iPSCs into cardiomyocytes11. The data were derived from single-cell transcriptional profiling of human iPSCs induced using a chemically defined cardiac differentiation kit (Cellapy, CA2004500, China). Two human iPSC lines (CA4024106, cell line 1 M-iPS; CA4027106, cell line 2 F-iPS) derived from healthy males and females using the Cytotune-iPS 2.0 KOSM transgenic system were used as parental cell lines for the study11. Cells were collected at days 0, 2, 4, and 10 of differentiation for large-scale scRNA-seq (10x genomics) (Fig. 1a). The Seurat R package was used for log-normalization and data analysis. Cells with a significantly abnormal number of genes (> 8000) were considered potential multiplets and were excluded from subsequent analysis. Similarly, cells with a percentage of mitochondrial genes ≥ 10% also were excluded. We collected a total of 32,365 cells from the four sample groups, including 11,654 cells on day 0, 6903 cells on day 2, 8724 cells on day 4, and 6272 cells on day 10, resulting in a total of 2,066,741,896 high-quality clean reads.

Fig. 1
figure 1

Single-cell RNA sequencing analysis of cardiac directed differentiation. (a) Schematic of the experimental design. (b) Gene expression across all cells at individual time points. (c) UMAP (upper panel) and t-SNE (lower panel) plot of single-cell clustering from all four time points. (d) Number of genes and UMIs per cluster. (e) Number of cells per cluster. (f) The proportion of cells per cluster at four time points. (g) UMAP plot at each time point. (h) t-SNE plot at each time point.

To confirm that the differentiation followed the known cardiac differentiation trajectory, we analyzed the expression of known genes. On day 0, the human iPSC lines were positive for pluripotency marker genes including T, NANOG, and POU5F. On day 2, previously reported mesoderm markers such as T, MIXL1, and EOMES were sequentially activated, while the expression of pluripotency markers SOX2 and NANOG was gradually downregulated (Fig. 1b). The expression levels and cell numbers of cardiac and cardiomyocyte markers significantly increased at day 10 of cardiac differentiation. These data show that cardiac-directed induction of differentiation produces distinct cell populations and displays expected time-specific transcriptional profiles. Non-linear dimensional reduction of the scRNA-Seq data by t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) revealed 9 distinctly separated cell populations (Cluster 0–8) (Fig. 1c). Individual clusters have variable numbers of UMIs and genes (Fig. 1d). The proportion of cells per cluster at different stages during differentiation reflects the differentiation process of cells within each cluster (Fig. 1e–f). The cell numbers in clusters 1, 2, and 4 continued to decrease from day 0 to day 10, while the number of cells in cluster 6 increased continuously (Fig. 1g–h). Notably, cell numbers in clusters 5 and 0 peaked on day 2 and day 4, respectively, before declining.

Identify the iPSC-CMs cluster

To identify the iPSC-CMs and other cell clusters that may exist during differentiation, we visualized the expression of known marker genes (Fig. 2a). Pluripotency biomarkers (e.g., T, NANOG, POU5F1) were expressed in cell clusters 1,2,3 and 4, while typical primitive streak (PS) markers, such as MIXL1, T-brachyury, EOMES, and MIXL1 was found in cluster 5. DNMT3B, a rejuvenation signature12, also exhibited higher expression in clusters 1, 2, and 4. The transcription factor genes associated with cardiac lineage, which play a decisive role in cardiac development and formation, including Nkx2-5, TBX5, GATA4, Isl1, HAND1/2, and MEF2C, were activated in clusters 0 and 6 (Fig. 2a). The early expression of the zinc finger transcription factor GATA4in precardiogenic splanchnic mesoderm, which can activate the expression of certain cardiac-specific genes, is crucial for normal cardiac morphogenesis and is highly expressed in cluster 0/613,14. Consistently, the LIM-homeodomain transcription factor Islet-1 (Isl1), a marker of cardiac precursors during differentiation, serves as a pioneer factor driving cardiomyocyte lineage commitment15. ISL1 is vital for orchestrating proper cardiogenesis and establishing the epigenetic memory of cardiomyocyte fate commitment15, with its expression limited to clusters 0 and 6. Structural genes encoding sarcomeric-related proteins expressed in differentiated cardiomyocytes, including MYL7, MYH6, TNNI1, TTN, and TNNT2, were found in clusters 0 and 6, with the highest expression observed in cluster 6 (Fig. 2a). Gene Set Enrichment Analysis (GSEA) revealed that Cluster 6 significantly enriched metabolic pathways, including “Dilated cardiomyopathy”, “Hypertrophic cardiomyopathy (HCM)”, “Cardiac muscle contraction” and “Adrenergic signaling in cardiomyocytes” (Fig. 2b). The expression of genes that have been reported to be involved in cardiomyocyte development, including PPARA, FABP3, BMP5, BMP7, ERBB2, and ERBB416, exhibited significantly increased expression in cluster 6 (Fig. 2c). The PPAR pathway, which is activated during cardiomyocyte differentiation17, leads to increased expression of FABP3, a regulator of cardiac metabolism18. The trabecular genes ERBB2 and ERBB4 are also highly expressed in cluster 6 (Fig. 2c), potentially promoting cell differentiation or proliferation19. Consistent with the study by Cui et al., BMP signaling is involved in regulating cardiac development16, and BMP ligands BMP5 and BMP7 are expressed at high levels in cardiomyocytes (Fig. 2c). Based on this information, cluster 6 was identified as a true iPSC-CM cluster.

Fig. 2
figure 2

Identify the iPSC-CMs and other cell clusters. (a) Expression of specific markers in all clusters. The shade of red color in the UMAP plot reflects the relative expression level of corresponding genes. (b) Pathway enrichment of highly expressed genes in cluster 6. The abscissa represents the gene ratio, the size of dots represents the the number of genes, and the color of dots represents the p values. (c) The gene expression levels of each cluster are visualized by violin plots. (d) UMAP plots identified nine cell types separated into distinct clusters. (e) Heat maps showed highly expressed genes of nine clusters.

On the other hand, cluster 8 expressed relatively high levels of smooth muscle cell markers TAGLN, TNN1, and ACTA2 (α-SMA) (Fig. 2a). Cluster 7 cells exhibited high expression of PAX3 (Fig. 2a), an ectodermal marker and an important transcription factor in the PAX family, which is closely related to the brain and neural crest during embryonic development20,21. However, cluster 7 did not express other ectoderm genes, but highly expressed pluripotency genes, suggesting that these cells may represent an undifferentiated population.

Taken together, these data show that iPSCs differentiate into PS mesoderm (cluster 5), cardiac progenitors (cluster 0), cardiomyocytes (cluster 6), and smooth muscle cells (cluster 8) (Fig. 2d). The highly differentially expressed genes across the 9 clusters are shown in the heatmap (Fig. 2e).

Reconstruction of the developmental trajectory of cardiac differentiation from human iPSCs

To track the differentiating cells from day 0 to day 12, we performed a pseudotime analysis, revealing that the pluripotent stem cells gradually developed into committed and definitive cardiomyocytes, with obvious branching and turning observed on day 2 (Fig. 3a). From cluster 0 onward, the cells are committed to the cardiac lineage (Fig. 3a). In order to understand the potential biological functions of the clusters, enrichment analysis revealed specific pathways related to cardiac differentiation (Fig. 3b). In cardiac progenitors (Cluster 0), cardiomyocytes (Cluster 6), and SMCs (Cluster 8), the TNF-β signaling pathway arose as a major biological event, indicating its close association with cardiomyocytes development. The role of TGF-β signal transduction in the cardiac outflow tract and cardiac valve has been reported22,23. In addition, genes involved in extracellular matrix (ECM) receptor interactions were enriched in these three clusters, which may be due to the necessity of cardiac ECM synthesis for supporting tissue growth in early cardiac morphogenesis24. From cluster 0 onwards, the cells are clearly committed to the cardiac lineage. Unlike clusters 0 and 8, cluster 6 showed significantly high expression of oxidative phosphorylation genes, which is consistent with the high dependence of cardiomyocytes on mitochondrial oxidative metabolism. In contrast, iPSs exhibited marked enrichment for DNA replication and base excision repair (Fig. 3b). RNA velocity, which distinguishes between unspliced and spliced mRNAs in single-cell RNA sequencing, infers the direction of gene expression changes, allowing predictions of future individual cell states25. As shown in Fig. 3c, the direction and magnitude of the velocities are projected onto the UMAP plot as arrows, with cardiac progenitor cells (cluster 0) flowing toward cardiomyocytes (cluster 0) as expected, while clusters 2 and 4 flow toward cluster 5. The pseudotime analysis on days 0 and 2 revealed that cluster 2 had a greater potential for transitioning into cluster 5 (Fig. 3d–e). Based on these results, we constructed the developmental trajectory of iPSC-CMs differentiation in this study (Fig. 3f). The iPSCs in this study, consistent with most studies26,27,28, developed into cardiomyocytes following the same steps e same steps observed in human embryonic stem cell cardiac development (Fig. 3f).

Fig. 3
figure 3

Reconstruction of the developmental trajectory of cardiac differentiation from human iPSCs. (a) Reconstruction of cell differentiation trajectory from day 0 to day 10, with the cardiomyocytes highlighted in the pseudotime trajectory. (b) The heat map showed pathway enrichment of differentially expressed genes in all clusters. (c) The UMAP plot of RNA velocities shows the dynamics and trajectory of mRNA transcription states. The direction of cell state transition and RNA velocities are projected onto UMAP embedding as streamlines. (de) Reconstruction of cell differentiation trajectory from day 0 to day 2. (f) Schematic representation of cardiomyocyte differentiation from human iPSCs in this study. (g) Heatmap showing the number of cell–cell interactions between cell clusters, predicted by the CellphoneDB. (h) Dot plot showing the selected ligand-receptor pairs between clusters. Dot size and color represents the scaled mean of the expression level of the two interacting molecules in their respective clusters. The red outline indicates significance.

Intercellular communication mediated by ligand-receptor complexes is essential for coordinating a variety of biological processes29. To explore the crosstalk between different cell types during cardiomyocyte development, CellPhoneDB was used to study potential receptor-ligand interactions (Fig. 3g–h). Our analysis reveals that interactions among cardiomyocytes were the most prevalent during this process. Additionally, the interactions of receptor-ligand between cardiomyocytes and other clusters were notably abundant, particularly with cardiac progenitor cells, suggesting a significant connection between these two clusters (Fig. 3g–h).

Important transcription factor regulating cardiac lineage commitment

To further characterize the molecular changes surrounding cardiac lineage specification, we focused on clusters 0 and 6. Using the Louvain clustering algorithm, we identified three subclusters, with cardiac progenitors (cluster 0) divided into subclusters 0–1 and 0–2, according to their expression profiles (Fig. 4a). Both subclusters expressed cardiac progenitor markers, but subclusters 0–2 demonstrated a greater potential to differentiate into cardiomyocytes (Fig. 4b). Further analysis indicated that upregulated genes in subcluster 0–2 were significantly enriched in ECM, MET activates PTK2 (FAK) signaling, and TGF-β signaling pathway, indicating that activation of these pathways may be involved in cardiomyocyte differentiation (Fig. 4c–e). The composition and distribution of ECM in the heart are dynamic, and ECM is indispensable for promoting cardiac regeneration24. Furthermore, the role of the TGF-β superfamily in cardiac development and the populating of the embryonic heart with cardiomyocytes has been well documented30. Conversely, cell cycle-related pathways were significantly inhibited, indicative of emerging cardiomyocytes (Fig. 4c–e). However, MET/FAK has been found to regulate the cell cycle, and its inhibition is associated with decreased cancer cell proliferation31.

Fig. 4
figure 4

Important transcription factor that regulates cardiac lineage commitment. (a) Cluster 0 and 6 are further analyzed by subclustering. (b) Reconstruction of cell differentiation trajectory of subclusters. (c, d) Pathway enrichment analysis with REACTOME and KEGG is shown for differentially expressed genes between subcluster 0–1 and subcluster 0–2. The abscissa represents the Normalized Enrichment Score (NES), the size of dots represents the Gene Ratio of differentially expressed genes in this pathway, and the color of dots represents the adjusted p values. (e) GSEA revealed significant pathway activations or inhibitions within subclusters 0–2. Two-tailed Student’s t test p values are indicated. (f) Heat maps show highly expressed genes in subclusters 0–1 and 0–2. (g) A selection of the regulons identified using SCENIC analysis (rows), and their activities in each cell (columns).

Analysis of differentially expressed genes (DEGs) between the subclusters 0–1 and 0–2 revealed that CREG1, CCBE1, and NODAL were significantly upregulated in subcluster 0–2 (Fig. 4f). Notably, the cellular repressor of E1A-stimulated genes (CREG) has been shown to be a vascular remodeling mediator and its overexpression inhibits the proliferation of vascular SMCs while promoting their differentiation32,33. In addition, the interaction between CREG1 and the cytolytic component Sect. 8 promotes the differentiation of ES cells into cardiomyocytes and is essential for assembling intercellular connections34. This suggests that CREG1 may be an important transcription factor that regulates cardiac lineage commitment. The collagen- and calcium-binding EGF-like domains 1 (CCBE1) is a secreted protein that enhances VEGF-C signaling and is critical for lymphangiogenesis during development. It has recently been found to be required for the differentiation of embryonic stem cell-derived early cardiac precursor lineages35. NODAL was originally considered a mesoderm inducer in vertebrates36, but recent findings indicate its close relationship with proper cardiac specification and differentiation. Disruption of Nodal/TGFβ pathway signal can lead to abnormal cardiac and endoderm development37. Based on the above information, we speculate that CREG1, CCBE1, and NODAL are important genes that control cardiac lineage commitment from early mesodermal precursor cells.

To further identify regulators of cardiomyocyte differentiation, we performed single-cell regulatory network interference and clustering (SCENIC) analysis38, which identifies differential expression of regulons (transcription factors and their putative direct targets). Analysis of the regulons enriched in subclusters 0–1, 0–2, and 6 showed that although some regulons (including ARNTL2, NFIA, ZNF649) are shared among these three clusters, there are differentially expressed regulons in each cluster (Fig. 4g). Especially the transcription factors of the HOX family, including HOXB3, HOXA2, HOXD4, HOXD3, and HOXC5, are activated in cluster 6, as HOX proteins are required during heart development39. PPARA, which belongs to the PPAR family, is also activated in cluster 6. This family consists of three members, PPARα, PPARβ (also called PPARb/d or PPARδ), and PPARγ40. PPARα is found in different cardiomyocytes and is known to be essential for cardiac energy metabolism16,41. Another transcription factor expressed in 6, ZIC2, was identified in a genome-wide CRISPR-knockout screen for cardiac differentiation from hESCs and has been proven to be a key gene in determining cardiac progenitor cell fate during human cardiogenesis42. Xiong et al. suggested that Lim homeobox transcription factor 1α (LMX1A)may be involved in the process of limonin-mediated cardiac repair following myocardial infarction43. The transcription factor Forkhead box G1 (FOXG1), a nuclear-cytosolic transcription factor expressed in the developing nervous system, is essential for forebrain development11,44. Nuclear receptor subfamily 2, group F, member 2 (NR2F2) encodes the key nuclear receptor COUP-TF2, whose altered expression is associated with cardiovascular diseases, especially atherosclerosis and congenital heart defects (CDH)45. However, the exact roles of subgroup 6 specific TFs in human cardiac lineage commitment warrant further investigation.

FOXA2, FOXG1, and FOXI3 are activated in subclusters 0–2 or 6. The forkhead box (FOX) proteins constitute a prominent family of transcription factors involved in a wide range of cellular processes. FOXA2 plays a key role in embryonic patterning and endoderm formation, and also regulates normal heart development46. FOXG1, a transcription factor that controls the proliferation of neural progenitor cells during brain development, was recently found to regulate epicardial proliferation47. FOXI3 is a regulator of ectodermal development48. Based on the above information, we hypothesize that transcription factors FOXA2 and FOXG1 are potentially significant in the cardiomyocyte differentiation process, warranting further exploration.