scRNA-seq revealed transcriptional signatures of human umbilical cord primitive stem cells and their germ lineage origin regulated by imprinted genes

CD133+LinCD45 VSELs are less common in UCB as compared to a CD34+LinCD45 VSELs

Human VSELs are small, lack lineage markers, and express CD34 or CD133 antigens 29. Therefore, to purify UCB VSELs we employed multiparameter staining of UCB mononuclear cells (MNCs) and sorted two populations defined as CD34+linCD45 and CD133+linCD45 cells. Supplementary Fig. 4 displays representative dot plots of our sorting strategy.

Briefly, MNCs were separated from the hUCB unit by gradient centrifugation on Ficoll-Paque. Subsequently, MNCs were stained with a cocktail of antibodies as described in Materials and Methods. Next, small objects 2–15 μm in size were included in the lymphocyte gate (Supplementary Fig. 4A) and further analyzed for the expression of lineage markers, where lineage negative events (Lin-) were gated (Supplementary Fig. 4B), and two dot plots were applied to identify the population of CD34 + lin-CD45- VSELs and CD34 + lin-CD45 + HSCs (Supplementary Fig. 4C), as well as CD133 + lin-CD45- VSELs and CD133 + lin-CD45 + HSC (Supplementary Fig. 4D) respectively.

Transcriptional states in HSCSs and VSELs and evaluation of subpopulations by unsupervised clustering

To explore the composition and diversity of HSCs and VSELs in human UCB, we performed droplet-based scRNA-seq (10 × Genomics, USA) using cells isolated from Ficoll-Paque processed UCB by employing for purification MoFlo Astrios EQ Cell Sorter. We captured a total of 5862 CD133+linCD45+, 6811 CD34+linCD45+, 3706 CD133+linCD45 and 1592 CD34+linCD45, which passed quality control, with a median of 2066, 1613, 457 and 1794 genes per cell, respectively. The more detailed information describing the estimated number of cells detected in all samples and the mean number of reads per cell after quality control is shown in Supplementary Table 2.

Initially, CD133+linCD45 and CD34+linCD45 VSELs as well as CD133+linCD45+ and CD34+linCD45+ HSCs were analyzed separately to characterize their diversity and separation into subpopulations by employing CellRanger (CellRanger Software, version 7.2.0) and visualized in Loupe Browser (version 7.0.1). In all the studied samples by the t-SNE method (Loupe Browser 7.0.1), we identified in a bulk cell population (Fig. 1A) 22 subpopulations (graph-based clusters) (Fig. 1B and C). Next, by the uniform manifold approximation and d projection (UMAP) 53,54 in the dataset, we identified several cell subpopulations e.g., 13 in CD34+linCD45+ (Supplementary Fig. 5A), 11 in CD133+linCD45+ (Supplementary Fig. 5B) and 8 in both: CD34+linCD45 (Supplementary Fig. 5C) and CD133+linCD45 VSELs (Supplementary Fig. 5D). Furthermore, additional analysis, where all samples were aggregated into one data set (CellRanger Software, version 7.2.0) and then visualized in Loupe Browser (version 7.0.1) allowed us to confirm the diversity of studied cells but also to identify unique subpopulations exclusive for both VSELs datasets (marked in blue on the Fig. 1C. This subpopulation was more prominent in CD34+linCD45 VSELs (Fig. 1C). We will later characterize at the molecular level this subpopulation while presenting data of differential gene expression between CD133+linCD45 and CD34+linCD45 VSELs datasets.

Fig. 1
figure 1

Identification of cell populations (A) and subpopulations (clusters) (B) in all studied samples: CD34 + lin-CD45-, CD133 + lin-CD45- (VSELs) and CD34 + lin-CD45 + , CD133 + Lin-CD45 + (HSCs) visualized by Loupe Browser 7.0.1 (A). Human umbilical cord blood isolated HSCs and VSELs, plot as distinct clusters by a t-SNE method (B). Unique subpopulation of both VSEL datasets, more prominent in CD34 + Lin-CD45- (C—marked in blue).

To obtain more information on differential gene expression in the identified VSELs subpopulations, a detailed analysis was performed employing Seurat (version 5.0.1). This time, the dimensional reduction was based on the UMAP53,54. An advantage of UMAP over t-SNE, is that the former clearly separates cluster groups of similar genes from each other, allowing for a better and faster preservation of the data’s structure53,54. All samples were analyzed separately, and cell subpopulations were identified and subjected to gene expression analysis to find differentially expressed ones. The number of clusters identified with UMAP projection (Fig. 2) was slightly different in comparison to t-SNE method implemented before in Loupe Browser (version 7.0.1) and comprised of 14 clusters identified for HPCs (CD34+linCD45+ and CD133+linCD45+) (Fig. 2A and B) and 10 clusters identified for VSELs (CD34+linCD45 and CD133+linCD45-) (Fig. 2C and D). Based on the results from both methods: t-SNE and UMAP, a higher level of homogeneity was observed in both HSCs samples, while in VSELs, the smaller number of clusters and their greater separation suggests higher heterogeneity. The clear separation into distinct clusters in UMAP (Figs. 3A-C), suggests that VSELs clusters might represent transcriptional states of discrete subtypes of primitive cells. Among subpopulations of CD34+linCD45 and CD133+linCD45, we identified clear transcriptional signature of primordial germ cell specification (Fig. 3B and C), early stages of developmental lineage commitment (Fig. 3B), quiescence (Fig. 3B and C), endothelial specification and activation (Fig. 3B and C), early innate and adaptive immune specification (Fig. 3B and C), and megakaryocyte and platelet development (Fig. 3B and C). Furthermore, we identified a unique subpopulation characteristic for CD34+VSELs (Fig. 3A).

Fig. 2
figure 2

Identification of subpopulations in all studied samples of HSCs (A and B)—CD34 + lin-CD45 + (A), CD133 + lin-CD45 + (B), and VSELs (C and D) CD34 + lin-CD45- (C) and CD133 + lin-CD45- (D) visualized by UMAP method. The datasets are first analysed without integration. The resulting clusters are defined both by cell type Uniform manifold approximation and projection (UMAP) plot of HSC and VSELs clusters (n = 1592 cells of CD34 + Lin-CD45- and n = 3706 cells of CD133 + Lin-CD45-).

Fig. 3
figure 3

Integration analysis of VSEL libraries defined by CD34 + lin-CD45- and CD133 + lin-CD45- phenotypes (A). Integrating two scRNA-seq datasets together (CD34 + lin-CD45- and CD133 + lin-CD45-) idientiefied shared/homologous regions and visualized differences between two subtypes of VSELs with indication of unique clusters 1 and 2 found in CD34 + lin-CD45- (marked by circle on Panel A). Based on Reactome and gene ontology (GO) term pathway enrichment analysis of top 50 up-regulated genes, manual curation of enriched genes was performed to assign labels to each VSEL cluster or state of CD34 + lin-CD45- (B) and CD133 + lin-CD45- phenotypes (C). Unintegrated UMAP with Leiden clustering of two VSELs scRNA-seq datasets. Note the PGC-like/quiescent clusters on Panel B that corresponds to marked by circle clusters 1 of Fig. 1A.

Molecular analysis of UCB purified CD34+ and CD133+ VSELs subpopulation reveals their germ line and hematopoietic specification potential.

To find differentially expressed features (cluster biomarkers), the FindMarkers function in Seurat was used to characterize all the subpopulations within CD34+VSELs and CD133+VSELs. The list of positive markers in a single cluster is demonstrated in Tables 1 and 2 (in online) for CD34+VSELs and CD133+VSELs, respectively. We examined the gene markers that define each cluster, separately for CD34+ and CD133+ VSELs. The Reactome and Gene Ontology (GO) term pathway enrichment analysis, based on the top 50 up-regulated genes and manual curation of enriched genes allowed us to assign labels to each VSELs cluster or state (Fig. 3B and C).

In CD34+ dataset, we have identified in cluster 0 (Table 1 in online) the expression of IL-7R, CAMP4, CD3D, CD3G, TCF7, LTB, BACH2, SKAP1, TRAC, and CD52. Cluster 1 (Table 1 in online) revealed genes involved in cell quiescence, proliferation, and differentiation, including H19, Igf2, S100A16, NNMT, ANXA2, and GNG11. Interestingly, this cluster also comprised DNA-binding protein inhibitors 1 and 3 (ID1 and ID3), that can bind and inhibit transcriptional activity of basic HLH proteins. Clusters 3 and 10 were enriched for genes regulating megakaryocyte and platelet development (Table 1 in online, Fig. 3B).

We reported in the past that BM purified murine VSELs express several markers indicating their close relationship to migrating primordial germ cells (PGCs)25 and, that murine VSELs similar to murine PGCs remain quiescent due to the erasure of some genes regulated by paternal imprinting24. Furthermore, we demonstrated that murine BM-derived VSELs like human UCB-purified VSELs may differentiate in appropriate experimental models into HSCs39,40. Therefore, we focused on changes in PGCs-related imprinted genes in CD34+ and CD133+ VSELs and their hematopoiesis specification potential.

As expected the prominent cluster 1 for CD34+VSELS expressed genes regulated by paternal imprinting being responsible for quiescence of VSELs such as: H19 (avg_log2FC = 2.75; adj. p value = 2.26E-89) and IGF2 (avg_log2FC = 0.95; adj. p value = 2.28E-51) and displayed a decreased expression level of IGF2R (avg_log2FC = -1.57; adj. p value = 6.45E-09) (Table 1 in online). We eliminated the possibility that the “quiescent/germ like” cluster seen in VSELs arose due to contamination of the cell during the sorting procedure of VSELs since HSCs did not display “quiescent/germ like” cluster and the proportions/size of individual subpopulations was different (Fig. 1). Furthermore, the analysis of selected clusters allowed us to find gene patterns differentially expressed between the individual clusters within the same sample.

On the other hand, CD133+VSELS cluster 5 (Fig. 3C, Table 2 in online) was characterized by increased expression of imprinted genes: H19 (avg_log2FC = 4.17; adj. p value = 6.68E-177), MEG3 (avg_log2FC = 1.07; adj. p value = 1.48E-71), lnc RNA controlled by IGF1 that tightly regulate proliferation SNHG7 (avg_log2FC = 2.06; adj. p value = 2.63E-60), LNCAROD, that is LncRNA activating regulator of DKK1 (avg_log2FC = 0.77; adj. p value = 1.10E-52); neuron fate commitment: ZNF (avg_log2FC = 1.38; adj. p value = 3.68E-131), TCF4 (avg_log2FC = 1.83; adj. p value = 1.37E-61); epithelial/mesenchymal transition via long non-coding RNA (SNHG8); immunomodulatory: STMN1 (avg_log2FC = 2.76; adj. p value = 1.89E-110), RPS3 that is an essential subunit of NF-kappaβ involved in the regulation of key genes in rapid cellular activation responses (avg_log2FC = 2.52; adj. p value = 3.53E-52) and, cell division protein kinase 6 (CD6), (Table 2 in online, Fig. 3C).

Thus, scRNA-seq analysis confirms that quiescence/proliferation of both CD34+ and CD133+ human VSELs is regulated in a similar way, as previously reported for murine cells24 by silencing of INS/IGFs signaling. In addition, in both datasets of VSELs we found high expression of RACK1, which is an insulin-like growth factor 1 (IGF-1) receptor-interacting protein that can regulate IGF-1-mediated Akt activation 55, and IGF2, an insulin-like growth factor (IGF)-binding protein (IGFBP).

As mentioned above BM- and UCB-derived VSELs can acquire CD45 antigen expression and in appropriate experimental models could be specified into HSCS39,40. To support this, we observed in CD34+VSELS in clusters 0, 4, 5, 6, 7, and 8 (Fig. 4A and B), the expression of CD45 (PTPRC). What is important we did not observe the expression of this gene in clusters assigned as “quiescent/germ like” (cluster 1) and “development” (cluster 2) as well as those connected with erythrocytes and platelets functions (clusters 3, 9 and 10). In the remaining clusters of CD34+ VSELs, we identified in clusters 0, 5 and 8 the T lymphocyte markers (CD2, CD3, CD4), NK cells markers (CD16 (FCGR3A) and CD56 (NCAM1) in cluster 8; B lymphocyte marker (CD19) in cluster 7; monocyte marker (CD14) in cluster 5; granulocytes (mostly neutrophils) marker (CD66b/CEACAM8) in cluster 6; platelet marker CD41 (ITGA2B) in cluster 10 and, erythrocyte marker (GYPA) in cluster 3 and 9. Furthermore, we identified the expression of mRNA for CD34 in clusters 1 and 2. At the same time, the expression of PROM1 (CD133), which is a very early marker of stem and progenitor cells, was only slightly detectable in single cells of clusters 1 and 2. Additionally, among additional markers for HSCS, we found the expression of PECAM1 (CD31) in clusters 1, 2, 5, and 7 and c-KIT (CD117) in a few cells in cluster 2 (Fig. 4A and B). Notably, clusters 1 and 2 of CD34+ dataset (Fig. 4B), which are characterized by transcriptome of early developmental and primordial specification genes, are the only ones that express both CD34 and PROM1 (CD133) (Fig. 4A and B).

Fig. 4
figure 4

Exploring heterogeneity of VSELs population using Seurat/Expression level of cell markers identified in all clusters of both datasets of VSELs. To identify cell type markers manual analysis of the list of differentially expressed genes in each cluster of CD34 + lin-CD45- and CD133 + lin-CD45- was performed. Expression levels for CD2, CD3D, CD4, CD14, CD16 (FCGR3A), CD19, CD34, CD133 (PROM1), CD41 (ITGA2B), NCAM1, CD68, CD45 (PTPRC), GYPA, CD66b (CEACAM8), PECAM1 (CD45), CD117 (c-KIT) and CD11b (ITGAM), for CD34 + lin-CD45- (Panel A) and CD133 + lin-CD45- (Panel C) visualized by dot plots (Panel A and C) and violin plots in each cluster (Panels B and D).

Similarly, in CD133+VSELs data sets (Fig. 4C and D), the expression of several markers was confirmed, however with slight differences as compared to CD34+VSELS. The expression of mRNA for CD45 (PTPRC) was found in almost all clusters except for clusters 4 and 9, corresponding to erythrocyte and platelet specification, respectively. Notably, cluster 5 of CD133+ VSELs dataset, which constitute of most primitive subset of cells, correlated with the expression of both CD34 and CD133 (Fig. 4C and D). Moreover, in CD133+ VSELs we identified clear transcriptional signature of cells expressing fetal hemoglobin F subunit gamma 1 (HBG1), fetal hemoglobin F subunit gamma 2 (HBG2), GYPA, ITGA9 (Fig. 5A and B, Table 1 and 2 in online). We also identified monocyte progenitor markers MPEG1 and CD33, and mast cell mRNA species encoding GATA2, CD63, and HDC (Table 1 and 2 in online) as well as markers for NK (CD96), T-cell (CD3D), and B cell (CD19, CD79B and FCRL1) development.

Fig. 5
figure 5

Identification of highly variable genes in VSELs datasets. Volcano plots displaying differences in gene expression among CD34 + lin-CD45- (Panel A) and CD133 + lin-CD45- (Panel B). Genes shown in red represents variable genes and in black – non variable counts.

Therefore, based on cell markers analysis, we further confirmed as expected that CD34+ and CD133+ UCB-derived VSELs at further stages of development may become specified to the myeloid (erythrocytes, granulocytes, and monocytes) and lymphoid lineage (lymphocytes T, B and NK cells). The most primitive clusters 1 and 2 for the CD34 + lin-CD45- (Fig. 4A and B) and cluster 5 for CD133 + lin-CD45- (Fig. 4C and D) were characterized by the expression of CD34, PROM1, PECAM1, and c-KIT.

Global gene expression dynamics and differentially expressed genes (DEG) between CD34+VSELs and CD133+VSELs

Using Seurat (version 4.4.0) we plotted the top 10 highly variable genes for CD34+VSELs and CD133+VSELs (Fig. 5A and B). The most highly variable genes in CD34+VSELs turned out to be HBD, IGKC, PPBP, GNLY, CA1, IGLC2, GP1BB, S100A9, NKG7 and IGHM (Fig. 5A). All indicated genes regulate hematopoiesis and the immune system. Interestingly, the top 10 most variable genes for CD133+VSELs included the similar markers: IGKC, GNLY, HBD, PPBP, IGLC2 and GP1BB (Fig. 5B). Additionally, in CD133+VSELs we identified IGLC3, HBZ, PF4 and H19, involved in hematopoietic and immune processes, and H19, which as mentioned above is non-coding RNA and a product of paternally imprinted Igf2-H19 locus giving rise to some microRNAs regulating quiescence of VSELS24. Furthermore, in both datasets we identified expression of genes involved in hemostasis, gap junctions, membrane dynamics and repair, fibrinolysis, caveolae formation (e.g., ANXA2, CAV1, S100A10, CALM1, GP6).

We also generated an expression heatmap for CD34+ (Fig. 6A) and CD133+ VSELs (Fig. 6B) focusing on clusters. The top 10 markers for each cluster of both data sets confirmed distinct molecular signature except for clusters 1 and 2 for CD34+LinCD45 VSELs (Fig. 6A) and clusters 0 and 1 for CD133+LinCD45 VSELs (Fig. 6B). The observed differences prompted us to conduct a comparison of gene expression between both VSELs populations using the integration analysis in Seurat. The results of the DEG analysis suggest that VSELs comprise several subpopulations with different gene expression levels related to specific functions. Most cells in VSELs clusters 1 and 2 (CD34+) and 4 and 5 (CD133+) were characterized by a growth repressive profile with high expression of messages related to paternally imprinted H19 and MEG3 genes that are responsible for the quiescent state of VSELs. The growth repressive transcriptome for CD34+ VSELs was most pronounced for clusters 1 and 2 including mRNA for H19, Meg3, ID1, and ID3 (Fig. 7A). Also, these clusters were associated with genes involved in the early stages of germ line specification e.g., Sox7, Sox17, and Sox18 (Fig. 7A). For a population of CD133+VSELs, a growth repressive transcriptome profile was observed in cluster 5 (Fig. 7B). This finding is consistent with the results of the DEG analysis, in which cells in VSELs clusters 1 and 2 highly express growth repressive genes.

Fig. 6
figure 6

Heatmap of PCA for all clusters showing the expression levels of principal components across different cell clusters identified in the scRNA-seq datasets. The top 10 markers for individual clusters (n = 10) of CD34 + lin-CD45- (Panel A) and CD133 + lin-CD45- (Panel B) were plotted.

Fig. 7
figure 7

Molecular profile of differentially expressed genes (DEGs) involved in early stages of germ line specification, genomic imprinting, and epigenetic changes both methylation and demethylation processes. Violin plots of H19, IGF2, IGF2R, S100A16, ID1, ID3, Sox7, Sox17, Sox18, Meg2m POU3F1, POUF51, DNMT1, DNMT3a, DNMT3b, MTIF2, Ezh2, SIRT1, SUZ12, EED, UHRF1, TET1, TET2, TET3, TDG, GADD45A, GADD45B, APOBEC3G, APOBEC3F, and SMUG1 expression in CD34 + lin-CD45- (Panels A, C and E) and CD133 + lin-CD45- (Panels B, D and F).

Furthermore, the population of CD34+ VSELs specified for proliferation was reflected by the transcriptome profile of methylation/histone modification related genes (Fig. 7C – clusters 7 and 8). Violin plots demonstrated the expression of DNA methyltransferases and sirtuins related to growth restriction/proliferation of cells. For a population of CD133+VSELs methylation and histone modification related genes were associated with clusters 3 and 7 (Fig. 7D).

As mentioned above, we have identified a unique mechanism controlling the biology of murine VSELs in the adult tissues based on the differential expression of imprinted genes regulated by the methylation of the differentially methylated regions (DMRs) in some parentally imprinted genes. In cluster 1 we identified several genes significantly upregulated compared to other subpopulations including, H19, Igf2, Sox7, BMP4, FoxF1, LFNG, EPHA4, SNAI1, GATA-6, MSX1, Sox17, POU3F1, DLL1, TEAD2/4, TCF7L1, MAMLD1, FOXC1, FOXC2, YAP1, and FGF2 (Fig. 7A and C) We also reported that quiescent VSELs to proliferate need to re-establish proper methylation that is erased on paternally imprinted DMRs. This process is regulated by DNMTs and sirtuins56,57. As demonstrated by our group, a subset of imprinted and germline developmental genes have their DNA methylation levels closely regulated by Sirt1, which inhibits the Dnmt3L protein at the transcriptional and protein stability levels58. Therefore, to shed more light on the quiescence of UCB-derived VSELs we focused in our single-cell cluster analysis on different components of DNMTs and sirtuin signaling. Using scRNA-seq we identified key player(s) involved in the regulation of genome methylation in VSELs clusters, including Ezh2, SIRT1, Suz12, EED, DNMT1, DNMT3A, DNMT3B, and UHRF1 (Fig. 7C and D). Interestingly these genes could be involved in maintaining gene repressive state by polycomb group proteins. Accordingly, some of these proteins form the PRC2 complex containing Ez that catalyzes methylation of histone H3 lysine 27 (H3K37me2/3) and represses its transcription. Importantly, while Ezh1 and Ezh2 maintain repressive chromatin structure59, SIRT1 affects DNA methylation of polycomb group protein target genes60. Suz12 expression is essential for the activity and integrity of the PRC2 complex and is required for X chromosome inactivation, stem cell maintenance, and differentiation61. This is most likely an important additional mechanism regulating VSELs quiescence. Next, we performed an analysis of key genes involved in DNA demethylation, including TET1, TET2, TET3, TDG, GADD45A, GADD45B, APOBEC3G, APOBEC3F, and SMUG1. Interestingly, we detected high expression of TDG, GADD45A, and GADD45B for cluster 1 of CD34+ VSELs with slightly lower expression of TET1, TET2, and TET3 (Fig. 7E). Most primitive cluster 5 of CD133+ VSELs was characterized by the expression of TDG, TET3, and GADD45A (Fig. 7F). Of note GADD45A and GADD45B (Growth Arrest and DNA Damage-Inducible) proteins have been implicated in active DNA demethylation, though their exact mechanism is not entirely understood. Lastly, we analyzed the transcription profile of cyclin-dependent kinases (CDKs), cyclins, and cell cycle checkpoints (ATM, ATR and CHEK1). For cluster 2 in the CD34+ lin-CD45-, we detected no expression of CCNE1, CDK1, CHEK1 and slight expression of CCND1, CCN1, ATM and ATR (Fig. 8A). A similar pattern of transcriptional profile we detected for cluster 5 in CD133+ lin-CD45- (Fig. 8B). Interestingly, we found high expression of CDK6 in cluster 2 of CD34+ lin-CD45-, and cluster 5 of CD133+ lin-CD45-, which is involved in the regulation of quiescence and proliferation of stem cells, stress-induced hematopoiesis, and HSPCs differentiation into myeloid and lymphoid lineages 62,63,64.

Fig. 8
figure 8

The expression of genes related to cell cycle processes. Violin plots of CCND1, CCNE1, CDK1, CDK2, CDK4, CDK6, ATM, ATR and CHEK1 expression in in CD34 + lin-CD45- (Panel A) and CD133 + lin-CD45- (Panel B).

Finally, we visualized the expression of transcripts associated with ecto, mezo- and endodermal differentiation (RUNX1, PPARG, KLF5, GATA1, SPI1, GATA6, NOTCH1, SOX17 and NODAL) (Fig. 9A and B). In primitive clusters of CD34+ lin-CD45- (cluster 1 and 2), the expression of all genes was observed, however mostly on the slight level. Only, in the case of SOX 17 and PPARG the pronounced expression associated was found (Fig. 9A). Similarly in the case of CD133+ lin-CD45-, the expression of most genes was at quite low levels observed in few cells, only RUNX1 was found to be expressed in primitive cluster (cluster 5) at an increased level (Fig. 9B).

Fig. 9
figure 9

The expression the transcripts associated with ecto, mezo- and endodermal differentiation. Violin plots of RUNX1, PPARG, KLF5, GATA1, SPI1, GATA6, NOTCH1, SOX17 and NODAL expression in CD34 + lin-CD45- (Panel A) and CD133 + lin-CD45- (Panel B).

Integrated analysis of two UCB-purified VSELs populations

Finally, we performed an integrated analysis of two datasets of CD34+ and CD133+ VSELs, which were merged and analyzed as “pseudobulk”. We compared the pseudobulk gene expression profiles and found unique subpopulations characteristic for CD34+VSELs (Fig. 10A). Although a strong linear relationship was found (R = 0.94; Fig. 10C), some elements differ between both cell populations at the subpopulation level. Therefore, by employing UMAP we then visualized clusters with the UMAP method and identified unique cells among CD34+VSELs as presented in clusters 8 and 9, which are missing in CD133+VSELS (Fig. 10B). The list of gene markers for every cluster of CD34+VSELs and CD133+VSELs is demonstrated in Table 3 in online. These unique clusters among CD34+ VSELs were previously shown in dark blue at the Fig. 1C.

Fig. 10
figure 10

Uniform manifold approximation and projection (UMAP) plots of VSELs clusters. The Seurat v5 integration of two datasets (CD34 and CD133 VSEL) for visualization and unsupervised clustering analysis was performed. Clusters overlay demonstrates similarities in expression profiles of almost all clusters of two datasets, except of cluster 8 and 9 identified for CD34 + lin-CD45- VSELs (Panels A and B), black dotted circle. Scatter plot of linear regression between CD34 and CD133 VSELs datasets showing the relationship between gene expression levels of both VSELs datasets (Panel C).

To address differences between CD133+ and CD34+ VSELs we focused on CD34+VSELs cluster 8 and identified top 10 genes that are CAV1, NRN1, H19, IFI27, GNG11, IFITM3, NNMT, ID3, BEX3 and IGF2. In the same cluster, we also noticed the expression of FUT9, SPATA18, FOXL1, POU5F1, DLL4, LRFN5, POTEF, PCDH17, GRIK2, and ZNF280A as well as several long-non coding RNAs (lncRNAs) e.g., LINC01058, AL136366.1, LINC02456, AC098934.4 and AL442636.1. Detailed analysis of cluster 8 in CD34+ VSELs population revealed also high expression of Sox18, Sox17 characteristics for specification of hemogenic endothelium (HE) (Table 3 in online). Moreover, we detected expression of HOX genes, e.g., HOXA13, HOXB7, HOXA11, HOXB7, and HOXA11, a family of transcription factors that are major regulators of development. We also found expression of CDKN1C (also known as P57kip2) that is a cyclin-dependent kinase inhibitor that functions as a negative regulator of cell proliferation through G1 phase cell cycle arrest. Moreover, we found in CD34+ VSELs datasets expression of PRXL2A, stress-induced reversible cell-cycle arrest require PRC2/PRC1-mediated control of mitophagy in germline stem cells, and SUMO2, involved in safeguarding pluripotency (Fig. 11A). Next, based on the significant expression of genes in CD34+VSELs cluster 8, involved in early developmental stages we performed GO analysis and identified genes involved in the transcriptional regulation of stem cells (Fig. 11B), including EPAS1, POU5F1, Nanog, Sox2. Moreover, Reactome analysis revealed the expression of genes involved in PGCs specification (e.g., BMP and STELLA) (Fig. 11A). The results are consistent with those obtained from unintegrated analysis. Of note, our scRNA-seq revealed expression of ZNF706 that as reported is a transcription KLF4 repressor involved in the proliferation of ES cells 65.

Fig. 11
figure 11

Over-representation of molecular pathways based on DEGs found in cluster 8 in CD34 + lin-CD45- VSELs. Enrichment analysis of transcriptional network of pluripotent stem cells, primordial germ cell specification, SUMO, Wnt, BMP and NOTCH signalling (Panel A and B); RNA processing, responses to external stimuli and cell cycle (Panel C); innate immunity, purinergic signalling and virus-host interaction (Panel D).

We also detected high expression of several genes involved in the repression of the cell cycle, response to external stress, and RNA processing (Fig. 11C). CD34+ VSELs data sets were characterized by the expression of several genes involved in hematopoietic specification LMO2, S100A6, SNRPE, a splicing factor switch controls hematopoietic lineage specification of pluripotent stem cells, and GBP4, that regulates transitions in lineage specification and gene regulatory networks in hematopoietic stem/progenitor cells. A subpopulation of CD34+ VSELs also express histone variants H3.3 (H3F3B and H3F3A), that maintains adult hematopoietic stem cell homeostasis by enforcing chromatin adaptability66 (Table 3 in online).

The high expression of several complement cascade genes (C5, C3, C5aR1, C5aR2, C3aR1), inflammasome (Nlrp3), cytokines (IL1β and IL18) might point to a role of complement in stem cell development67 and its role in surveillance to danger associated molecules68,69,70 (Fig. 11D and manuscript in preparation). Moreover, our scRNA-seq analysis of CD34+ VSELs data sets detected expression of RPS2, RPS3, RPS3A, RPS5-8, HNRNPA1, EEF1A1, CAV1, UBA52, and SUMO1 which confirms our hypothesis that SARS-Cov19 can infect this primitive population of stem cells (Fig. 11D). The second expression pattern “purinergic signaling” constituted P2X and P2Y related genes that were predominantly found in the “cellular responses to stress” cluster and “innate immune system”. Genes with this signature included CD46, CD55, P2RX4, and P2RX7 (Fig. 11D and manuscript in preparation). This expression of purinergic receptors confirms the role of purinergic signaling in stem cell development71,72. Moreover, since VSELs could be specified into mesenchymal stem cells (MSCs) 4 and endothelial progenitor cells (EPCs)9,34, we detected the expression of several genes involved in MSCs fate via HIF1a and mTOR signaling (DDIT4) and EPCs specification (PECAM1; CD31). Moreover, since VSELs may become specified into several types of tissue committed stem cells it is not surprising that we noticed expression of EH domain-containing protein 2 (EHD2), critical gene early stage of adipose-tissue proliferation that is a dynamin-related ATPase influencing several cellular processes, including membrane recycling, caveolae dynamics, and lipid metabolism. We also noticed expression of TBX3 (Table 3 in online) responsible for the specification of neuroepithelial cells. What is also interesting our work revealed the expression of TM4SF18, transmembrane 4 L six family 1 (TM4SF1) member that could be employed for prospective isolation of primitive cells from adult tissues (manuscript in preparation).

Finally, we found CD133+ VSELs that revealed the presence of two small clusters – subpopulation of cluster 8 and 10 (Table 3 in online) and we found again H19 (avg_log2FC = 12.44; adj. p value = 9.82E-10), CCN2 (avg_log2FC = 10.49; adj. p value = 9.82E-10), MOX2 (avg_log2FC = 8.84; adj. p value = 0.0035), IGF2 ((avg_log2FC = 8.15; adj. p value = 0.001), NKD2, that is an inhibitor of WNT signaling (avg_log2FC = 7.45; adj. p value = 9.83E-10). Of note, while CCN2 encodes a protein that serves as a mitogen secreted by vascular endothelial cells73,74, MEOX2 belongs to mesenchymal Homeobox 2 playing a role in the regulation of myogenesis75. Integrated UMAP clusters of CD133 + VSELs datasets visualized also a small subpopulation of cluster 8 with transcriptional signature of developmentally early genes, comprising of DLX6, a member of a homeobox transcription factor gene family playing a role in the forebrain and craniofacial development; paternally-expressed gene 3 PEG3; HOXA11, which encodes a transcription factor involved in proliferation, differentiation, and embryonic development; TCF15 involved in the early transcriptional regulation of mesoderm patterning; H19; IGF1; IGF-2, cyclin dependent kinase inhibitor 1C (CDKN1C); transcription factor involved in neuronal differentiation and/or phenotypic maintenance (ZFP2); NTS that encodes a common precursor for neuromedin N and neurotensin; and NRN1L encoding an extracellular protein that enhances both neurite growth and neuronal survival (Table 3 in online). For that cluster we also detected high expression of SIX4 (avg_log2FC = 9.65; adj. p value = 3.96E-15), that is a transcriptional regulator which can act as both a transcriptional repressor and activator by binding a DNA sequence on the target genes and is involved in processes including cell differentiation, migration, and survival76,77. Our scRNA-seq data also revealed the expression of sphingosine-1-phosphate receptor 2 (S1PR2) in subpopulation of CD133+ VSELS (Table 3 in online). We also detected similarly as in the case of CD34+ VSELS expression of several genes characteristic for immune system development (manuscript in preparation).