Direct comparison of mass cytometry and single-cell RNA sequencing of human peripheral blood mononuclear cells – Scientific Data

Human PBMCs were obtained from a donor, who had provided written informed consent (IRB 15328), at University of Virginia School of Medicine, Heart Center.

Split-sample preparation for scRNA-seq, CyTOF, and flow cytometry

PBMCs were thawed in RPMI 1640 with 5% FBS and incubated at 37 °C for 1 h to allow recovery to a ground state. A total of 3 × 10⁵ cells were set aside for scRNA-seq. The remaining cells (~7.5 × 10⁶) were divided evenly between mass cytometry and flow cytometry. Cells allocated for scRNA-seq were strained and washed with PBS containing 0.4% BSA. The cell concentration was adjusted to ~500 cells/μL before proceeding with the 10x sequencing protocol.

Next, cells allocated for mass cytometry were fixed. Briefly, cells were incubated with cisplatin (10 µM in PBS) and then quenched with cell staining medium (CSM; 0.5% BSA, 0.02% NaN₃ in PBS). The cells were strained through a 100 µm nylon strainer, fixed at room temperature for 10 minutes in 1.6% paraformaldehyde, and subsequently stored at −80 °C in CSM. The sample was later thawed and stained with metal-conjugated antibodies. Samples were first blocked with 10% donkey serum, stained with the metal-conjugated surface antibody cocktail (Table 1), and then permeabilized with methanol for 10 minutes at 4 °C before being stained for intracellular markers. After staining, samples were incubated with iridium intercalator overnight at 4 °C for DNA staining before being analyzed on a CyTOF mass cytometer (Standard Biotools). Normalization beads containing lanthanum-139, praseodymium-141, terbium-159, thulium-169, and lutetium-175 were added to the stained samples to perform normalization as previously described20. Stained samples and normalization bead mixtures were then filtered through a 40 µm filter and analyzed across several runs at a rate of ~250 cells per second on the mass cytometer. After measurement, samples were normalized20 and de-barcoded21 to individual FCS files. FCS files were gated to remove beads and debris and to select events on the DNA intercalator signal.

Table 1. CyTOF panel.

Finally, cells allocated for flow cytometry were blocked with FcBlock (BD, Catalog No. 564219) and then divided evenly into six tubes. Primary antibody incubation of each tube was as follows: Tubes 1 and 2, no primary antibody; Tubes 3–6, anti-CD3 (Thermo Fisher, Catalog No. MHCD0300), anti-CD19 (Thermo Fisher, Catalog No. 14-0199-80), anti-CD56 (Thermo Fisher, Catalog No. 14-0567-80), and anti-CD14 (Thermo Fisher, Catalog No. 14-0149-80), respectively. Tubes were placed on ice for 30 minutes and washed twice with FACS buffer. Cells were then incubated with secondary antibody (Thermo Fisher, Catalog No. A-11001) for an additional 30 minutes and washed twice before resuspension in FACS buffer. Flow cytometry was carried out on a BD LSR II flow cytometer, and the data were analyzed using FlowJo.

scRNA-seq data processing

Quality control filtering, normalization, clustering, and differential gene expression analysis were performed using Scanpy22 (version 1.8.2). Genes were excluded if they were detected in fewer than 3 cells; cells were excluded if their mitochondrial gene content exceeded 10% of their total reads or if they had fewer than 200 unique genes, in order to remove data from any prematurely lysed cells or from ambient RNA. Thresholds were chosen by manually identifying steep changes in the corresponding distributions, in line with currently accepted practices22,23. Of note, varying these thresholds did not significantly change the results of downstream analysis (Supplementary Fig. 1, Supplementary Table 1). Filtering resulted in 2653 cells and 15998 genes. The data were then normalized and log-transformed, and 3004 highly variable genes were identified. The data were then scaled and PCA was performed. Cells were clustered using the Leiden algorithm and visualized on a UMAP embedding (Fig. 1a). Further cell type classification was performed via SingleCellNet24, using sampled data from Zheng et al. as reference data (Fig. 1b). Finally, we annotated cell identity based on these classification results and the expression of marker genes (Fig. 1c).
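The filtering thresholds described above can be illustrated on a toy count matrix. What follows is a minimal pure-Python sketch and not the authors' Scanpy code: the function name `qc_filter`, the nested-dictionary layout, the `MT-` prefix used to recognize mitochondrial genes, the order of the filters (cells first, then genes), and the toy counts are all assumptions made for illustration.

```python
def qc_filter(counts, min_genes=200, min_cells=3, max_mito_frac=0.10):
    """Filter a {cell: {gene: count}} mapping with the thresholds from the text.

    Assumptions (not stated in the source): mitochondrial genes are those
    whose name starts with "MT-", and cells are filtered before genes.
    """
    kept_cells = {}
    for cell, genes in counts.items():
        detected = {g: c for g, c in genes.items() if c > 0}
        total = sum(detected.values())
        if total == 0:
            continue  # empty barcode, nothing to keep
        mito = sum(c for g, c in detected.items() if g.startswith("MT-"))
        # Drop cells with too few detected genes or too much mitochondrial signal.
        if len(detected) < min_genes or mito / total > max_mito_frac:
            continue
        kept_cells[cell] = detected

    # For every gene, count the retained cells in which it is detected.
    cells_per_gene = {}
    for genes in kept_cells.values():
        for g in genes:
            cells_per_gene[g] = cells_per_gene.get(g, 0) + 1
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    return {
        cell: {g: c for g, c in genes.items() if g in kept_genes}
        for cell, genes in kept_cells.items()
    }


# Toy data with deliberately small thresholds so the effect is visible.
toy = {
    "cell_a": {"CD3D": 5, "CD4": 2, "MT-CO1": 0},
    "cell_b": {"CD3D": 3, "CD4": 1, "MT-CO1": 1},  # 20% mitochondrial counts
    "cell_c": {"CD3D": 4, "CD19": 6},
    "cell_d": {"CD3D": 1},                         # only one detected gene
}
filtered = qc_filter(toy, min_genes=2, min_cells=2)
```

With the default arguments the function applies the thresholds quoted in the text (at least 200 genes per cell, at most 10% mitochondrial counts, genes detected in at least 3 cells). In Scanpy itself, the gene and cell count filters correspond to `sc.pp.filter_genes(min_cells=3)` and `sc.pp.filter_cells(min_genes=200)`.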

Fig. 1

scRNA-seq analysis. (a) Clustering result of the scRNA-seq data. (b) SingleCellNet classification score heatmap. Reference data was taken from Zheng et al. (c) Select marker gene expression.

Clusters ‘0’ and ‘1’ expressed CD3D and CD4, and were subsequently annotated as CD4 T cells. Clusters ‘3’ and ‘4’, which expressed CD3D and CD8, classified strongly as CD8 cytotoxic T cells, and were thus annotated as CD8 T cells. Clusters ‘5.0’ and ‘5.1’ expressed CD19, classified strongly as B cells, and were annotated as B cells. Cluster ‘6’, which expressed NCAM1 and KLRD1, classified strongly as natural killer (NK) cells, and was annotated as such. Clusters ‘2.0’ and ‘2.1’ expressed CD14 and CD68, did not express FCGR3A, and classified strongly as monocytes. Both clusters were annotated as CD16- monocytes. Cluster ‘7’ showed markedly lower expression of CD14 and high expression of FCGR3A and MS4A7, and classified as monocytes. This cluster was annotated as CD16+ monocytes. Cluster ‘2.2’ did not express CD14 or FCGR3A, but did express CD68, and was annotated as dendritic cells (DC). Finally, the smallest cluster, cluster ‘8’, expressed PPBP; it is likely a small group of platelets and was annotated as megakaryocyte-lineage (Mk). Of note, these annotations can be further divided into finer sub-populations should users choose to refine the clustering, use a different reference dataset for classification, or widen the set of marker genes analyzed.

CyTOF data processing

Gating to remove debris and the subsequent arcsinh transformation were performed in Cytobank. Leiden clustering and UMAP visualization were performed using Scanpy (Fig. 2a). No other normalization or dimension reduction was performed, and cell annotation was based on marker expression (Fig. 2b,c).
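Mass cytometry intensities are commonly transformed with the inverse hyperbolic sine (arcsinh) after dividing by a cofactor. The sketch below assumes that this is the transform referred to above and uses a cofactor of 5, a value often used for mass cytometry; the source states neither the cofactor nor the exact Cytobank settings, so both the cofactor and the function name `arcsinh_transform` are assumptions.

```python
import math


def arcsinh_transform(intensities, cofactor=5.0):
    """Apply x -> asinh(x / cofactor) to a list of raw intensities.

    The cofactor of 5 is an assumed default, not a value from the source.
    """
    if cofactor <= 0:
        raise ValueError("cofactor must be positive")
    return [math.asinh(x / cofactor) for x in intensities]


# Hypothetical raw ion counts for one marker across four events.
raw = [0.0, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw)
```

The transform is roughly linear near zero and logarithmic for large values, which is why it tolerates the zero counts that a plain log transform cannot.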

Fig. 2

Mass cytometry analysis. (a) Clustering result of the mass cytometry data. (b) Average transformed expression of select markers in each cluster. (c) Full mass cytometry panel expression.

Briefly, clusters ‘0’, ‘1’, ‘9.0’, and ‘9.1’ were CD3+ CD4+, and were annotated as CD4 T cells. Clusters ‘2’, ‘5’, ‘6’, and ‘8.0’ were CD3+ and CD8a+, and were annotated as CD8 T cells. Clusters ‘4’ and ‘11’ were CD19+ CD20+ CD79b+ HLADR+, and were annotated as B cells. Cluster ‘14’ exhibited lower levels of CD19 and CD20, but was also annotated as B cells. Clusters ‘3.0’, ‘3.1’, ‘3.2’, and ‘3.4’ were CD56+ and were labeled as NK cells. Cluster ‘3.3’ was both CD56+ and CD3+, and was labeled as NKT cells. Cluster ‘7.0’ was CD14+ and CD16-, and was annotated as CD16- monocytes. Clusters ‘7.1’ and ‘12’ were HLADR+ and CD68+, and were annotated as DCs. Finally, we isolated two populations in the CyTOF data that we were unable to resolve or detect in the scRNA-seq data. First, clusters ‘10’ and ‘13’ were CD3+ and both CD4- and CD8a-. We labeled these as double negative T cells (DN T cell). In contrast, cluster ‘15’ was CD3+ CD4+ CD8a+, and was labeled as double positive T cells (DP T cell). Of note, based on our antibody panel, users of this dataset can further divide these broad annotations into finer subpopulations.
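The marker combinations listed above can be written down as a small rule table. The following sketch is illustrative only: the annotation in this dataset was done by inspecting marker expression, not with this function. The name `annotate_cluster`, the representation of a cluster as a set of positive markers, the order in which the rules are checked, and the `unassigned` fallback are assumptions.

```python
def annotate_cluster(pos):
    """Map a set of positive markers to a broad label using the rules in the text.

    `pos` is the set of marker names judged positive for a cluster. How
    positivity is called (for example a threshold on mean expression) is
    left open here. The order of the checks is an assumption.
    """
    if "CD56" in pos:
        # CD56+ CD3+ clusters were labeled NKT, CD56+ alone NK.
        return "NKT cell" if "CD3" in pos else "NK cell"
    if "CD3" in pos:
        cd4, cd8 = "CD4" in pos, "CD8a" in pos
        if cd4 and cd8:
            return "DP T cell"
        if cd4:
            return "CD4 T cell"
        if cd8:
            return "CD8 T cell"
        return "DN T cell"
    if "CD19" in pos and "CD20" in pos:
        return "B cell"
    if "CD14" in pos and "CD16" not in pos:
        return "CD16- monocyte"
    if "HLADR" in pos and "CD68" in pos:
        return "DC"
    return "unassigned"


# Example: the marker profile reported for cluster '15'.
label_15 = annotate_cluster({"CD3", "CD4", "CD8a"})
```

B cells are checked before DCs because the B cell clusters were also HLADR+; a cluster with intermediate marker levels, such as cluster ‘14’, would still need manual review.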

Quantifying differences in scRNA-seq and mass cytometry

Based on our cluster annotations, we quantified the percentage of each population in the scRNA-seq, mass cytometry, and flow cytometry data (Fig. 3a,b). Importantly, although the three datasets came from a single split sample, the proportions of specific cell populations varied between them. Notably, while mass cytometry and flow cytometry largely agreed on the percentage of T cells, scRNA-seq detected a lower percentage of the same population. As described above, this difference was further exacerbated by the DN and DP T cells that were not detected in the scRNA-seq data. In contrast, the scRNA-seq data exhibited a larger proportion of monocytes than both the mass cytometry and flow cytometry data. Of note, we did not resolve a CD16+ monocyte population in the mass cytometry data. Finally, while the scRNA-seq and mass cytometry data exhibited a roughly equal proportion of NK cells, the flow cytometry data had a larger percentage of the same population. Differences in cell type percentages measured by scRNA-seq and cytometry have been previously reported in bone marrow mononuclear cells (BMMCs)25.
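Computing population percentages from per-cell annotations is a simple counting step. The sketch below assumes one label per cell and merges fine labels into the broad groups used for the comparison (CD4 T, CD8 T, DN T, and DP T into T cells; CD16+ and CD16- monocytes into monocytes). The function name `population_percentages`, the label strings, and the toy label list are hypothetical and do not reproduce the values in Fig. 3.

```python
from collections import Counter

# Broad groups used when comparing modalities; label spellings are assumed.
BROAD_GROUPS = {
    "CD4 T": "T cell",
    "CD8 T": "T cell",
    "DN T": "T cell",
    "DP T": "T cell",
    "CD16+ monocyte": "Monocyte",
    "CD16- monocyte": "Monocyte",
}


def population_percentages(labels, groups=None):
    """Return {population: percentage of all cells} for a list of cell labels."""
    groups = groups or {}
    # Labels without a broad group keep their own name.
    merged = [groups.get(label, label) for label in labels]
    total = len(merged)
    if total == 0:
        return {}
    return {pop: 100.0 * n / total for pop, n in Counter(merged).items()}


# Toy annotation vector for ten cells (invented for illustration).
labels = ["CD4 T"] * 4 + ["CD8 T"] * 2 + ["DN T"] + ["CD16- monocyte"] * 2 + ["NK"]
pct = population_percentages(labels, BROAD_GROUPS)
```

Running the same function on the annotation vector of each modality gives percentages that can be compared side by side.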

Fig. 3

scRNA-seq and mass cytometry comparison. (a) Percentage of given cell types in scRNA-seq and mass cytometry data. (b) Percentage of given cell types in scRNA-seq, mass cytometry, and flow cytometry data. T cell percentage includes cells annotated as CD4 T, CD8 T, DN T, and DP T. Monocyte percentage includes CD16+ and CD16- monocytes. (c) Correlation of CyTOF and scRNA-seq measurements in each cell type.

To broadly estimate the correlation between scRNA-seq and CyTOF measurements, we examined the normalized and log-transformed mass cytometry measurements and compared them to the normalized and log-transformed expression of the corresponding genes in the scRNA-seq data across the different cell types (Fig. 3c). Overall, the correlation between the mass cytometry and scRNA-seq measurements was relatively weak (r² = 0.47–0.66). Taken together, these brief analyses suggest an imprecise concordance between scRNA-seq and mass cytometry measurements. This finding suggests the need for careful consideration in applications such as the identification of rare populations and cell states, which may be obscured when one data modality is used rather than another.
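A per-cell-type comparison of this kind reduces to a correlation between paired protein and transcript values. The sketch below assumes that the reported r² is the square of the Pearson correlation coefficient, which the text does not state explicitly. The function name `pearson_r_squared` and the toy values are hypothetical.

```python
import math


def pearson_r_squared(x, y):
    """Return the squared Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    r = sxy / math.sqrt(sxx * syy)
    return r * r


# Invented mean marker levels for one cell type: protein (CyTOF) vs. gene (scRNA-seq).
protein = [1.0, 2.0, 3.0]
rna = [1.0, 3.0, 2.0]
r2 = pearson_r_squared(protein, rna)
```

Each pair would be one marker, with the protein value taken from the transformed mass cytometry data and the gene value from the normalized scRNA-seq data for the same annotated cell type.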