Single-cell division tracing and transcriptomics reveal cell types and differentiation paths in the regenerating lung – Nature Communications

Scgb1a1
+ cell loss activates several cell populations in the distal lung

To better understand which cells play an active role in lung epithelial regeneration, we generated a mouse model that enables precise depletion of specific epithelial cell types and the isolation and detection of dividing cells. Specifically, we crossed Scgb1a1-CreER x Rosa26R-DTA mice9, in which tamoxifen administration induces DTA expression and subsequent apoptosis in Scgb1a1-expressing cells, with CycB1-GFP transgenic mice, which allows the detection and isolation of actively dividing cells throughout the body (including lung) in a functional manner29 (Fig. 1a). Specifically, these mice constitutively express CycB1-GFP, a fusion protein of the N-terminal portion of cyclin B1 and eGFP that behaves like full-length cyclin B1 in terms of expression during the cell cycle, i.e. it is degraded in G0/G1 phases so that cells are GFP negative (GFP), and it is stable in S/G2/M stages rendering cells GFP positive (GFP+). Scgb1a1 (also known as CC10) is physiologically expressed by club cells and BASCs9. Scgb1a1-CreER x Rosa26R-DTA x CycB1-GFP (SRC) mice were injected with tamoxifen, and lungs were harvested before (day 0) and 2 and 3 days afterwards. Immunofluorescence (IF) analysis confirmed the progressive loss of club cells and revealed an increase in the number of GFP+ dividing cells over time (Fig. 1b). On day 2, the majority of GFP+ cells were located in the airway epithelium, consisting of club cells that were not yet damaged but replicated to repair the epithelium. On day 3, dividing GFP+ cells appeared in the airways, underlying connective tissue, and alveoli (Fig. 1b).

Fig. 1: Characterization of dividing cells after targeted depletion of Scgb1a1+ cells.
figure 1

a Schematic of Scgb1a1-CreER, Rosa26R-DTA, and CycB1-GFP transgenes in SRC mice. b IF staining of SRC mouse lungs with Cyp2f2 (white, club cells), GFP (green, dividing cells), and DAPI (blue, nuclei), before (day 0) and after 2 and 3 days of a single tamoxifen injection. GFP+ alveolar cells (arrow heads) and GFP+ spindle-like cells (arrows) can be observed on day 3. Aw: airway. Scale bar: 100 µm. Images are representative of five independent experiments with n = 3 animals analyzed per timepoint. c Scheme of cell sorting and scRNA-seq strategy. Created with BioRender.com. df UMAP embedding of scRNA-seq data from GFP+ cells sorted from lungs of SRC mice two days (n = 3 mice) and three days (n = 2 mice) after tamoxifen injection. d Cell cycle phase distribution. e Normalized expression of Epcam (epithelial cells) and Col1a1 (mesenchymal cells). f Cell type assignment. g Heatmap of the top ten upregulated genes across cell populations ranked by power (roc test). Scaling of expression was done after downsampling to 100 cells per cell type. DEGs mentioned in the text are in bold. h Percentage of dividing cell types in the distal lung at day 2 and 3 after tamoxifen injection.

To further characterize the heterogeneous population of proliferating cells that may contain stem or progenitor cells, we performed scRNA-seq of viable (DAPI) GFP+ cells isolated by fluorescent-activated cell sorting (FACS) from the distal lung of SRC mice at day 2 and 3 after tamoxifen injection (Fig. 1c). After excluding endothelial and hematopoietic cells, GFP+ cells represented 0.034% and 0.16% of total cells on day 2 and 3, respectively (Supplementary Fig. 1a). After removing low-quality cells (see Supplementary Data 1 for filtering parameters), in silico cell cycle analysis confirmed that the majority of sorted GFP+ cells were dividing, with 42% in G2/M and 44% in S phase (Fig. 1d), including both epithelial (Epcam+) and mesenchymal (Col1a1+) cells (Fig. 1e). Proliferating mesenchymal cells were identified as adventitial (Dcn+) and alveolar (Inmt+) fibroblasts (Fig. 1f, g). Five of the seven epithelial cell clusters could be allocated to distinct cell types based on the expression of known marker genes: AT2, basal, club, ciliated, and goblet cells (Fig. 1f, g and Supplementary Data 2). IF confirmed the transcriptomic results identifying proliferating club, goblet, basal, and AT2 cells, as well as adventitial and alveolar fibroblasts (Supplementary Fig. 1b). The small population of non-dividing ciliated cells (Fig. 1f) was isolated during FACS due to high autofluorescence. Basal cells represented 39% of dividing cells on day 2, while AT2 cells accounted for the majority (73 %) of dividing cells on day 3 (Fig. 1h). This was unexpected since AT2 cells are only known to function as progenitors for AT2 and AT1 cells upon alveoli injury2,19.

Among the differentially expressed genes (DEGs) in dividing club (club_div) cells compared to all other dividing cell types was the club cell marker Krt7 (Fig. 1g)30, but other canonical club cell markers such as Scgb3a2 or Cyp2f2 were not consistently expressed (Supplementary Fig. 1c). In fact, lower expression of these genes was associated with higher expression of AT2 marker genes including Sftpc, Napsa, and Lamp3, suggesting priming of club_div cells for differentiation into AT2 cells (Fig. 1g and Supplementary Fig. 1c). Club_div cells showed no similarity to reported club cell progenitor populations including lineage-negative epithelial progenitors (H2-K1+, Sox2+, Itgb4+), variant club cells (Scgb3a2high, Cyp2f2, Upk3a+), or hillock club cells (Krt13+, Krt4+) (Supplementary Fig. 1c)23,31,32. Moreover, although 37% of club_div cells at day 3 co-expressed Sftpc and Scgb1a1, a characteristic feature of BASCs, they did not consistently express additional BASC marker genes described previously (Supplementary Fig. 1d)13,33. Additionally, Sftpc+ Scgb1a1+ cells were transcriptionally similar to the other club_div cells (p_adj <0.05, min.pct=0.2, test.use= “MAST”), suggesting that either most BASCs were destroyed by DTA, or that they were not proliferating at the timepoints analyzed.

We found that goblet cells (Agr2+, Ltf+, Scgb3a1+, Lypd2+), which also expressed Scgb1a1 (Supplementary Fig. 1e, Supplementary Data 2), were dividing in the injured lung (Fig. 1f–h and Supplementary Fig. 1b). This was surprising, since goblet cells have never been shown to be capable of self-renewal19,34. Additionally, a cell type expressing both basal (Krt5, Krt15) and goblet (Agr2, Lypd2, Ltf) cell markers was among the dividing cells (Fig. 1f, g, n = 56 cells). We named this population basal-goblet_div, which might represent an intermediate cell state between goblet and basal cells, and we confirmed that it was not an artifact caused by doublets (Supplementary Fig. 1f).

Finally, the remaining seventh cluster of dividing epithelial cells expressed DTA, indicating the presence of lung cells that survived the expression of this highly potent toxin (Supplementary Fig. 1g). These DTA+ cells clustered separately from club_div and goblet_div cells and had a distinct gene signature, although they expressed secretory cell genes such as Scgb3a2 (Fig. 1f, g and Supplementary Data 2). Importantly, DTA+ cells did not have a higher percentage of mitochondrial genes or a lower number of counts or genes per cell, supporting the notion that these are viable cells (Supplementary Fig. 1h).

In summary, Scgb1a1+ cell loss induces widespread cellular activation with immediate proliferation of several populations in the lung rather than triggering a specific cell type. Among the dividing cells were goblet, basal-goblet, and AT2-primed club cell populations, which might represent progenitors or intermediate cell states in the distal lung.

Adventitial fibroblasts support lung epithelial regeneration

Several mesenchymal cell types are known to support epithelial regeneration upon lung injury24,26,27. Conversely, mesenchymal cells can also give rise to aberrant populations following lung injury, such as Axin2+ Pdgfrα myogenic progenitors27 and Cthrc1+ fibroblasts28, which contribute to fibrosis. To analyze the behavior of mesenchymal cells during loss of Scgb1a1+ cells, we performed scRNA-seq of actively dividing (GFP+) and non-dividing mesenchymal cells from the distal lung of SRC mice (Fig. 2a).

Fig. 2: Characterization of mesenchymal cell activation after epithelial injury in SRC mice.
figure 2

a Scheme of cell sorting and scRNA-seq strategy. Created with BioRender.com. b UMAP embedding showing mesenchymal cell types sorted from uninjured SRC mouse lungs (n = 2 mice). c Dot plot showing the average scaled expression and percentage of cells expressing cell-type specific marker genes across the mesenchymal cell populations in uninjured lungs. d UMAP embedding showing the distribution of GFP+-sorted dividing mesenchymal cells on day 2 (n = 12 mice) and day 3 (n = 4 mice), and non-dividing total cells on day 2 (n = 3 mice) and day 3 (n = 2 mice). The pie chart depicts the percentage of the mesenchymal cell types among the diving cells (light blue: alveolar fibroblasts, purple: adventitial fibroblasts, gray: peribronchial fibroblasts, orange: pericytes, green: SM cells). e Lung epithelial organoid cultures with adventitial fibroblasts or alveolar fibroblasts as supporting cells, taken after three weeks of culture. The inset in the upper image shows the morphology of one organoid magnified twice as much as in the main figure. Images are representative of two independent experiments. f Heatmap of differentially expressed ligands among the different mesenchymal cell types on day 0. Upregulated genes in adventitial fibroblasts are bold. g Average normalized expression of Bmp4, Tgfb3, and Mif on day 0, 2, and 3 in all mesenchymal populations. Asterisks denote significant differential expression on day 2 or 3 compared to day 0, when considering FC ≥ 1.5 and p_adj ≤ 0.05 (MAST test, all DEG had a p_adj < 0.001). adv_fibro adventitial fibroblasts, alv_fibro alveolar fibroblasts, peribr_fibro peribronchial fibroblasts, SM_cells smooth muscle cells.

In the uninjured lung, we identified adventitial fibroblasts, alveolar fibroblasts, peribronchial fibroblasts, pericytes, and smooth muscle cells (Fig. 2b, c and Supplementary Data 3), annotated according to Tsukui et al. 28. The types of mesenchymal cells remained unchanged upon injury (Supplementary Fig. 2a), and none of the post-injury mesenchymal cell populations showed increased fibrosis-associated genes, suggesting that this mouse model does not induce lung fibrosis (Supplementary Fig. 2b)28,35. Analysis of the GFP+ cells confirmed our previous observation (Fig. 1e–h) that alveolar and adventitial fibroblasts are the major proliferating mesenchymal cell types after Scgb1a1+ cell depletion, accounting for 60% and 37% of the total mesenchymal dividing cells, respectively (Fig. 2d). Since alveolar fibroblasts were approximately eight times more abundant than adventitial fibroblasts, the high proportion of adventitial fibroblasts in the total dividing cells suggests a higher activation and a possible functional role of these cells. To test this, we performed organoid cultures with epithelial progenitor cells (EPCAMhigh CD24dim)36 in co-culture with adventitial fibroblasts (PDGFRA+ CD34+ SCA-1+) or alveolar fibroblasts (CD34 SCA-1 PDGFRA+ NPNT+) (Supplementary Fig. 2c). After three to four weeks of culture, alveolar and bronchoalveolar organoids, distinguished by their morphology, were formed with adventitial fibroblasts as supporting cells (mean, 29 organoids per well (range: 24–35); organoid size, 0.125–1.075 mm in diameter), but not with alveolar fibroblasts (Fig. 2e), reinforcing our hypothesis.

To learn more about the supporting role of adventitial fibroblasts, we examined the expression of genes encoding cell-cell contact proteins, extracellular matrix proteins, and secreted ligands from the CellChat ligand-receptor database37. Adventitial fibroblasts differentially expressed Cd34, Dcn, Col1a2, and Col1a1 in uninjured lungs (Fig. 2f), which was confirmed on protein level by mass spectrometry for CD34 and DCN, while COL1A1 was also high in alveolar fibroblasts (Supplementary Fig. 2d). After injury, adventitial fibroblasts significantly downregulated mRNAs of the secreted ligands Bmp4 and Tgfb3 (Fig. 2g). These signaling proteins are known regulators of the lung epithelium. For example, inhibition of BMP signaling induces proliferation of lung epithelial cells in vitro and in vivo38 and TGF-β has a cytostatic effect on lung epithelial cells39. Adventitial fibroblasts also significantly upregulated several pro-inflammatory factors after Scgb1a1+ cell depletion (Fig. 2g and Supplementary Fig. 2e), corroborating the communication between mesenchymal and immune cells in the lung40. Among those was Mif (Fig. 2g), a pro-inflammatory factor that was reported to stimulate AT2 cell proliferation in vitro41,42.

In summary, our data indicate that adventitial fibroblasts support epithelial repair as they proliferate extensively, produce pro-inflammatory factors, downregulate secreted ligands known to be cytostatic on lung epithelial cells, and support epithelial organoid growth in culture.

Post-injury transcriptional changes imply immune activation by club, AT2, and ciliated cells

A deeper understanding of the molecular changes in epithelial cells during lung regeneration can aid the development of therapies for lung diseases. We therefore characterized the cellular and transcriptional responses by scRNA-seq in dividing and non-dividing epithelial cells, sorted as viable (DAPI) Epcam+ cells from distal lungs of SRC mice without treatment and two, three, and four days after tamoxifen administration (Fig. 3a). During sorting, the very abundant AT2 cell numbers were reduced by approximately 90%, while all GFP+ epithelial cells were enriched (Fig. 3a and Supplementary Fig. 3a). The steady post-injury increase of GFP+ dividing cells was confirmed by FACS and IF (Supplementary Fig. 3a, b).

Fig. 3: Transcriptional changes in epithelial lung cells after Scgb1a1+ cell depletion.
figure 3

scRNA-seq analysis was done using lung epithelial cells from SRC mice harvested on day 0 (n = 2 mice), day 2 (n = 7 mice), day 3 (n = 5 mice), and day 4 (n = 5). a Scheme of cell sorting and scRNA-seq strategy. Created with BioRender.com. b UMAP embedding showing epithelial cell type (upper panels) and dividing and non-dividing epithelial cells (lower panels) on the indicated days. c Gene set enrichment analysis (GSEA) of upregulated genes in club cells on day 2, 3, and 4 compared to day 0 (MAST test, FC > 2, p_adj <0.05). The terms mentioned in the text are shown in bold. d Heatmap of genes from the”leukocyte chemotaxis” cluster of the GSEA in c showing the expression in club cells. e GSEA of upregulated genes in ciliated cells on day 2, 3, and 4 compared to day 0 (MAST test, FC > 2, p_adj < 0.05). The terms mentioned in the text are shown in bold. f Heatmap of genes from the”response to virus” cluster of the GSEA in e showing the expression in ciliated cells. g UMAP embedding showing the expression of H2-Aa and H2-Ab1 on day 0 and day 4 (p_ adj <0.05 in ciliated cells with MAST test).

We identified AT1, AT2, basal, club, goblet, ciliated, and DTA+ cells (Fig. 3b, Supplementary Fig. 3c, d, and Supplementary Data 4). Basal, club, and goblet cells were again the main cell types dividing early after epithelial injury, while AT2 cells became the dominant dividing cell type as injury progressed (Fig. 3b). We then characterized the transcriptional changes between the cells isolated from uninjured lungs and the cells at the three time points after tamoxifen injection for each cell type (Supplementary Data 5). In goblet, basal, and AT2 cells, the top-upregulated DEGs were predominantly associated with cell cycle (Supplementary Fig. 3e and Supplementary Data 5). While basal cells increased expression of these genes as early as day 2, their expression in AT2 cells did not increase until day 3, which is in accordance with AT2 cells starting to divide later (Supplementary Fig. 3f and Supplementary Data 5). In addition, AT2 cells also displayed an upregulation of genes associated with immune response and inflammation, such as Il33, Chia1, Cd14, Lgasl3, Ly6e, and Ly6c1, suggesting a role of these cells in immune activation (Supplementary Fig. 3f).

Club cells showed the highest number of DEGs (FC > 2, p_adj <0.05) post-injury at all time points, reaching more than 300 DEGs on day 4 (Supplementary Fig. 3g). Upregulated DEGs were mainly related to cell cycle, cytoplasmic ribosomal proteins, protein folding, mRNA processing, and genes associated with leukocyte chemotaxis (e.g. Ccl20, Cxcl15, Cxcl5), the latter suggesting a role of club cells in immune activation (Fig. 3c, d and Supplementary Data 5).

Ciliated cells elevated the expression of interferon-stimulated genes (Fig. 3e, f and Supplementary Data 5), which are involved in viral response mechanisms and are crucial in driving and maintaining lung inflammation, for example, by promoting antigen presentation or stimulating cytokine production43. Ciliated cells also upregulated genes implicated in antigen processing and presentation, including MHC class II molecules (MHCII), which are normally found exclusively in professional antigen-presenting cells44. Unlike AT2 cells, which express high levels of MHCII independent of inflammatory stimuli (Fig. 3g)45, ciliated cells demonstrated very low expression of these molecules in homeostasis (Fig. 3g). Thus, the upregulation of MHCII molecules in ciliated cells after epithelial damage suggests that they might function as antigen-presenting cells upon lung epithelial injury.

Together, these data suggest that club, AT2, and ciliated cells may contribute to the initiation of an inflammatory response following lung injury.

The DTA+ cell transcriptome reveals a COVID-19 lung cell population

Genetic injury mouse models using inducible DTA expression have been widely used since they allow for rapid and controlled cell type-specific ablation46. We found that viable DTA+ cells can persist in the injured tissue and express a distinct gene expression signature (Fig. 1g), which may provide insights into mechanisms of cellular response to pathological conditions. To characterize these cells in more detail, we further enriched rare cell types by partially depleting the most abundant cell types, i.e. AT2 and ciliated cells, by ~90%, and used a more sensitive 3′ RNA sequencing protocol (Fig. 4a and Supplementary Fig. 4a). In addition, GFP+ epithelial cells were sorted in their totality in samples from day 0 and day 2 and added before sequencing (Fig. 4a).

Fig. 4: Characterization of DTA+ cells and comparison to lung cells from COVID-19 patients.
figure 4

a Scheme of cell sorting and scRNA-seq strategy. Created with BioRender.com. b UMAP embedding showing epithelial cell type assignment before (day 0, n = 8 mice) and after tamoxifen administration (day 2, n = 10 mice). c Violin plot with normalized DTA expression in all epithelial cells. d Heatmap of the top five upregulated genes across epithelial cell types on day 2 ranked by power (roc test). Scaling of expression was done after downsampling to 100 cells per cell type. Black boxes highlight the expression of the marker genes of the closest epithelial population in DTA+ cells. e GSEA of common DEGs in DTA+ populations in relation to all non-DTA+ cells (MAST test, p_adj <0.05). The terms mentioned in the text are shown in bold. f Heatmap of the activities of 14 pathways inferred with PROGENY for all epithelial populations on day 2. g UMAP embedding showing cell type assignment of COVID-19 and control human epithelial cells integrated with SRC mouse epithelial cells (day 0 and day 2). Arrows point to DTA+-like cells present in the COVID-19 patient samples. h Heatmap of the top 40 DEGs in DTA+-like cells from COVID-19 epithelial cells. i GSEA of upregulated genes in the DTA+-like cells (MAST test, p_adj < 0.05). The terms in bold depict the clusters that are mentioned in the text.

Unsupervised clustering revealed the same dividing and non-dividing epithelial cell types as described above, neuroendocrine cells, and five clusters of DTA+ cells (Fig. 4b, c and Supplementary Fig. 4b). Again, DTA+ cells did not have a higher percentage of mitochondrial genes or a lower number of counts or genes per cell, supporting the notion that these are viable cells (Supplementary Fig. 4c). Tamoxifen-treated Rosa26R-DTA x CycB1-GFP control mice lacked DTA+ cells (Supplementary Fig. 4d) and had unchanged lung morphology, number of GFP+ dividing cells, and number and distribution of club and AT2 cells compared with normal lung (Supplementary Fig. 4e), ruling out non-specific tamoxifen effects.

To determine the cells of origin or the closest relatives of the DTA+ populations, we calculated the correlation coefficient of gene expression between the different cell types on day 2 (Supplementary Fig. 4f). One DTA+ cell cluster showed high expression of AT2 canonical marker genes (Fig. 4d). We hypothesized that these cells, which we named DTA+_Sftpc+, were either derived from a Scgb1a1+ AT2 subpopulation9 and could therefore activate DTA expression (Supplementary Fig. 4g), or from club cells that differentiated into AT2 cells47. Two clusters transcriptionally resembled secretory cells and were accordingly named DTA+_club and DTA+_goblet, and DTA+ dividing cells included a mixture of cells resembling goblet and club cells, so we refer to them as DTA+ secretory dividing (DTA+_sec_div) (Fig. 4d and Supplementary Fig. 4f). The remaining cluster, which we termed DTA+_Foxj1+, expressed ciliated cell genes (Fig. 4d and Supplementary Fig. 4f) and likely evolved from club cells that differentiated into ciliated cells9. A complete list of the marker genes for each population at day 2 is provided in Supplementary Data 6.

Besides their cell type-specific gene signatures, the DTA+ populations shared gene expression patterns and had several marker genes in common (Fig. 4d). The co-upregulated genes showed a prominent enrichment in mRNA processing genes, including mRNA splicing-related genes (Fig. 4e and Supplementary Data 7). In addition, DTA+ populations displayed upregulation of genes involved in ribonucleoprotein complex biogenesis, cellular response to stress, induction of apoptosis, and downregulation of protein processing-related genes (Fig. 4e and Supplementary Data 7), which could be a consequence of DTA-mediated protein synthesis inhibition48. Curiously, all DTA+ populations, apart from DTA+_Foxj1+ cells, displayed increased levels of Cftr, a transmembrane conductance regulator that is mutated in cystic fibrosis (Supplementary Fig. 4h). This gene is highly expressed by ionocytes found in the proximal airways of human and mice11,31. However, DTA+ cells did not express other ionocyte marker genes, implying that these are different cell types (Supplementary Fig. 4h). Finally, DTA+ cells showed an enrichment in genes involved in pro-inflammatory TNF-alpha/NF-κB signaling and circadian rhythm, which could influence the inflammatory response49 (Fig. 4e and Supplementary Data 7). A complementary pathway activity analysis based on core pathway responsive genes confirmed that DTA+ populations had increased activation of several inflammatory pathways such as TNF, NF-κB, MAPK, and EGFR (Fig. 4f).

Considering these specific transcriptional changes in DTA+ cells, we hypothesized that they might mirror characteristics of damaged cells in the context of human lung diseases. Using gene signatures from DisGeNET50 and MSigDB51,52 databases for several human lung diseases, we found that DTA+ cells had particularly high enrichment for genes expressed in epithelial lung cells from COVID-19 patients53 (Supplementary Fig. 4i). Integration of our murine dataset with a human scRNA-seq dataset from a COVID-19 study4 revealed an epithelial cell population unique to COVID-19 samples, which we named DTA+-like cells (Fig. 4g and Supplementary Fig 4j–l). This cell population showed differential upregulation of genes involved in TNF and NF-κB signaling, supporting the similarity to DTA+ cells (Fig. 4h, i and Supplementary Data 8).

Together, epithelial cells expressing DTA can persist in the lung and share a distinct gene expression profile that exhibits activation of inflammatory signaling pathways and remarkable transcriptional similarity to a previously undescribed lung cell population in patients with COVID-19.

DTA+ cells signal to immune, epithelial, and mesenchymal cells

Inflammatory pathways, such as the ones described above, can be activated in epithelial cells by microbial components and lead to the initiation of an immune response through chemokine secretion54. Upregulated genes in the COVID-19 DTA+-like population were enriched for genes related to cytokine signaling in the immune system (Fig. 4i), and several chemokines were among the top 40 DEGs (Fig. 4h), suggesting that the identified COVID-19 cell population can contribute to the inflammatory response in these patients. To investigate this further, we used the murine DTA+ cells as a model for the human COVID-19 cell population and investigated their potential role in immune cell activation with focus on chemokines.

After Scgb1a1+ cell depletion, chemokine expression increased only slightly in regular epithelial cells, with the exception of Ccl20 in club cells and Cxcl17 in club_div and ciliated cells (Fig. 5a and Supplementary Fig. 5a). In contrast, DTA+ cells were a major source of chemokine production, and DTA+_club and DTA+_Foxj1+ even dedicated a large proportion of their total transcriptome to chemokine production (2% in DTA+_club at day 2 vs. 0.7% in club cells at day 0; 0.4% in DTA+_Foxj1+ at day 2 vs. 0.1% in ciliated cells at day 0) (Fig. 5a and Supplementary Fig. 5b). Among the chemokines significantly upregulated in at least one DTA+ cell type, but not in any regular epithelial cell type, were Ccl9, Ccl20, Cxcl1, Cxcl2, Cxcl3, Cxcl10, and Cxcl16 (Fig. 5b). Ccl9, Cxcl2, and Cxcl16 were shown to be expressed in immune cells during homeostasis, whereas Ccl20, Cxcl1, Cxcl3, and Cxcl10 are barely expressed in any cell type of the lung under healthy conditions (Supplementary Fig. 5c)55, suggesting that they are specifically produced under damage-inducing conditions. These DTA+-specific chemokines are known to recruit neutrophils (CXCL1, CXCL3), dendritic cells (CCL20), T cells (CCL20, CXCL10), and B cells (CCL20)56.

Fig. 5: Chemokine expression and crosstalk to mesenchymal and epithelial cells of DTA+ cells.
figure 5

a Chemokine expression in epithelial lung cells from SRC mice before (day 0) and after injury (day 2). Genes with a mean expression of at least three counts per 10,000 in any cell type are colored. b Expression of chemokines upregulated in DTA+ cells compared to other epithelial cells after injury (MAST test, FC > 2 and p_adj < 0.05 in any DTA+ cell type). Asterisks denote p_adj value for significant DEGs (***p_adj < 0.001, *p_adj = 0.01). The percentage of cells expressing the gene are plotted against the average expression per cell type. c Ranking of significant signaling pathways based on differences of overall information flow (i.e. summed interaction strengths) within the inferred networks on day 0 and day 2. Left: relative information flow. Middle: absolute information flow (log scale). Right: zoom in on the absolute information flow of pathways with log(information flow) <0.1 at any time point. Signaling pathways are colored by the type of interaction (yellow: secreted signaling, purple: ECM-receptor, green: cell-cell contact). dg Hierarchical plots showing the inferred communication network for EDN (d), IL6 (e), LIFR (f) and EGF (g) signaling. Circle sizes are proportional to the number of cells in each cell type, and line width corresponds to the interaction strength. The relative contributions of ligand-receptor pairs making up at least 1% of the pathway strength are shown on the right.

To characterize the interactions between epithelial and mesenchymal cells, we analyzed the ligand-receptor expression of all epithelial and mesenchymal cells in lungs of SRC mice at day 0 and day 2. After injury, new or increased signal transmission occurred in several pathways (Fig. 5c), mainly as a result of new ligand-receptor pairings involving DTA+ epithelial cells (Supplementary Fig. 5d, e). Specifically, signaling networks involved in the formation of tight junctions such as MARVELD, OCLN, JAM, and CLDN were activated in epithelial cells, with particularly high levels in DTA+ cell types (Fig. 5c and Supplementary Fig. 5f). Tight junctions are crucial during lung repair as they are responsible for sealing the epithelial barrier and preventing the entry of pathogens and the free diffusion of solutes into airspace57. Edn1 was also specifically expressed in DTA+ epithelial cells and was predicted to signal to alveolar fibroblasts, adventitial fibroblasts, smooth muscle cells, and pericytes (Fig. 5d and Supplementary Fig. 5f, g). EDN1 was shown to stimulate fibroblast replication, migration, and collagen synthesis58, suggesting a role of DTA+ cells in the activation of mesenchymal cells after epithelial injury. Finally, we found that DTA+ cells highly expressed several ligands that can signal to other epithelial cells and support lung regeneration (Fig. 5e–g and Supplementary Fig. 5f, g). Specifically, IL6, predicted to signal from DTA+_Sftpc+ cells to AT1, basal, goblet, and ciliated cells, was found to be crucial for lung repair after influenza-induced lung injury in mice59. LIFR pathway, which was previously shown to protect lung tissue during pneumonia in mice60, was predicted to be activated in AT2 and club cells by DTA+ cells through the expression of LIF. DTA+ cells also expressed high levels of Areg and Hbegf that can signal to basal cells through EGFR, previously shown to be essential for basal cell proliferation61.

Overall, DTA+ cells express specific chemokines and growth factors that can trigger an immune response and contribute to lung regeneration through epithelial and mesenchymal cell support.

Scgb1a1
+ and Sftpc
+ cell loss unveils Krt13
+ basal and Krt15
+ club cells

We next characterized the consequences of increased epithelial damage in the peripheral lung through additional injury of AT2 cells. Therefore, we crossed SRC mice with mice expressing tamoxifen-inducible Cre recombinase in AT2 cells under the Sftpc promoter25. The resulting Sftpc-CreER x Scgb1a1-CreER x Rosa26R-DTA x CycB1-GFP (SfSRC) mice (Fig. 6a) allow simultaneous damage of Scgb1a1+ and Sftpc+ cells after tamoxifen administration, thus of both bronchiolar and alveolar progenitor cells, and parallel analysis of GFP+ dividing cells. IF analysis showed that Scgb1a1+ and Sftpc+ cells progressively disappeared from day 2 to 14, and numerous dividing cells emerged in the airways, underlying tissue, and alveoli (Fig. 6b). Airway regeneration was not complete at day 28 based on cell morphology and Cyp2f2 club cell staining, despite only few dividing cells remaining, but was finally achieved at day 56 (Fig. 6b). Lung morphology on days 14, 28, and 56 showed no evidence of fibrosis (Fig. 6b), and Masson staining at day 14 confirmed the absence of collagen deposition (Supplementary Fig. 6a). A mixed inflammatory infiltrate composed mainly of lymphocytes and plasma cells and a few neutrophil granulocytes, with perivascular and peribronchial accentuation, was observed on day 4. This pattern became more pronounced on day 14, and, in addition, the lungs exhibited an accumulation of intra-alveolar macrophages in the form of foam cells, which was absent in controls (Supplementary Fig. 6b).

Fig. 6: Characterization of lung epithelial cells in the SfSRC mouse model.
figure 6

a Schematic of Sftpc-CreER, Scgb1a1-CreER, Rosa26R-DTA, and CycB1-GFP transgenes in SfSRC mice. b IF staining of SfSRC mouse lungs with Cyp2f2 (white; club cells), Lamp3 (red; AT2 cells), CycB1-GFP (green; dividing cells) and DAPI (blue; nuclei), on days 0, 2, 3, 14, 28 and 56 after tamoxifen injection. Aw: airway. Scale bar: 100 µm. Images are representative of three independent experiments with n = 3 animals analyzed per timepoint. c Scheme of cell sorting and scRNA-seq strategy. Created with BioRender.com. d UMAP embedding showing cell type assignment of epithelial cells before (day 0, n = 5 mice), and two (n = 9 mice) and three days (n = 6 mice) after tamoxifen injection. At day 2, basal_Krt13+ and club_Krt15+ cells are highlighted. e Dot plot showing normalized enrichment score (NES) and significance for published signatures of epithelial cell types in all SfSRC mouse epithelial populations (fgsea package91, all timepoints merged). DATP damage-associated transient progenitors, PATS pre-alveolar type-1 transitional cell state, RAS respiratory airway secretory, BC basal cell, TRB-SC terminal and respiratory bronchioles secretory cell, pre-TB-SC pre-terminal bronchiole secretory cell, AEP alveolar epithelial progenitor, BASC bronchioalveolar stem cell. Cell types described in the human lung are depicted in red. f Heatmap of the top 20 marker genes for basal_Krt13+ and club_Krt15+ cells, and expression of Krt15, Krt13, and Krt4, in basal and club cells of SfSRC mice (roc test, all three timepoints merged).

We analyzed lung epithelial cells from SfSCR mice with scRNA-seq before (day 0) and 2 and 3 days after tamoxifen injection (Fig. 6c). Cells were sorted analogously to the SRC model (Supplementary Fig. 4a) to enrich viable, rare EPCAM+ and dividing GFP+ cells (Supplementary Fig. 6c). We were able to distinguish all previously identified cell types and two additional small clusters (Fig. 6d, Supplementary Fig. 6d–f, and Supplementary Data 9). One cluster comprised Krt13+ basal cells (n = 88 cells; named basal_Krt13+) and the other Krt15+ club cells (n = 19 cells; named club_Krt15+) (Fig. 6d). Projection of cells from SfSRC mice onto cells from SRC mice and integration of all datasets generated in this study showed that basal_Krt13+ cells were present in both models, while club_Krt15+ cells were unique to the SfSRC model (Supplementary Fig. 6g, h). Since club_Krt15+ cells were found in the SfSRC model only at day 2 and we cannot distinguish from which of the nine mice used at this time point these cells originated, we cannot rule out the possibility that they were derived from a single mouse.

To systematically assess the similarity of basal_Krt13+ and club_Krt15+ cell populations with reported cell types, we performed fast gene set enrichment analysis (FGSEA) with gene signatures of canonical lung epithelial cell types and recently described lung epithelial cells7,11,13,14,31,33,55,62,63,64. Although Krt13+ basal cells were previously described in the mouse trachea as basal-secretory cells11, our basal_Krt13+ cells were transcriptionally distinct (Fig. 6e). Instead, they closely resembled SFTPC KRT5+ basal cells found in human airways7 (Fig. 6e). Club_Krt15+ cells were transcriptionally similar to mouse hillock club cells, despite the fact that they did not consistently express the main marker genes Krt4 and Krt13 (Fig. 6e, f). Since hillock club cells were found exclusively in the trachea31, and we removed trachea and main bronchi before performing scRNA-seq, our data indicate that hillock-like cells can be present in the distal airways.

Finally, we examined whether basal-goblet cells identified only in the separate analysis of GFP+ cells (Fig. 1) were also present in the other analyses. By projecting the cells from each scRNA-seq dataset onto the cells identified in Fig. 1, we found that basal-goblet cells (dividing and non-dividing) were indeed present in the SfSRC model dataset (Supplementary Fig. 6i) but were assigned to basal or goblet cells during clustering. Additionally, basal-goblet cells from the SfSRC model are located between basal and goblet cells in the UMAP embedding (Supplementary Fig. 6j), expressed both basal and goblet marker genes, and had a similar transcriptional profile to basal-goblet_div from Fig. 1 (Supplementary Fig. 6k), supporting the idea that basal-goblet cells represent an intermediate state between these two cell types.

Overall, the combined depletion of Scgb1a1+ and Sftpc+ cells revealed two cell types, basal_Krt13+ resembling human SFTPC KRT5+ basal cells, and club_Krt15+ with similarity to hillock club cells.

Trajectory modeling predicts goblet cell differentiation into basal and club cells

Finally, we explored the relations and predicted biological trajectories between the identified epithelial cell types in response to Scgb1a1+ and Sftpc+ cell loss. To get a better understanding of the data topology in the SfSRC model, we visualized the cells before and after injury in a force-directed graph (Fig. 7a) and summarized the connectivities with partition-based graph abstraction (PAGA) (Fig. 7b). The edges in the PAGA graph represent possible differentiation paths, which we analyzed using diffusion maps of different cell type subsets to obtain finer resolution.

Fig. 7: Differentiation trajectories between epithelial lung cells in SfSRC mice.
figure 7

a Force-directed graph of single cells with edges to the 10 nearest neighbors. b PAGA graph summarizing the connectivities between cell types. The thickness of the lines represents connectivity between cell types. c Diffusion map of basal, goblet and club cells with pseudotime trajectory colored by cell type. d Diffusion map of basal and goblet cells with RNA velocity vectors indicating differentiation of goblet into basal cells. e Diffusion map of basal and goblet cells with pseudotime of cells selected to characterize the goblet to basal cell trajectory in f. f Smoothed expression heatmap of the top 1000 altered genes along the differentiation trajectory from goblet to basal cells. The names of the top five genes in each cluster are annotated and transcription factors are indicated in orange. All transcription factors of each cluster are listed on the right. g Normalized expression along the goblet to basal cell pseudotime trajectory for exemplary genes. The black line is the smoothed expression using local polynomial regression fitting, with the 95% confidence interval shown in gray. h Diffusion map of club, AT2, and DTA+_Sftpc+ cells with pseudotime trajectory colored by cell type. i Diffusion map of club and AT2 cells with RNA velocity vectors indicating that dividing club cells give rise to club cells and AT2 cells. j Diffusion map of club, AT2, and DTA+_Sftpc+ cells with pseudotime trajectory colored by pseudotime of the cells selected to characterize the club to AT2 cell trajectory in k. k Smoothed expression heatmap of the top 1000 altered genes along the differentiation trajectory from club to AT2 cells. l Normalized expression along the club to AT2 cell pseudotime trajectory for exemplary genes. Smoothed expression and confidence interval are shown as in g. Figures were calculated on merged data from all timepoints (day 0, 2, and 3). m Proposed model of differentiation routes (straight arrows) and self-renewal capacities (curved arrows) of lung epithelial cells based on data and mouse models in this study. Created with BioRender.com. PT pseudotime, SE scaled expression.

According to the basic lineage model of the lung epithelium, basal cells are considered to be progenitors of club cells, which in turn give rise to goblet and ciliated cells34. However, in the PAGA map, basal cells were strongly connected to goblet cells and not to club cells (Fig. 7b). A diffusion map restricted to basal, goblet, and club cells confirmed that goblet cells serve as bridge between basal and club cells (Fig. 7c). The directionality inferred by RNA velocity even suggests that goblet cells give rise to basal cells (Fig. 7d and Supplementary Fig. 7a), which is in accordance with a recently described dedifferentiation trajectory from goblet to basal cells following bleomycin lung injury65. Basal-goblet cells, identified by projecting SfSRC cells onto GFP+ SRC cells (Supplementary Fig. 6i), were distributed between goblet and basal cells in the diffusion map, reinforcing that they are an intermediate cell type between goblet and basal cells (Supplementary Fig. 7b). We modeled a pseudotime trajectory from the tip of the goblet cell cluster and calculated DEGs along the goblet-basal and goblet-club axes separately. The transition from goblet to basal cells is primed by a decrease of the goblet cell markers Muc5b, Bpifb1, Scgb3a1, Scgb3a2, and Bpifa1, and an increase of the basal cell markers Aqp3, Krt15, Krt5, and Dcn (Fig. 7e–g and Supplementary Data 10).

Next, we followed the goblet cell trajectory to club cells (Supplementary Fig. 7c–f). The key transcription factors required for goblet cell differentiation and mucus production Spdef and Foxa366 are among the first downregulated genes, followed by goblet cell markers including Agr2, Bpfib1, Muc5b, and Scgb3a1 (Supplementary Fig. 7e, f and Supplementary Data 11). At the same time, the expression of club cell markers such as Cckar, Scgb1c1, and Sftpb increased. As in the goblet-basal cell trajectory, the differentiation between goblet and club cell clusters occurs without cell division (Fig. 7d and Supplementary Fig. 7c). PAGA and diffusion maps also showed a strong connection between DTA+_goblet and DTA+_club cells, suggesting that DTA+_goblet cells differentiate into club cells similar to uninjured goblet cells (Fig. 7b and Supplementary Fig. 7g).

Club cells are known to give rise to ciliated cells9. We also observed a trajectory from club to ciliated cells, which occurred almost exclusively through DTA+_club (Supplementary Fig. 7h, i), suggesting that differentiation into ciliated cells occurs preferentially through stressed club cells in our mouse model. While the expression of club cell markers is gradually reduced during the transition to DTA+_club and completely lost in DTA+_Foxj1+ cells, the expression of ciliated markers already started to increase in DTA+_club cells (Supplementary Fig. 7j–l and Supplementary Data 12).

Finally, club_div cells were predicted to give rise to both club and AT2 cells (Fig. 7h, i). As observed in SRC mice, club_div cells had decreased secretory cell markers, such as Scgb3a2 and Scgb1c1, and increased AT2 markers, such as Napsa and Lamp3 (Fig. 7j–l and Supplementary Data 13). Cldn18, a tight junction protein highly expressed in the alveolar epithelium, was also among the top DEGs upregulated in club_div, corroborating the club to AT2 cell transition (Fig. 7l).

Taken together, these data suggest secretory cells as the main epithelial progenitor cell population in the distal lung following Scgb1a1+ and Sftpc+ cell loss. In particular, goblet cells, previously thought to be terminally differentiated, are predicted to be progenitors for basal and club cells, and club_div cells for club and AT2 cells. Stressed (DTA+) club cells seem to be driven to differentiate into ciliated cells (Fig. 7m).