Microbial domestication and application
The foundational microbial community was obtained by domesticating aerobic sludge from coking plants in shake flasks, with phenol as the sole carbon source. Over nine generations of transfers, the consortium progressed through successive stages: small-scale water tests (BF), gas tests (MUB), water treatment pilots (Water), field trials in Nantong (Eng), and engineering applications in Shandong (CH) and Zhejiang (HMAJ). In the small-scale water test stage, using coking wastewater from a Tianjin plant, the microbial community (BF) showed strong biofortification capability, reducing COD from 905 mg/L to 675 mg/L within a day and achieving a 25% phenol degradation rate; formaldehyde was not detected in the coking wastewater (Supplementary Fig. 1).
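The removal and degradation rates quoted throughout this section are presumably inlet/outlet ratios. The following is a minimal sketch of that calculation, assuming the usual definition (inlet minus outlet, divided by inlet); the function name `removal_efficiency` is illustrative and does not come from the study.

```python
def removal_efficiency(inlet: float, outlet: float) -> float:
    """Percent of a pollutant removed between inlet and outlet.

    Assumes the conventional definition (inlet - outlet) / inlet * 100;
    the source does not spell out its formula.
    """
    if inlet <= 0:
        raise ValueError("inlet concentration must be positive")
    return (inlet - outlet) / inlet * 100.0


# COD values reported for the BF small-scale water test (mg/L).
cod_removal = removal_efficiency(905.0, 675.0)
print(f"COD removal: {cod_removal:.1f}%")
```

The same function applies to phenol or formaldehyde concentrations whenever paired inlet and outlet values are available.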
To further enhance the phenol and formaldehyde metabolic capabilities of the BF microbial community, an acclimation process was conducted using an industrial phenol-formaldehyde waste gas treatment system. The waste gas source contained detectable levels of phenol and formaldehyde, and the total concentration of these pollutants was represented by the system’s inlet gas concentration. The acclimation process spanned 48 days, during which the inlet gas concentration fluctuated significantly. For instance, on April 2, the inlet gas concentration varied between 68 mg/m³ and 177 mg/m³ across five time points. Considerable variation was also observed between days, with the highest inlet gas concentrations recorded at 261 mg/m³ on April 11, 321 mg/m³ on April 13, and 57 mg/m³ on April 15 (Supplementary Fig. 2). These fluctuations in gas concentration, driven by operational conditions, created a dynamic environment conducive to selecting microorganisms capable of adapting to drastic changes. As illustrated in Supplementary Fig. 2, over the initial 18 days, removal rates varied between 5.7% and 72%, stabilizing around 45% thereafter. At the sampling point, a 75% phenol degradation rate and a 100% formaldehyde removal rate were achieved (Supplementary Fig. 2).
For the water treatment pilot, with the inlet COD ranging from 15880 mg/L to 36280 mg/L over a 10-day retention time, the microbial community (Water) achieved a 99.6% phenol degradation rate and a 95.0% formaldehyde removal rate at the sampling point (Supplementary Fig. 3). The Water microbial inoculum was subsequently applied in a field application in Nantong, treating phenolic wastewater with an inlet COD ranging from 10660 to 40000 mg/L over a 15-day retention time. At the sampling point, a 98.9% phenol degradation rate and a 99.0% formaldehyde removal rate were recorded, and the microbial community was denoted as Eng (Supplementary Fig. 4). The Eng microbial inoculum was then applied to the treatment of phenolic wastewater in CH and HMAJ. With inlet COD of approximately 40000 mg/L and 13500 mg/L, respectively, both sites exhibited removal rates surpassing 96.0%. Phenol degradation rates of 93.7% in CH and 97.8% in HMAJ, coupled with formaldehyde removal rates exceeding 99.8%, underscored the efficacy of the BCO treatment. The water treatment effect in HMAJ is shown in Supplementary Fig. 5; data for CH are omitted for reasons of manufacturer confidentiality. Detailed information on the six phenolic pollution treatment systems from which the six samples were acquired is listed in Supplementary Table 2 and Fig. 1. The inlet COD increased from the BF stage through the MUB, Water, and Eng stages (Fig. 1b), and the COD, phenol, and formaldehyde removal efficiencies increased sharply in the MUB stage and remained steady in the following stages (Fig. 1c), indicating stable and high pollutant removal efficiency by the microbial community. The pollutant ratio analysis showed a decreasing proportion of phenol and an increasing proportion of formaldehyde and other pollutants in the input wastewater across stages (Fig. 1d), which demonstrated the complexity of actual industrial phenolic wastewater.
a The operation mode of the six wastewater treatment systems. b The inlet COD, phenol, and formaldehyde (HCHO) concentrations of the input wastewater of the six systems at the sampling timepoint. c The daily COD load and the COD, phenol, and formaldehyde removal efficiencies of the wastewater systems at the sampling timepoint during stable system operation. d The ratio of phenol (PHENOL_RA_FOR_COD), formaldehyde (HCHO_RA_FOR_COD), and other pollutants (OTHER_POLLUTES_RA_FOR_COD) to the total pollutants in the wastewater. Since each system operates independently and without replication, the data in the figure are the parameters on the day of sampling rather than the average of three replicates.
Microbial community composition and diversity variation
PCoA based on 16S rRNA V4 OTU sequences indicated that three clusters of similar microbial communities occurred across the six stages. Samples of BF and MUB clustered together, samples of Water clustered separately, and samples of Eng, CH, and HMAJ clustered together (Fig. 2a). The ANOSIM and PERMANOVA tests of communities from the three clusters showed significant differences under the weighted UniFrac distance metric (ANOSIM, r = 0.8814, P = 0.001; ADONIS, R2 = 68.4%, P = 0.001). According to the β-diversity variation, the community was classified into three improving stages: Phase1, including the water and gas domestication stages (BF; MUB); Phase2, including the water pilot-scale test stage (Water); and Phase3, including the application stages (Eng; CH; HMAJ). The community α-diversity was unstable and relatively low in Phase1 and Phase2, whereas in the application stages of Phase3 it remained steadily higher (Fig. 2b). The communities in BF and MUB samples were mainly composed of Gammaproteobacteria, whereas in Water samples Betaproteobacteria was the dominant taxon. Interestingly, community composition in Phase3 was more homogeneous, with Gammaproteobacteria, Alphaproteobacteria, Betaproteobacteria, Firmicutes, Euryarchaeota, and Deltaproteobacteria as the dominant taxa (Fig. 2c).
Community composition at the genus level showed that members of Enterobacteriaceae and Acinetobacter accounted for 38.77% and 28.89%, respectively, in Phase1, and Comamonas accounted for 64.94% in Phase2. In Phase3 the genera were more evenly distributed: most of the dominant members (Desulfomicrobium, Methanolobus, Acetobacterium, Novosphingobium, Desulfovibrio, Pseudomonas, Paracoccus, and members of Pseudomonadaceae, Xanthobacteraceae, Rhodocyclaceae, SBR1031, Rhizobiales, Methylophilaceae, Rhizobiaceae, Ruminococcaceae, and Enterobacteriaceae) each accounted for 1–5%, except for Hyphomicrobium, which accounted for 10.07%, and ASSO-13, which accounted for 11.20% (Fig. 2d).
a PCoA results of the microbial communities from the six stages. b The α-diversity of the microbial communities in the six stages. c Heatmap of the community composition of different samples in the six stages at the phylum level (taxa of Proteobacteria were grouped at the class level). d Bubble diagram of the taxonomic distribution among the three phases during the six stages (taxa with relative abundance above 2% in at least one sample). Phase1 includes the BF and MUB stages, Phase2 includes the Water stage, and Phase3 includes the Eng, CH, and HMAJ stages. The circle size represents the average relative abundance of the corresponding taxa in samples of each phase.
Microbial community assembly process during different stages
Having examined the shifts in composition and diversity across stages based on the 16S rRNA V4 OTU sequences, we next analyzed the factors shaping community structure, referred to as the community assembly process. We examined the assembly mechanisms of distinct microbial groups, or ‘bins’, during different stages, aiming to discern the interplay between deterministic and stochastic processes in community assembly. The analysis spanned three overarching phases (Phase1-3) and six specific stages (BF, MUB, Water, Eng, CH, HMAJ), during which 452 Operational Taxonomic Units (OTUs) were classified into 44 phylogenetic bins (the OTU table and the “bin” information are listed in sheet1 and sheet2 of Supplementary Data 1). The most abundant OTU in each bin (sheet3 in Supplementary Data 1) was selected to construct the phylogenetic tree (Fig. 3b), and its relative abundance was used to trace the correlation between the bin and water traits (Fig. 3a) and to display the bin relative abundance (Fig. 3c). The community assembly analysis was based on all the OTUs in each bin (Fig. 3d). A majority of these bins were significantly influenced by stochastic processes. The relative importance of each process in governing the turnover of the whole community during different phases likewise showed that stochastic processes accounted for the largest share, with dispersal limitation as the main process (Fig. 4a). There were 27 bins significantly correlated with water traits, of which 18 bins showed a discernible impact from deterministic processes, encompassing both heterogeneous and homogeneous selection (sheet4 in Supplementary Data 1). The prevalent taxa within these 18 bins, which correlated significantly with water traits and were influenced by deterministic processes, are shown in detail in Supplementary Fig. 6.
The relative relevance of the different community assembly processes during the three phases is shown in Fig. 4a, which also shows the dominant influence of stochastic processes across phases. At the level of individual bins, the number of bins significantly influenced by each of the five assembly processes across the three phases is depicted in Fig. 4b. A striking observation is the absence of bins significantly affected by deterministic assembly processes during Phase3, where 33 bins were impacted by dispersal limitation and 4 bins by drift. Compared to Phase1 and Phase2, as well as the transitions between these phases, Phase3 exhibited a higher prevalence of bins influenced by stochastic assembly processes. Notably, four bins (bin10, bin18, bin30, bin40) were significantly affected by homogeneous selection during both Phase1 and Phase2, while two bins (bin15, bin19) were significantly influenced by heterogeneous selection during the transition from Phase2 to Phase3. Figure 4c provides detailed taxonomy information and the relative abundance of the OTUs within the six bins, featuring members of Methanolobus, Methanomethylovorans, Hydrogenophaga, Comamonadaceae, Cyanobacteria, Isosphaeraceae, Phycisphaerales, Mycobacterium, Leucobacter, and Corynebacterium lubricantis.
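The five assembly processes named above are commonly assigned with a phylogenetic null-model rule applied to each pairwise community comparison within a bin. The sketch below illustrates that rule under stated assumptions: the cutoffs |βNRI| > 1.96 and |RC| > 0.95 are the values widely used in this framework and are not given in this section, and the unweighted counting in `process_fractions` is a simplification (bin-based workflows typically weight by bin abundance). All function names are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

PROCESSES = ("HeS", "HoS", "DL", "HD", "DR")


def classify_turnover(beta_nri: float, rc: float) -> str:
    """Assign one pairwise turnover to an assembly process.

    beta_nri: phylogenetic null-model deviation (assumed cutoff 1.96).
    rc: taxonomic Raup-Crick-type metric (assumed cutoff 0.95).
    """
    if beta_nri > 1.96:
        return "HeS"  # heterogeneous selection
    if beta_nri < -1.96:
        return "HoS"  # homogeneous selection
    if rc > 0.95:
        return "DL"   # dispersal limitation
    if rc < -0.95:
        return "HD"   # homogenizing dispersal
    return "DR"       # drift and other undominated processes


def process_fractions(pairs: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Unweighted share of each process over (beta_nri, rc) pairs."""
    counts = Counter(classify_turnover(b, r) for b, r in pairs)
    total = sum(counts.values())
    return {p: counts.get(p, 0) / total for p in PROCESSES}


# Hypothetical pairwise values, for illustration only.
example = [(2.3, 0.0), (0.5, 0.99), (0.5, 0.99), (0.5, 0.1)]
print(process_fractions(example))
```

Selection is tested first because a significant phylogenetic signal takes precedence; only pairs without it are split between dispersal and drift.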
a The correlation analysis between water traits and the relative abundance of each bin (n = 6). The color represents the correlation coefficient, with positive values in green and negative values in red. Asterisks mark significant correlations between the bin (the most abundant OTU within each bin) and the water trait: * (0.01 < p < 0.05) and ** (p < 0.01). b The phylogenetic tree of the bins. The 16S rRNA V4 gene sequence of the most abundant OTU within each bin was used to construct the tree. The black circles on the branches represent bootstrap values in the range from 0.016 to 1; the larger the circle area, the larger the bootstrap value. The # symbol indicates that the bin’s variation appeared to be affected by deterministic processes (heterogeneous and homogeneous selection) and was also significantly correlated with water traits. c The relative abundance of each bin (the most abundant OTU within each bin) in the different stages. d The relative relevance of the different ecological mechanisms in the different domestication stages. HeS heterogeneous selection, HoS homogeneous selection, DL dispersal limitation, HD homogenizing dispersal, DR “drift” processes. The columns on the right are the color key values for subfigures (c, d).
a The relative relevance of the different community assembly processes during the three phases. The assembly processes are presented as deterministic and stochastic processes, together with the five component processes: HeS (heterogeneous selection), HoS (homogeneous selection), DL (dispersal limitation), HD (homogenizing dispersal), and DR (“drift” processes). Since Phase2 contains only one stage (the Water sample), there is no stage-to-stage community turnover within it, so Phase2 is not listed separately. b Number of bins with significant P-values under the different ecological mechanisms during the different phases. c Taxa abundance in bins controlled by homogeneous or heterogeneous selection, with the top 5 most abundant species shown.
The community assembly process across the six scenarios (BF, MUB, Water, Eng, CH, HMAJ) showed a larger contribution of deterministic processes (HeS and HoS) during BF to MUB (5.21%), MUB to Water (2.51%), and Water to Eng (2.58%) than during Eng to HMAJ (0.54%) and Eng to CH (1.88%). Correspondingly, stochastic processes contributed more during Eng to HMAJ and Eng to CH than during BF to MUB, MUB to Water, and Water to Eng (Supplementary Fig. 7A, B). The heat maps in Supplementary Fig. 7 show the bins whose variation was significantly affected by heterogeneous selection (Supplementary Fig. 7C) or homogeneous selection (Supplementary Fig. 7D), and the relevance of these processes to the corresponding bins during different stages. Hyphomicrobium (Bin44, Water to Eng) was the dominant taxon among the bins most strongly affected by homogeneous selection (relative relevance >0.7) during the corresponding stage transitions.
Microbial co-occurrence patterns under different phases
Building upon the former stage analysis and the microbial community assembly investigation, our attention now turns to the co-occurrence networks, delineating a distinction between samples from Phases1 and Phase2 (Phase1-2), denoting a domestication process, and samples from Phase3, representing an application process. The construction of these networks, based on correlation connections, revealed intriguing characteristics when compared to random networks. The networks demonstrated heightened clustering coefficients and increased network modularity, indicative of “small-world” features and modular architecture, as outlined in Supplementary Table 3. However, a nuanced difference surfaced in the network structures between Phase1-2 and Phase3. The Phase1-2 network displayed more pronounced scale-free characteristics (power-law: R2 = 0.2472, Supplementary Table 3), while the Phase3 network exhibited a greater degree of randomness (power-law: R2 = 0.0004, Supplementary Table 3) than its Phase1-2 counterpart.
Table 1 sheds light on the substantial increase in network size during Phase3, boasting 449 nodes compared to the 308 nodes in the Phase1-2 network. Moreover, the nodes in the Phase3 network were intricately interconnected, evident in the higher number of edges (26817), average degree (119.4521), and connectance (0.267) as opposed to the Phase1-2 network (2451, 15.916, 0.052). The nature of interactions within these networks is a focal point. The Phase1-2 network predominantly exhibited cooperative co-occurrence edges (95.7%), underscoring a synergistic pattern. In contrast, the Phase3 network showcased a significant increase in competitive (negative ratio) correlations (43.4%) compared to the modest 4.2% in Phase1-2 (Table 1).
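The average degree and connectance quoted above follow directly from the node and edge counts. Here is a minimal sketch, assuming an undirected network without self-loops, so that connectance is the number of edges divided by the number of possible node pairs; the function names are illustrative.

```python
def average_degree(nodes: int, edges: int) -> float:
    """Mean number of links per node in an undirected network."""
    return 2 * edges / nodes


def connectance(nodes: int, edges: int) -> float:
    """Realized fraction of all possible undirected links."""
    return edges / (nodes * (nodes - 1) / 2)


# Node and edge counts as reported in the text for the two networks.
for name, n, e in [("Phase1-2", 308, 2451), ("Phase3", 449, 26817)]:
    print(f"{name}: avg degree = {average_degree(n, e):.4f}, "
          f"connectance = {connectance(n, e):.3f}")
```

Under these assumptions the computed values agree with the figures given for both networks.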
The intricate modular organization of microbial networks during domestication (Phases1-2) and application (Phase3) unfolds a nuanced narrative. In the Phase1-2 network, a remarkable 41 modules were discerned, with the top 8 modules collectively commanding 78.24% of the network. Transitioning to Phase3, the network’s architecture evolved into 15 modules, with the top 8 modules now capturing a substantial 98.45% of the network’s complexity (refer to Fig. 5a, c). This modular transition was accompanied by distinct shifts in primary modules’ microbial composition, with Alphaproteobacteria, Betaproteobacteria, and Gammaproteobacteria dominating in Phase1-2, while Clostridia, Betaproteobacteria, and Gammaproteobacteria took center stage in Phase3 (see Fig. 5b, d).
The networks were built by computing correlations between OTUs in the Phase1-2 and Phase3 communities. The nodes of the networks are colored to represent the ecological clusters (a and c) and the prokaryotic classifications (b and d). Node size is determined by the degree of connectivity. e Shared species interactions between the Phase1-2 and Phase3 networks. Red triangles indicate nodes that were module hubs in Phase1-2 or Phase3. The color of each node indicates the module it belonged to in the Phase3 network. The taxon name of each node is shown in shortened form at the family level, and the corresponding full name of each node is listed in sheet5 of Supplementary Data 1.
According to the criteria of Olesen et al.20, we sorted all species into four subcategories: peripherals, connectors, module hubs, and network hubs. The Zi-Pi scatter of all nodes from the Phase1-2 and Phase3 networks is shown in Supplementary Fig. 8. In neither network did any node fall into the network hub (supergeneralist) category, which acts as both module hub and connector. The majority of nodes were peripherals (specialists). In the Phase1-2 network, one node, derived from Unassigned_OPB56 (denovo9938), was categorized as a module hub, strongly interdependent with many nodes in its own module; one node, derived from Clostridiaceae (denovo4520), was categorized as a connector, which “glues” modules together and is thus important to network coherence (Table 2, Supplementary Fig. 8A). In the Phase3 network, 9 nodes were categorized as module hubs (Table 2, Supplementary Fig. 8B). Notably, 9 out of 11 of these module hub-associated OTUs exhibited relatively low abundance (< 0.1%), with the remaining 2 OTUs showing modest abundance (< 0.2%). These module hub-associated OTUs aligned with bins primarily influenced by stochastic processes during their respective phases, as detailed in Table 2.
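The four node roles can be illustrated with a short sketch of the within-module degree (Zi) and the participation coefficient (Pi). The cutoffs Zi > 2.5 and Pi > 0.62 are the values commonly associated with these criteria and are an assumption here, since this section does not state them; the toy network, the function names, and the use of the population standard deviation are likewise illustrative choices.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def zi_pi(adj: Dict[str, List[str]],
          module: Dict[str, str]) -> Dict[str, Tuple[float, float]]:
    """Return (Zi, Pi) for every node of an undirected network."""
    # Links each node has to members of its own module.
    within = {n: sum(module[nb] == module[n] for nb in adj[n]) for n in adj}
    out = {}
    for n, nbrs in adj.items():
        peers = [within[m] for m in adj if module[m] == module[n]]
        sd = pstdev(peers)
        z = (within[n] - mean(peers)) / sd if sd > 0 else 0.0
        k = len(nbrs)
        per_module = Counter(module[nb] for nb in nbrs)
        p = 1 - sum((c / k) ** 2 for c in per_module.values()) if k else 0.0
        out[n] = (z, p)
    return out


def node_role(z: float, p: float,
              z_cut: float = 2.5, p_cut: float = 0.62) -> str:
    """Map (Zi, Pi) to a role using the assumed cutoffs."""
    if z > z_cut:
        return "network hub" if p > p_cut else "module hub"
    return "connector" if p > p_cut else "peripheral"


# Toy network: edges a-b, a-c, a-d, d-e; modules M1 = {a, b, c}, M2 = {d, e}.
adj = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"],
       "d": ["a", "e"], "e": ["d"]}
module = {"a": "M1", "b": "M1", "c": "M1", "d": "M2", "e": "M2"}
for node, (z, p) in zi_pi(adj, module).items():
    print(node, round(z, 3), round(p, 3), node_role(z, p))
```

In a network this small no node can exceed the Zi cutoff, so every node is classified as peripheral; hubs only emerge in modules large enough for a node's within-module degree to stand far above its peers.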
The exploration of shared interactions between Phase1-2 and Phase3 networks unveiled a network of 922 species interactions, characterized by a remarkable bias – merely 2 interactions were negative, contrasting with a substantial 920 positive interactions. Nodes within these shared interactions represented diverse families, including Xanthomonadaceae, Sphingomonadaceae, Hyphomicrobiaceae, Comamonadaceae, Ruminococcaceae, Rhodocyclaceae, Pseudomonadaceae, Methanosarcinaceae, Methylobacteriaceae, Methylocystaceae, Methylophilaceae, among others. These nodes were chiefly derived from the top 3 modules, emphasizing that closely related edges and nodes within shared interactions emanated from the same module, housing the highest concentration of module hubs (Fig. 5e). This insight underscores the persistent influence of core species and interactions, weaving a continuous thread through both the domestication and application stages, imparting stability and functionality to the microbial community dynamics.
Metagenome analysis of microbiome during different stages
For the metagenomic analysis, the three replicates from each stage were merged into a single sample. Detailed information on metagenome data quality control, assembly, and gene prediction is provided in the supplementary file (Supplementary Tables 4–6). The 58,124,653,480 bp of original sequences yielded 47,804,782,757 bp following sequence quality control, with high-quality sequences accounting for 96.78% of the total. Gene prediction revealed 2,712,704 gene sequences, and subsequent removal of redundant sequences (identity >95%) resulted in 1,899,150 unigenes. A unigene is a distinct gene sequence obtained after deduplication, that is, a unique and non-redundant sequence; the term is frequently used in sequence assembly and gene prediction results21,22.
Taxonomy annotation successfully categorized 55% (1,048,575/1,899,150) of the total unigenes. Bacteria dominated at the domain level (98.76 ± 1.58%, mean relative abundance ± SD, n = 6), accompanied by smaller proportions of archaea (0.78 ± 1.39%), eukaryotes (0.15 ± 0.18%), and viruses (0.28 ± 0.22%) based on annotated unigenes. Dominant phyla included Proteobacteria (84.10 ± 9.30%), Bacteroidetes (4.27 ± 4.32%), Actinobacteria (3.34 ± 3.02%), Firmicutes (1.50 ± 1.44%), and Verrucomicrobia (1.05 ± 1.55%), each with a relative abundance ≥1%. Euryarchaeota (0.75 ± 1.37%) was the predominant phylum in the archaea domain. A comparison of the dominant phyla between metagenome annotation and 16S rRNA V4 region amplicon sequencing annotation gave similar results (Supplementary Fig. 9): Proteobacteria was the most abundant phylum in all six samples with both sequencing methods. However, quite different results were obtained for Firmicutes in CH, which may be caused by differences in sequencing methods and annotation bias. A total of 10,374 KEGG Orthologs (KOs) were identified from the unigenes. KOs linked to carbohydrate metabolism (8.03 ± 1.76%), amino acid metabolism (8.02 ± 0.37%), membrane transport (6.01 ± 1.62%), energy metabolism (5.90 ± 0.83%), and cofactors and vitamins metabolism (5.35 ± 0.30%) dominated (with a relative abundance ≥5%) at KEGG level 2 pathway annotations. Unigenes associated with xenobiotics biodegradation and metabolism constituted 2.42 ± 0.52%.
In our investigation, we focused on genes associated with aromatic degradation, given the benzene ring structure in phenol. Modules such as phenylacetate degradation (M00878), catechol meta-cleavage (M00569), trans-cinnamate degradation (M00545), catechol ortho-cleavage (M00568), and benzene degradation (M00548) exhibited higher abundance compared to other aromatic degradation modules (Supplementary Fig. 10). Notably, the modules of catechol meta-cleavage and catechol ortho-cleavage represent essential pathways for phenol degradation. A closer examination of the KOs related to phenol degradation revealed their comprehensive presence in all six stages, except for K14727 (3-oxoadipate enol-lactonase/4-carboxymuconolactone decarboxylase [EC:3.1.1.24 4.1.1.44]), which was deficient in the BF sample (Supplementary Fig. 11).
We next examined genes associated with formaldehyde metabolism. Prokaryotic microorganisms use five pathways for the assimilation, dissimilation, and detoxification of formaldehyde: the tetrahydrofolate (H4F) pathway, the tetrahydromethanopterin (H4MPT) pathway, the glutathione (GSH/MySH) pathway, the ribulose monophosphate (RuMP) pathway, and the serine pathway. Genes related to these pathways were detected across the different stages, with varying degrees of completeness. Notably, the most complete formaldehyde metabolism pathways were identified in the application stage samples (Eng, CH, and HMAJ) (Supplementary Table 7).
Linking the stage-enriched taxonomic and functional properties
To unravel the intricate interplay between taxonomic and functional properties, we linked taxonomy to phenol and formaldehyde metabolism functional attributes across all samples based on the metagenome results. The clustering of phenol and formaldehyde metabolism-related KOs encompassed phenol 2-monooxygenase, catechol ortho-cleavage, catechol meta-cleavage, serine pathway, RuMP pathway, H4F pathway, H4MPT pathway, and GSH/MySH pathway.
Phenol 2-monooxygenase is a pivotal aromatic ring oxygenase in the degradation of phenol. Catechol is first generated through the hydroxylation of phenol and is then further degraded via ortho or meta cleavage23. In this study, the six KOs (Supplementary Fig. 12) encoding phenol/toluene 2-monooxygenase [EC:1.14.13.244 1.14.13.243] were detected in samples from all stages. Notably, Pseudomonas was the primary contributor in the BF stage, Acinetobacter in MUB, Comamonas in Water, and Thauera and Azoarcus in the application stage (Eng, CH, and HMAJ) (Supplementary Fig. 12). Five genes were involved in catechol ortho-cleavage (Supplementary Fig. 13); among them, K14727 was absent in the BF sample but prevalent in the other samples, with Pigmentiphaga and Cucumibacter being the major contributors. Twelve genes participated in catechol meta-cleavage; among them, the primary contributor to K07104 was not among the top 20 taxa in the BF sample, whereas Hyphomicrobium was the key contributor in the application stage samples (Eng, CH, and HMAJ) (Supplementary Fig. 14). Of all the KOs related to phenol degradation, only K14727 and K07104 exhibited a significant positive correlation with formaldehyde content in the wastewater (Supplementary Fig. 15). K07104 encodes catechol 2,3-dioxygenase [EC:1.13.11.2] (C23O), which catalyzes the meta-cleavage of catechol and its derivatives. The enrichment of K07104 and K14727 in the application stage samples indicates an enhancement of phenol degradation function during the domestication process in adaptation to the complex, high-load sewage environment.
The RuMP pathway, illustrated in Fig. 6 with detailed information on enzymes and related KOs in Supplementary Fig. 16 and Supplementary Fig. 17, involves 3-hexulose-6-phosphate synthase (HPS), which catalyzes the aldol condensation of ribulose-5-phosphate with formaldehyde. Genes encoding HPS in the RuMP pathway include K08093 (hxlA) and K13812 (fae-hps). In the BF sample, only K08093, contributed by Enterobacter, was detected. In contrast, in the application stage samples, both K08093 and K13812 were detected, with Methanomethylovorans as the top contributor in the Eng sample and Methanolobus in the CH and HMAJ samples (Fig. 6, Supplementary Fig. 16). Similar observations were made for genes encoding 6-phospho-3-hexuloisomerase (PHI) in the RuMP pathway, including K08094 (hxlB) and K13831 (hps-phi). In the BF sample, only K08094, contributed by Salmonella, was detected. However, in the Water samples and the application stage samples, both K08094 and K13831 were detected, with Methanolobus and Methanomethylovorans being the main contributors in the application stage samples (Supplementary Fig. 16). The same pattern was observed for genes encoding EC:4.1.2.13, which catalyzes the conversion of D-fructose 1,6-bisphosphate to glycerone phosphate, with the application stage samples exhibiting the most comprehensive pathways.
Different symbols represent different samples (○: BF, △: MUB, □: Water, ☆: Eng, ◇: CH, ⬠: HMAJ), and different colors represent different microbial contributors, with white indicating that the corresponding enzyme gene was absent from the sample. Reaction steps that are not marked indicate that the enzyme gene was present in all six samples.
The serine pathway, involving nine enzymes (Supplementary Fig. 18), showed distinct participation in various samples. Notably, Klebsiella in the BF sample contributed the most to K00600 (glycine hydroxymethyltransferase), K01595 (phosphoenolpyruvate carboxylase), and K00024 (malate dehydrogenase). In contrast, members of Hyphomicrobium, recognized as common methylotrophic strains with C(1) compound assimilation through the serine pathway24,25,26,27, were the major contributors to most enzymes in the serine pathway in the application stage samples (Eng, CH, HMAJ).
The cofactor-dependent pathways, namely the H4F pathway (Supplementary Fig. 19), the H4MPT pathway (Supplementary Fig. 20), and the GSH/MySH pathway (Supplementary Fig. 21), were also explored. Contributors and specific genes encoding enzymes in these pathways were identified, such as mtdA (K00300/K01491, methylenetetrahydrofolate/methylenetetrahydromethanopterin dehydrogenase), fchA (K01500, methenyltetrahydrofolate cyclohydrolase), and fhs (K01938, formate-tetrahydrofolate ligase) in the H4F pathway, and fae (K10713, 5,6,7,8-tetrahydromethanopterin hydro-lyase), mtdB (K10714, methylene-tetrahydromethanopterin dehydrogenase), mch (K01499, methenyltetrahydromethanopterin cyclohydrolase), ftr (K00672, formylmethanofuran-tetrahydromethanopterin N-formyltransferase), and the gene encoding EC 1.2.7.12 (formylmethanofuran dehydrogenase) in the H4MPT pathway. Similarly, the GSH or MySH pathway involved genes such as gfa (K03396, S-(hydroxymethyl)glutathione synthase), frmA (K00121, S-(hydroxymethyl)glutathione dehydrogenase), frmB (K01070, S-formylglutathione hydrolase), and fdhA (K00148, glutathione-independent formaldehyde dehydrogenase). Without exception, genes encoding these enzymes were most diverse and abundant in the application stage.
In the overall analysis, a comprehensive set of genes related to 13 steps in the phenol and formaldehyde metabolic pathway was found to be absent in the domestication stage (BF, MUB, Water) (Fig. 6). However, in the application stage samples (Eng, CH, HMAJ), a more diverse array of genes was identified, ensuring the efficient and stable degradation function of the community in the application stage (Fig. 6).
Presence of phenol and formaldehyde metabolism modules in Metagenome assembled genomes (MAGs)
Through bin refinement and reassembly by metaGEM28, a total of 164 MAGs with over 50% estimated completeness were obtained for the assembled shotgun metagenomes of the six-stage samples (sheet1 in Supplementary Data 2). Figure 7 shows MAGs with a relative abundance greater than 1% in each sample, with labels indicating phenol metabolism (P), formaldehyde metabolism (H), and both phenol and formaldehyde metabolism (A). MAGs capable of metabolizing phenol, formaldehyde, or both were detected in all the six stages. Notably, in the application stages (Eng, CH, HMAJ), species with phenol-formaldehyde metabolism capabilities were the most prevalent.
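The selection and labelling described above can be sketched as a simple filter. This is an illustration under stated assumptions: the record fields, the function names, and the example values are hypothetical and do not come from Supplementary Data 2; only the 50% completeness cutoff, the 1% abundance cutoff, and the P/H/A labels are taken from the text.

```python
from typing import List, NamedTuple, Tuple


class Mag(NamedTuple):
    name: str
    completeness: float   # estimated completeness, percent
    abundance: float      # relative abundance in the sample, percent
    phenol: bool          # phenol degradation modules present
    formaldehyde: bool    # formaldehyde metabolism modules present


def mag_label(mag: Mag) -> str:
    """P = phenol, H = formaldehyde, A = both, empty string = neither."""
    if mag.phenol and mag.formaldehyde:
        return "A"
    if mag.phenol:
        return "P"
    if mag.formaldehyde:
        return "H"
    return ""


def mags_for_figure(mags: List[Mag],
                    min_completeness: float = 50.0,
                    min_abundance: float = 1.0) -> List[Tuple[str, str]]:
    """Keep MAGs above both cutoffs and attach their metabolism label."""
    return [(m.name, mag_label(m)) for m in mags
            if m.completeness > min_completeness
            and m.abundance > min_abundance]


# Hypothetical records, for illustration only.
mags = [
    Mag("bin.A", 92.0, 12.5, True, True),
    Mag("bin.B", 71.0, 3.2, False, True),
    Mag("bin.C", 45.0, 8.0, True, False),   # fails completeness cutoff
    Mag("bin.D", 88.0, 0.4, True, False),   # fails abundance cutoff
]
print(mags_for_figure(mags))
```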
The relative abundance of MAGs in the metagenome samples of BF (a), MUB (b), Water (c), Eng (d), CH (e), and HMAJ (f). The red letters in each subfigure indicate, for the corresponding MAG, the presence of formaldehyde metabolic modules (H), phenol degradation modules (P), or both (A). MAGs with relative abundance above 1% in each sample are shown in the figure.
Among the 164 MAGs, five exhibited complete phenol-formaldehyde metabolism capabilities: MUB_bin.7 (Thauera humireducens), Water_bin.5 (Acinetobacter gerneri), Eng_bin.11 (Thauera humireducens), Eng_bin.19 (Pseudomonas_K), and HMAJ_bin.21 (Thauera sp029982755). 23 MAGs were capable of metabolizing either phenol or formaldehyde. The phenol-formaldehyde metabolism profiles of the 164 MAGs indicated that most MAGs contained only a few enzymes involved in the phenol or formaldehyde metabolism modules, forming a “plaque” composition mode (Supplementary Fig. 22; sheet2 in Supplementary Data 2), which combined could complete the full map of multi-metabolic pathways for phenol and formaldehyde metabolism.
In the Eng sample, eight MAGs capable of metabolizing phenol or formaldehyde were detected, among which Eng_bin.1 (Thauera sp029982755) and Eng_bin.11 (Thauera humireducens) were dominant. In addition to phenol metabolism ability, PHB synthesis genes were also detected in these two MAGs (sheet3 in Supplementary Data 2). MAGs that lack genes related to phenol and formaldehyde metabolism, such as Eng_bin.12 (Burkholderiaceae), Eng_bin.18 (Alphaproteobacteria), Eng_bin.20 (Hydrogenophaga), and Eng_bin.28 (Sphingobium), exhibited a high proportion of PHB synthesis genes (sheet3 in Supplementary Data 2).
In the HMAJ sample, Thauera sp029982755 was also dominant. In the CH sample, CH_bin.7 (Hyphomicrobium_C), capable of formaldehyde metabolism, and CH_bin.37 (Parazoarcus communis), capable of phenol metabolism, were the most prevalent. Both of these MAGs exhibited a high proportion of PHB synthesis genes (sheet3 in Supplementary Data 2). Furthermore, CH_bin.14 (Stappia sp900185725), which accounted for 5.64% in the CH stage, did not possess phenol and formaldehyde metabolism genes but did show a high proportion of PHB synthesis genes.
- Source: https://www.nature.com/articles/s42003-024-07353-5