Novel enzymes for biodegradation of polycyclic aromatic hydrocarbons identified by metagenomics and functional analysis in short-term soil microcosm experiments – Scientific Reports

Metagenomic hits from the dioxygenase and catalase/peroxidase family of enzymes

Table 2 lists the hits from both the dioxygenase and catalase-peroxidase families.

Table 2 General properties and bioinformatic predictions of metagenomic hit enzymes.

To select hits for further detailed experimental work, we applied four criteria: two based on sequence analysis and two based on bioinformatic predictions. First, only full-length sequences were selected, in which all representative domains and motifs were present as compared to homologous sequences from the NCBI database. For this criterion, we also required that the metagenomic DNA sequence contain the STOP codon (shown as “end sign” in Table 2). Second, to identify enzymes with potential novelty in their amino acid sequence, we selected hit sequences that show considerable amino acid variation compared to enzyme sequences already present in the NCBI database: only metagenomic sequences with less than 92% sequence identity (match) to sequences already present in databases were considered. The hits fulfilling these two criteria are all included in Table 2 (for hit sequences see Supplementary Table S7). The third criterion related to crystallizability potential, which was included because it usually indicates a well-folded protein and may lead to high-resolution structural insights in further studies. We used the CrystalP2 software and removed hits predicted to be non-crystallizable from further investigation. For the fourth criterion, we required that the proteins could be expressed in E. coli, based on bioinformatic expression prediction methods.
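The four-step selection can be sketched as a simple filter. This is a minimal illustration only, not the pipeline used in the study: the `Hit` record, its field names and the example entries (`hit_A` to `hit_D`) are hypothetical and do not correspond to rows of Table 2. Only the 92% identity cutoff and the four criteria themselves are taken from the text.

```python
from dataclasses import dataclass

# Identity cutoff stated in the text: hits must show < 92% identity.
IDENTITY_CUTOFF_PCT = 92.0


@dataclass
class Hit:
    """Hypothetical record for one metagenomic hit (field names are assumptions)."""
    name: str
    full_length: bool      # all representative domains and motifs present
    has_stop_codon: bool   # the "end sign" column of Table 2
    identity_pct: float    # best sequence identity to database sequences
    crystallizable: bool   # outcome of a crystallizability prediction
    expressible: bool      # outcome of an E. coli expression prediction


def passes_sequence_criteria(hit: Hit) -> bool:
    """Criteria 1 and 2: full length with STOP codon, and identity below cutoff."""
    return (hit.full_length and hit.has_stop_codon
            and hit.identity_pct < IDENTITY_CUTOFF_PCT)


def passes_all_criteria(hit: Hit) -> bool:
    """Criteria 1 to 4 combined."""
    return (passes_sequence_criteria(hit)
            and hit.crystallizable and hit.expressible)


# Hypothetical example entries, for illustration only.
example_hits = [
    Hit("hit_A", True, True, 85.0, True, True),    # passes everything
    Hit("hit_B", True, True, 97.0, True, True),    # too similar to known sequences
    Hit("hit_C", True, False, 80.0, True, True),   # STOP codon missing
    Hit("hit_D", True, True, 88.0, False, True),   # predicted non-crystallizable
]

selected = [h.name for h in example_hits if passes_all_criteria(h)]
print(selected)
```

Keeping the two sequence-based criteria in their own function mirrors the text, where hits passing only those two criteria are the ones reported in Table 2.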

Both the ESPRESSO and MEMEX servers, which estimate protein expression, predicted that all of our selected proteins can be expressed in E. coli. Disorder predictions for the metagenomic hit sequences showed only short segments with appreciable disorder, potentially constituting flexible linker regions (see Supplementary Figure S1). We identified PAH1_99, PAH1_105 and PAH6_39 as sequences fulfilling the four criteria. To assess the evolutionary relationships of these three proteins, we constructed evolutionary trees using the NCBI blastp server (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Figure 2 shows these results. PAH1_99 and PAH6_39 appear as distinct branches, well separated from other parts of the evolutionary tree, arguing for some degree of novelty. PAH1_105 shows a closer relationship with already known sequences. In the further analysis, we focused on sequence and structural alignments to select potential novel enzymes with preserved catalytic sites as compared to previously described representatives of the dioxygenase and catalase/peroxidase enzyme families.

Figure 2

Phylogenetic trees of the newly identified proteins (highlighted by yellow shade). Protein sequence alignments and phylogenetic trees were generated using NCBI blastp (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The bar at the bottom of the figure provides a scale for the amount of change (number of changes or ‘substitutions’ divided by the length of the sequence).

Structural analysis of the dioxygenase metagenomic hits

Figure 3A shows the alignment of the six full-length sequence hits from the dioxygenase enzyme family. Among these six sequences, only PAH1_99, PAH1_102 and PAH1_105 were predicted to be crystallizable. A close inspection of this alignment revealed an intriguing variation at a specific sequence position (highlighted in yellow in Fig. 3A). At this position, the protein has either a glycine or an alanine residue, and according to the literature this variation has a significant effect on the substrate selectivity of the dioxygenase42. Ferraroni et al. introduced this mutation into Pseudaminobacter salicylatoxidans salicylate 1,2-dioxygenase, which also belongs to the dioxygenase enzyme family. They found that if glycine is present at this position, the enzyme can bind several substrates in the active site, while dioxygenases with alanine at this position oxidize only gentisate42.

Figure 3

Dioxygenase sequence alignments. Panel A Amino acid sequences of the six dioxygenases found during the metagenomic search. The amino acids important for substrate specificity are highlighted in yellow. Glycine at this position allows several substrates to bind in the active site, while dioxygenases with alanine at this position oxidize only gentisate42. Panel B Amino acid sequence alignment of the two selected dioxygenases (PAH1_99 and PAH1_105) and Pseudaminobacter salicylatoxidans salicylate 1,2-dioxygenase. Highly conserved amino acids based on the literature42 are shown in red. Yellow indicates the G/A point mutation that influences the enzyme substrate specificity. The flexible (loop) regions of the template are highlighted in purple, and the main differences in the three-dimensional structure between the template and the model are highlighted in ocher.

Our aim was to investigate one representative each of the alanine-containing and the glycine-containing variants. PAH1_105 is the only hit with a glycine residue at this critical position that was also predicted to be crystallizable, hence this sequence was selected for further characterization. Of the two crystallizable hits with an alanine at the critical position, PAH1_99 and PAH1_102, we selected PAH1_99, which shows more variability compared to the sequences present in the database, as seen in the sequence identity (match) parameter (cf. Table 2).

Next, we created three-dimensional structural models of the two selected dioxygenases using the SWISS-MODEL server. In both cases the template identified by the SWISS-MODEL server was the same, namely Pseudaminobacter salicylatoxidans salicylate 1,2-dioxygenase (PDB ID: 3NW4). Global Model Quality Estimation (GMQE) scores were 0.83 and 0.72 for PAH1_99 and PAH1_105, while the QMEAN scores were −1.27 and −1.47, respectively. Figure 3B shows the alignment of the two metagenomic hits and their three-dimensional template sequence: most residues strictly conserved in dioxygenase sequences based on the literature42 are also conserved in the two novel metagenomic hits (cf. red background).

The three-dimensional fold predicted with the SWISS-MODEL server is shown in Fig. 4A (homotetramers and one subunit, respectively, PAH1_105: orange, PAH1_99: greencyan, template: yellow), while Fig. 4B presents a close-up of the active site, indicating several differences in amino acid positions of the two metagenomic hits. These enzymes have a homotetrameric structure containing a catalytic Fe(II) ion coordinated by three histidine residues in the N-terminal region43. The predicted folds of the two metagenomic hits are nearly identical, with minor differences observed at one loop position (a flexible loop in the template sequence, shown in Figs. 3 and 4B).

Figure 4

Three-dimensional models of the two dioxygenases (PAH1_99 and PAH1_105) built based on Pseudaminobacter salicylatoxidans salicylate 1,2-dioxygenase (PDB ID: 3NW4). Panel A The merged homotetrameric structures of the template (yellow), and the models of PAH1_99 (greencyan) and PAH1_105 (orange). Panel B A close-up of a monomer with the active site. Red arrows point to the main differences in the three-dimensional structure between the template and the models (the corresponding sequences are highlighted in ocher in Fig. 3B). The flexible (loop) regions of the template are colored purple. Panel C Amino acids that interact with the substrate gentisate based on the Pseudaminobacter salicylatoxidans salicylate 1,2-dioxygenase three-dimensional structure (PDB ID: 3NW4). Colours: PAH1_99 (greencyan) and PAH1_105 (orange) with the substrate gentisate (2,5-dihydroxybenzoic acid) (magenta). The Fe(II) ion is indicated by a gray sphere. Polar contacts are marked by black dashed lines. (The first amino acid belongs to PAH1_99, the second to PAH1_105.)

The wild-type Pseudaminobacter salicylatoxidans salicylate 1,2-dioxygenase protein sequence contains a Gly residue at position 106. It was found that the protein with the G106A mutation oxidized only gentisate, while 1-hydroxy-2-naphthoate and salicylate were not converted. The amino acid residue Gly106 is located inside the enzyme active site cavity but does not directly interact with the substrates. In the case of the G106A mutation, based on the crystal structure of the complex, a different binding mode was observed for salicylate compared to the wild-type enzyme. The salicylate in the G106A variant coordinated to the catalytically active Fe(II) ion in an unusual and unproductive manner, since salicylate is unable to displace the hydrogen bond formed between Trp104 and Asp174 in the G106A variant. Presumably, such inefficient substrate binding may generally limit the substrate spectrum of wild-type GDOs42.

Structural analysis of the catalase-peroxidase metagenomic hits

In the case of the catalase-peroxidases, all of the metagenomic hits were predicted to be crystallizable (see Table 2), so we selected PAH6_39, which shows more variability compared to the sequences present in the database. Figure 5A shows the alignment of the three full-length sequence hits from the catalase-peroxidase enzyme family. A three-dimensional structural model of the selected catalase-peroxidase was also built using the SWISS-MODEL server.

Figure 5

Catalase-peroxidase sequence alignments. Panel A Multiple sequence alignment of the three catalase-peroxidase amino acid sequences found during the metagenomic search. Panel B Amino acid sequence alignment of the selected catalase-peroxidase PAH6_39 and Synechococcus elongatus catalase-peroxidase. Highly conserved amino acids based on the literature44 are labelled in red. The residues of the catalytic Met‐Tyr‐Trp adduct are highlighted in yellow. The flexible (loop) regions of the template are highlighted in purple, and the main differences in the three-dimensional structure between the template and the model are highlighted in ocher.

The template identified by the SWISS-MODEL server was the Synechococcus elongatus catalase-peroxidase KatG (PDB ID: 3WNU). The Global Model Quality Estimation (GMQE) score was 0.91 and the QMEAN score was −1.32. Figure 5B shows the alignment of the selected catalase-peroxidase and its three-dimensional template sequence: most residues strictly conserved in catalase-peroxidase sequences according to the literature44 are also conserved in the novel metagenomic hit (cf. red background).

The three-dimensional fold predicted with the SWISS-MODEL server is shown in Fig. 6 (homodimers and one subunit, respectively, PAH6_39: blue, template: yellow). The predicted fold of the metagenomic hit is nearly identical to the template, with minor differences observed at one loop position (a flexible loop in the template sequence, shown in Figs. 5 and 6B).

Figure 6

Three-dimensional model of the catalase-peroxidase PAH6_39 built based on Synechococcus elongatus catalase-peroxidase (PDB ID: 3WNU). Panel A The merged homodimeric structure of the template and the model, where PAH6_39 is shown in blue and the Synechococcus elongatus catalase-peroxidase in yellow. Haems in the active sites are shown in orange. Panel B A close-up of the monomers. The color scheme is the same as for the homodimer. Na ions are shown as gray spheres. The flexible (loop) regions of the template are colored purple. Red arrows point to the main differences in the three-dimensional structure between the template and the model (the corresponding sequences are highlighted in ocher yellow in Fig. 5B).

Based on the bioinformatic analysis and the structural modelling, we expressed and purified two dioxygenases (PAH1_99 and PAH1_105) and one catalase-peroxidase (PAH6_39). Following the experimental protocol detailed in the Methods section, we obtained the following yields from 0.5 L medium: 59 mg, 42 mg and 49 mg for the enzymes PAH1_99, PAH1_105 and PAH6_39, respectively. These enzyme preparations were used in the further experiments.

Performance of the novel enzymes in PAH degradation

To test the functionality of the novel enzymes in soil samples, we set up a series of microcosm experiments. Using soil samples spiked with known amounts of different PAH contaminants (12.5 mg/kg, 25.9 mg/kg, 46.2 mg/kg and 52.0 mg/kg for naphthalene, phenanthrene, anthracene and pyrene, respectively, cf. Methods), the effect of enzyme addition on the concentration of contaminants was determined following an incubation period of 7 days.

As shown on Fig. 7A, even in the absence of the added enzymes a considerable degradation of PAH compounds was observed in all of the soil microcosm experiments, suggesting the potential of an inherent microbial activity in the soils. Addition of the novel enzymes (PAH1_99, PAH1_105 and PAH6_39), predicted from metagenome searches and sequence alignments, led to further significant increase in PAH degradation only for naphthalene and phenanthrene. Anthracene and pyrene degradation was not increased by the novel enzymes in this setup.

Figure 7

PAH degradation (Panel A) and microbiological activity (Panel B) in the soils treated with the novel enzymes. Data represent averages of three replicates and error bars are standard deviations. Red lines show the initial concentrations of the specific PAH compounds. Letters on the columns indicate significant differences (p < 0.05), data in columns with different letters are statistically different, data in columns with the same letters are not statistically different.

The suggestion of inherent microbial activity in the soils was reinforced by the Biolog EcoPlate™ test (cf. Figure 7B), where the various metabolic patterns were estimated with the AWCD values45. The significant increase in some AWCD values upon addition of the novel enzymes argues for a positive effect of the enzyme proteins on the microbes in the soil samples.
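For orientation, the AWCD (average well colour development) of a Biolog EcoPlate™ is commonly computed as the mean blank-corrected optical density over the 31 carbon-source wells of one replicate block. The sketch below follows that common definition; it is an assumption that the study used exactly this form (for instance, clipping negative corrected values to zero is a widespread convention but is not stated in the text), and the function and argument names are illustrative.

```python
from typing import Sequence

# An EcoPlate replicate block holds 31 carbon sources plus one water blank.
N_CARBON_SOURCES = 31


def awcd(substrate_od: Sequence[float], blank_od: float) -> float:
    """Average well colour development for one replicate block.

    substrate_od: optical densities of the 31 carbon-source wells.
    blank_od: optical density of the water (control) well.
    Negative blank-corrected values are set to zero (assumed convention).
    """
    if len(substrate_od) != N_CARBON_SOURCES:
        raise ValueError(f"expected {N_CARBON_SOURCES} wells, got {len(substrate_od)}")
    corrected = [max(od - blank_od, 0.0) for od in substrate_od]
    return sum(corrected) / N_CARBON_SOURCES
```

A plate read would typically yield three such values per sample (one per replicate block), which can then be averaged and compared between treatments.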

In these experiments, we could establish significant remediation activity of the novel enzymes for some, but not all PAHs tested (see also Table 3 for pollutant removal efficiency data in %, calculated from the results in Fig. 7A). It was also observed that enzyme protein addition increased the inherent microbial activity. Hence, it was not straightforward to decide whether the positive effect of the novel enzymes is due to their specific catalytic activities or to their potential to stimulate the inherent microbial activity as carbon and nitrogen sources.

Table 3 The average pollutant degradation efficiency of the different treatments compared to the initial pollutant concentrations determined by GC-MS.

To gain further insight, we set up another set of experiments to investigate whether combining enzyme addition with a simple inorganic oxidant may increase the beneficial effects on PAH degradation. We selected CaO2 for these experiments since this compound compares favorably to other oxidizing agents in terms of stability and cost efficiency.

Although other studies46,47 found calcium peroxide to be extremely effective in removing petroleum hydrocarbons and polycyclic aromatic hydrocarbons from soil, our microcosm experiments showed otherwise in some cases. Figure 8A shows that addition of CaO2 leads to preservation of the original spiked concentrations of the PAH pollutants. This effect is most probably due to destruction of the soil-inherent microbial flora, as the metabolic activity observed in the non-treated soil is fully erased upon addition of CaO2 (Fig. 8B).

Figure 8

PAH degradation (Panel A) and microbiological activity (Panel B) in the soils treated with CaO2 and/or enzymes. Data represent averages of three replicates and error bars are standard deviations. Red lines show the initial concentrations of the specific PAH compounds. Letters on the columns indicate significant differences (p < 0.05).

However, combining the inorganic oxidant CaO2 with the novel enzymes increased the degradation of the more resistant PAHs, anthracene and pyrene, in our microcosm experiments (Fig. 8A). Anthracene concentrations dropped by approx. 57–70%, while pyrene concentrations decreased by approx. 56–66% upon the combined treatment of the soils with CaO2 and the novel enzymes (see also Table 3 for pollutant removal efficiency data in %, calculated from the results in Fig. 8). This result also strongly suggests that the novel enzymes can provide beneficial PAH degradation in the absence of soil-inherent microbial activity. In agreement with these findings, a previous study48 reported similar results, showing that sterilization upon peroxide addition increased the degradation of pyrene because competition from indigenous microbes was removed.

The degradation efficiency for the four PAH pollutants (naphthalene, phenanthrene, anthracene, and pyrene) was calculated relative to the analytically determined initial concentration values (Table 3). The results show that the microflora of the control group adapted properly, and the decreased PAH concentrations could be attributed to biodegradation by indigenous microbes. Moreover, our results demonstrated that the degradation efficiency generally decreased with an increasing number of aromatic rings. The applied enzymes enhanced the degradation efficiency, whilst adding CaO2 decreased the efficiency for PAHs with lower molecular masses. Remarkably, for naphthalene the degradation efficiency was around 95% upon enzyme treatment without peroxide, while with the combined application of enzymes and CaO2 it was reduced to 73–83%. The degradation efficiency showed a similar pattern for phenanthrene, where it was 68–71% with the addition of enzymes and 41–53% with enzymes plus CaO2.

On the other hand, in the case of anthracene, CaO2 had a positive effect, since the 44–46% degradation efficiency with enzymes was further increased to 57–70% with the combined treatment. The most favourable effect of CaO2 was detected for pyrene removal, where the degradation efficiency was 56–65% with enzymes and CaO2, whilst the enzymes alone degraded only 12–17% of this pollutant.
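The percentages above relate a residual concentration to the spiked initial concentration. A minimal sketch of that arithmetic is given below, assuming the usual definition (initial minus residual, divided by initial, times 100); the initial concentrations are those stated for the microcosm experiments, while the function names are illustrative and the residual values are back-calculated here, not measured data from the study.

```python
# Spiked initial concentrations in mg/kg, as stated for the microcosm experiments.
INITIAL_MG_PER_KG = {
    "naphthalene": 12.5,
    "phenanthrene": 25.9,
    "anthracene": 46.2,
    "pyrene": 52.0,
}


def degradation_efficiency(initial: float, residual: float) -> float:
    """Percentage of the initial concentration that was removed."""
    if initial <= 0:
        raise ValueError("initial concentration must be positive")
    return (initial - residual) / initial * 100.0


def residual_from_efficiency(initial: float, efficiency_pct: float) -> float:
    """Residual concentration implied by a given removal efficiency."""
    return initial * (1.0 - efficiency_pct / 100.0)


# Back-calculation for pyrene: the reported efficiency bounds imply these
# residual concentrations (illustrative arithmetic only).
pyrene_c0 = INITIAL_MG_PER_KG["pyrene"]
for label, eff in [("enzymes alone, lower bound", 12.0),
                   ("enzymes + CaO2, upper bound", 65.0)]:
    residual = residual_from_efficiency(pyrene_c0, eff)
    print(f"pyrene, {label}: {eff:.0f}% removal -> {residual:.2f} mg/kg left")
```

Expressing all treatments relative to the same initial value is what makes the efficiencies for different PAHs comparable despite their different spiking levels.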

The degradation efficiency generally negatively correlated with the number of aromatic rings of the pollutants, so the degradation efficiency was the highest for naphthalene and the lowest for pyrene without the use of peroxide. This observation is in line with the expectation that more complex, condensed aromatic ring structures are more resistant to degradation. The combined administration of peroxide and enzymes was the most beneficial for highly stable pyrene containing four fused aromatic rings. Among these treatments, the PAH1_105 1,2-dioxygenase enzyme exhibited the highest effectiveness in the presence of CaO2.
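The trend with ring number can be made concrete with a small calculation. The sketch below takes the midpoints of the enzyme-only efficiency ranges quoted in the text (using 95% for naphthalene) and the number of fused rings of each compound, and computes a Pearson correlation coefficient. Using range midpoints is a simplification introduced here for illustration, and with only four compounds the coefficient describes the trend rather than testing it statistically.

```python
import math

# Number of fused aromatic rings per compound.
RING_COUNT = {"naphthalene": 2, "phenanthrene": 3, "anthracene": 3, "pyrene": 4}

# Enzyme-only degradation efficiency ranges (%) as quoted in the text.
ENZYME_ONLY_RANGE = {
    "naphthalene": (95.0, 95.0),   # "around 95%"
    "phenanthrene": (68.0, 71.0),
    "anthracene": (44.0, 46.0),
    "pyrene": (12.0, 17.0),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


compounds = list(RING_COUNT)
rings = [RING_COUNT[c] for c in compounds]
# Midpoint of each range: an illustrative simplification, not reported data.
midpoints = [sum(ENZYME_ONLY_RANGE[c]) / 2 for c in compounds]

r = pearson_r(rings, midpoints)
print(f"Pearson r between ring count and efficiency midpoint: {r:.2f}")
```

Note that phenanthrene and anthracene both have three rings yet differ in efficiency, so ring count alone does not fully determine degradability.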

Similarly to our research, some studies showed that chemical oxidation by hydrogen peroxide is a promising pretreatment before bioremediation for the removal of hydrophobic contaminants49,50,51; however, these treatments have not been used in enzymatic bioremediation. Liao et al. (2019) demonstrated that chemical oxidation is an effective pretreatment option coupled with bioremediation to remove PAHs from soil49. They found that potassium permanganate, among other oxidants (activated persulfate, Fenton, and modified Fenton), enhanced the removal efficiency. On the other hand, permanganate decreased the microbial diversity and delayed the population recovery, whilst the Fenton treatment, which applies hydrogen peroxide, had only a slight impact on indigenous microbial diversity.

As several studies reported52,53,54, applying specific enzymes in bioremediation offers some advantages compared to the use of microbial cells, including enhanced specificity, simplified handling and storage, higher mobility, and activity even in the presence of high concentrations of toxic contaminants. Furthermore, the enzymes do not require nutrient supplementation and may be applicable under extreme environmental conditions. Although enzyme-based bioremediation methods appear to be advantageous clean-up technologies, some limitations have been demonstrated54,55,56. Free enzymes are generally unstable, may be degraded, and consequently persist for only a short time in a soil environment; moreover, individually applied free enzymes cannot completely degrade the contaminants.

Although enzymes can in many cases carry out only the first transformation steps, these first biotransformation steps are usually the limiting factors for further biodegradation57. Therefore, the direct application of enzymes in contaminated environmental matrices, particularly in soils contaminated with highly degradable PAH compounds, is a feasible option, as our research illustrates.

Biolog EcoPlates™ allow the detection of metabolic activities and physiological diversity of microbial communities in the environment41,58. Our study also demonstrated the applicability of the Biolog EcoPlate™ as a reliable tool for evaluating the activity of the microbial community in PAH-contaminated and remediated soils, and as a necessary method to complement the chemical analysis of the contaminants.

In conclusion, the applied novel enzymes effectively degraded the contaminants; the added CaO2 slightly reduced the degradation rate in the case of naphthalene and phenanthrene while enhancing the removal of anthracene and pyrene. Novel enzyme-mediated bioremediation can be a feasible and efficient option in nutrient-poor contaminated soils with low biological activity.