Search
Close this search box.

Unravelling and reconstructing the biosynthetic pathway of bergenin – Nature Communications

Unraveling the biosynthetic pathway of bergenin

The biosynthetic pathway from GA (2) to bergenin (1) is unknown. GA may be catalyzed successively by a regioselective 4-O-methyltransferase (4-OMT), a regio- and stereoselective C-glycosyltransferase (2-CGT), and an enzyme for intramolecular esterification (Fig. 1a). Due to the unique structure of bergenin, no known genetic elements are available for specific C-glycosylation and O-methylation reactions for the reconstruction of artificial biosynthetic pathways for bergenin. Therefore, we attempted to elucidate the biosynthetic pathway of bergenin by identifying two key enzymes.

Bergenin was detected throughout the whole A. japonica plant; however, the contents in different organs (leaf, stem, and rhizome) varied, and the highest content was found in the leaf (Fig. 2a, b and Supplementary Table 1). According to the differential transcriptome data of the leaf, stem, and rhizome, 85 candidate glycosyltransferase genes were selected. Subsequently, phylogenetic analysis of these glycosyltransferase candidates and the previously reported CGTs indicated that two glycosyltransferase candidates (CL7566-1 and Unigene14257, named AjCGT1 and AjCGT2, respectively) and the typical plant CGTs were grouped into the same clade (Supplementary Fig. 3). Moreover, the expression levels of AjCGT1 and AjCGT2 in different organs were strongly correlated with the content of bergenin in the corresponding organs (Fig. 2c and Supplementary Tables 1 and 2); therefore, these two genes were chosen as CGT candidates for further functional identification.

Fig. 2: Functional identification of the 2-CGTs and 4-OMTs involved in the biosynthesis of bergenin.
figure 2

a The organs (leaf, stem, and rhizome) of A. japonica were used to analyze the contents of bergenin (1) and transcriptome sequencing. b The contents of bergenin (1) in different organs, all data represent the means of three parallel experiments and error bars show standard deviation. c Expression heat-map of two candidate CGT genes and ten OMT candidate genes (AjOMT1−10) in different organs. df The reactions catalyzed by AjCGT1 and AjCGT2 with GA (2) and UDP-Glc as substrates, and the corresponding HPLC-MS/MS2 spectra. gi The reactions catalyzed by AjCGT1 and AjCGT2 with 4-OMGA (3) and UDP-Glc as substrates, and the corresponding HPLC-MS/MS2 spectra. jl The reactions catalyzed by AjOMT2 with GA (2) and SAM as substrates, and the corresponding HPLC-MS spectra. mo The reactions catalyzed by AjOMT2 with norbergenin (6) and SAM as substrates, and the corresponding HPLC-MS spectra. Source data are provided as a Source data file.

The full-length cDNAs of AjCGT1 and AjCGT2 were cloned and expressed in the E. coli Transetta (DE3) strain (Supplementary Data 1 and Supplementary Fig. 4), respectively. The catalytic activity of recombinant AjCGT1 and AjCGT2 was assayed with the putative substrates GA (2) and 4-OMGA (3) as well as uridine diphosphate glucose (UDP-Glc). HPLC-MS analysis of the reactions showed that both AjCGT1 and AjCGT2 exhibited glucosylation activity toward 2 and 3, generating products 5 and 4, respectively, with the same retention time as the standards (Fig. 2d−i). MS analysis revealed that the [M−H] ions of 5 and 4 were 162 amu greater than those of 2 and 3, indicating that 5 and 4 were the corresponding mono-glucosylated products (Fig. 2f, i). The characteristic fragment ions [M−H−120] and [M−H−90] of 5 and 4 suggested that both were C-glucosylated products (Fig. 2f, i). Gallic acids are unstable in buffer solutions and easily undergo nonspecific binding with proteins33,34 (Supplementary Figs. 57). Thus, the apparent Km and kcat were distorted, and the specific activities of the AjCGTs were measured (Supplementary Table 3). Compared with 2, both AjCGT1 and AjCGT2 showed greater catalytic activity toward 3, suggesting that 3 is more likely the native substrate (Supplementary Table 3). Therefore, AjCGT1 (UGT708AL1) and AjCGT2 (UGT708AL2) were tentatively identified as CGTs responsible for the specific 2-C-glucosylation involved in the biosynthesis of bergenin. AjCGT1 also has C-glycosylation activity toward common CGT substrates with 2,4,6-trihydroxyacetophenone-like structures (Supplementary Figs. 812), while the seven tested CGTs showed no activity toward 2 and 3, as mentioned previously (Supplementary Figs. 1 and 2). Therefore, in contrast to other reported plant CGTs, the two CGTs displayed the ability to introduce C-sugars into the aromatic ring with acyl 3,4,5-pyrogallol structures.

To further investigate the mechanism by which AjCGT1 catalyzes gallic acid synthesis, a structural model of AjCGT1 was obtained according to the crystal structure of SbCGTa (PDB code 6LG0)28. GA and 4-OMGA were docked into AjCGT1, respectively (Supplementary Fig. 13). Fourteen candidate active sites located adjacent to the substrate binding pocket were selected. Alanine scanning assays revealed that 10 mutants (H23A, H27A, W92A, T142A, F147A, W151A, F188A, H367A, G368A, D369A, and Q370A) lost their activity. The key sites in the binding pocket ensure that gallic acid is in the proper orientation. The conserved His23 is generally considered a vital active site for the specific deprotonation of the hydroxyl group of the substrate, which makes it a nucleophile that attacks the sugar donor35, and the AjCGT1-H23A mutant completely lost its activity. However, two other basic candidate sites, His27 and W92, were found near the highly conserved His23 (Supplementary Fig. 14). In particular, the C-glycosylation activity of W92A toward gallic acid was completely lost, while that toward phloretin remained at approximately 15% (Supplementary Fig. 14). Therefore, the His27 and W92 basic sites might also act as general bases to deprotonate the hydroxyl group of gallic acid, helping His23 complete the C-glycosylation of gallic acid.

Owing to the relatively higher conversion rates of AjCGT1 (45.8% for 2, 61.5% for 3) than those of AjCGT2 (29.8% for 2 and 47.0% for 3) with GA (2) and 4-OMGA (3), AjCGT1 was selected as the target 2-CGT for further biochemical property investigations and engineered strain construction (Supplementary Figs. 15 and 16). Under pH 8.0, AjCGT1 exhibited the optimum activities at 35 °C and 40 °C for 2 and 3, respectively, and were independent of metal ions (Supplementary Figs. 17 and 18). Kinetic analysis demonstrated that AjCGT1 exhibited an apparent Km value of 265.3 μM for 4-OMGA (3). The corresponding Kcat/Km value (1.2 × 103 M−1 s−1 for 3) of AjCGT1 is lower than that of known CGTs, which might be affected by the instability of 3 during the reaction (Supplementary Fig. 19). When the stable substrate phloretin was used, AjCGT1 exhibited relatively high specific activity (Supplementary Table 3). Next, an engineered strain harboring the AjCGT1 gene was constructed and converted exogenously added 2 and 3 to 5 and 4, respectively (Supplementary Figs. 20 and 21). These results indicated that AjCGT1 was compatible with E. coli and can be used for reconstructing engineered strains for the de novo production of bergenin precursor 4.

Ten candidate methyltransferase genes (AjOMT1−10) for polyphenol biosynthesis with relatively high expression levels were screened from the differential transcriptome data of A. japonica (Supplementary Table 4). Among these genes, the expression levels of six genes (AjOMT2−7) were positively correlated with the levels of bergenin in the corresponding organs and the bait gene expression levels of AjCGT1 and AjCGT2 (Fig. 2c). Considering that genes related to the biosynthetic pathway of secondary metabolites in plants are usually co-ordinately regulated36, AjOMT2−7 and representative OMTs were selected for phylogenetic analysis (Supplementary Fig. 22). The results showed that AjOMT2 and AjOMT3 were clustered into the clade of caffeoyl-CoA OMTs, which tended to decorate one of the adjacent hydroxyl groups on the aromatic rings of the substrate37,38. Moreover, AjOMT2 and AjOMT3 were clustered close to AtCCoAOMT7, which displayed unusual para-methylation activity39. These findings suggest that AjOMT2 and AjOMT3 are involved in the biosynthesis of bergenin.

Accordingly, AjOMT2 and AjOMT3 were cloned from the cDNA of A. japonica and heterologously expressed in E. coli, and the recombinant enzymes were purified by Ni2+-NTA affinity chromatography (Supplementary Data 1 and Supplementary Fig. 23)40. To characterize the function of AjOMTs, two putative precursors (2 and 6) involved in bergenin biosynthesis were used as methyl acceptors, and S-adenosyl-L-methionine (SAM) was used as the methyl donor. For AjOMT2, a product showing an identical retention time and molecular ion peak at m/z 183.13 ([M−H]) to those of standard 3 (Fig. 2j–l) appeared in the reaction mixture of 2 with SAM. This demonstrated that recombinant AjOMT2 performed regiospecific methylation at the 4-OH of 2 to produce 3 with a conversion rate of 62.4%. In addition, HPLC-MS analysis revealed that AjOMT2 can also methylate 6 to form 1 with a conversion rate of 72.2% (Fig. 2m–o). The enzymatic assay showed that AjOMT3 also exhibited 4-O-methylation activity toward 2 and 6 in vitro; however, the catalytic activity was lower than that of AjOMT2 (Supplementary Figs. 24 and 25). In addition, AjOMT4−7 exhibited much lower or no catalytic activity (Supplementary Figs. 26 and 27). Therefore, both AjOMT2 and AjOMT3 might be regioselective 4-OMTs involved in the biosynthesis of bergenin, and AjOMT2 was selected for further biochemical property investigations and engineered strain construction.

Biochemical property investigations revealed that the optimum temperature and pH of AjOMT2 were 40 °C and 7.0, respectively. This enzyme was dependent on metal ions, and Mg2+ and Mn2+ could enhance its activity toward 2 and 6 (Supplementary Figs. 28 and 29). AjOMT2 showed methylation activity to 2 and 6 with similar conversion rates while kinetic properties demonstrated that AjOMT2 had apparent Km values of 72.0 μM and 493.6 μM for 2 and 6, respectively (Supplementary Fig. 30). These results implied that the native substrate of AjOMT2 might be 2 rather than 6 and that 4-O-methylation may occur before 2-C-glycosylation. In comparison with other similar OMTs, AjOMT2 had a moderate Kcat/Km value (250 M−1 s−1) (Supplementary Table 5). Moreover, the engineered E. coli strain harboring AjOMT2 also exhibited 4-O-methylation activity when 2 and 6 were exogenously added (Supplementary Figs. 31 and 32). Thus, AjOMT2 is an efficient and highly regioselective methyltransferase compatible with E. coli for artificial biosynthetic pathway construction.

Therefore, the specific 2-CGTs together with the regio-selective 4-OMTs accomplished the enzymatic synthesis of the specific C-glycoside 4-OMGA-Glc (4) from GA (2). Furthermore, the target bioactive product bergenin was conveniently obtained by acid treatment of 4-OMGA-Glc (4) (Supplementary Fig. 33). Thus, the biosynthetic pathway of bergenin was elucidated by identifying key enzymes and providing efficient genetic elements for the biosynthesis of bergenin in cell factories.

Reconstructing the biosynthetic pathway of 4-OMGA-Glc from GA

To reconstruct the biosynthetic pathway of the bergenin direct precursor 4-OMGA-Glc (4), the OMT-CGT (4-O-methyltransferase and 2-C-glycosyltransferase) module was designed by functional and effective coexpression of AjOMT2 and AjCGT1 in E. coli. Plasmids with different copy numbers were used to carry AjOMT2 and AjCGT1, different combination models of these recombinant plasmids were designed, and their catalytic efficiency and fitness were tested in vivo. AjOMT2 and AjCGT1 were cocloned and inserted into the high-copy number plasmid pETDuet-1, middle-copy number plasmid pCDFDuet-1 and low-copy number plasmid pACYCDuet-1, respectively, which were subsequently transformed into E. coli BL21 (DE3), leading to three engineered strains, E1–E3 (Supplementary Table 6). In the presence of endogenous UDP-Glc and SAM, 4 was detected in the culture media of all the strains after an additional 24 h of incubation after the addition of exogenous 2 at a final concentration of 0.4 mM through whole-cell biocatalysis (Supplementary Fig. 34). Moreover, the GA in the culture media of strain E1 harboring plasmid pET-AjOMT2AjCGT1 and strain E3 carrying plasmid pACYC-AjOMT2AjCGT1 were completely consumed, suggesting the high efficiency and fitness of AjCGT1 and AjOMT2. Therefore, the combination of AjCGT1 with AjOMT2 leads to the successful synthesis of the specific C-glycoside 4-OMGA-Glc (4) from GA (2).

De novo biosynthesis of the bergenin precursor 4-OMGA-Glc

To accomplish the de novo biosynthesis of the bergenin precursor 4-OMGA-Glc (4), we attempted to combine the exogenous biosynthetic pathway of GA (2) (GA-producing module) with AjOMT2 and AjCGT1 (OMT-CGT module) using E. coli as the chassis cell. GA is an important precursor of the biosynthetic pathway of bergenin; however, unengineered E. coli cannot supply GA. According to the designed artificial biosynthetic pathway of GA in E. coli (Fig. 1b), one of the vital factors is the lack of natural hydroxylase catalyzing 3,4-dihydroxybenzoic acid (3,4-DHBA) into GA. It has been reported that PobA-Y385F/T294A (PobA**), a mutant of p-hydroxybenzoate hydroxylase (PobA) from Pseudomonas fluorescens, exhibits high hydroxylase activity toward 4-hydroxybenzoic acid (4-HBA) and 3,4-DHBA to form GA20. Moreover, chorismate lyase (UbiC) was demonstrated to enhance the transformation of chorismate to 4-HBA with high catalytic activity in E. coli41. Therefore, for efficient de novo biosynthesis of GA, PobA**, and UbiC were used to construct the GA biosynthetic module (Fig. 3a). Exogenous PobA** and endogenous UbiC were constructed from the above three plasmids with different copy numbers. Subsequently, the plasmids were transformed into E. coli to create three engineered strains S1−S3, which contained the plasmids pET-PobA**UbiC, pCDF-PobA**UbiC, and pACYC-PobA**UbiC, respectively (Supplementary Table 6). GA was detected in the culture supernatant of the three engineered strains after 24 h of cultivation, respectively (Supplementary Fig. 35). The yield of GA in strain S1 reached 619 mg L−1, 26% and 59% higher than those in strains S2 and S3, respectively. Thus, a highly GA-producing pathway (GA-producing module) was constructed.

Fig. 3: De novo biosynthesis of the bergenin precursor 4-OMGA-Glc from glucose in E. coli.
figure 3

a The construction of the GA-producing module and OMT-CGT module using different plasmids. b Evaluation of the engineered strains A1–A6 for the titers of 4-OMGA-Glc (4) and metabolic intermediates in shake flask fermentation for 48 h. UbiC, chorismate lyase; PobA**, p-hydroxybenzoate hydroxylase with Y385F and T294A mutation; pT7, T7 promoter; RBS, ribosome binding site; T, terminator. All data represent the means of three parallel experiments and error bars show standard deviation. Statistical analysis was performed by using the Student′s t test. P value for each comparison from left to right then top to bottom in b): 0.0408 (*), 0.0475 (*), 0.0106 (*), 0.0047 (**), 0.0081 (**), 0.0048 (**), 0.0001 (**). Source data are provided as a Source data file.

Generally, increasing the copy numbers of functional genes is a strategy used to increase the level of gene expression; however, the process might impose greater physiological burdens on hosts and even decrease the yields of target products42,43. In this context, to determine the best combination mode of AjOMT2-AjCGT1 and PobA**-UbiC, three plasmids carrying AjOMT2-AjCGT1 with different copy numbers were introduced into the three GA-producing strains S1−S3 (carrying PobA**-UbiC), generating six strains, A1–A6, respectively (Table 1 and Fig. 3a). 4-OMGA-Glc (4) was detected in the fermentation broth supernatants of all six strains after expression was induced for 24 h (Supplementary Fig. 36), and the titer of 4-OMGA-Glc in the engineered strain A3 carrying the pCDF-PobA**UbiC and pET-AjOMT2AjCGT1 plasmids was the highest, reaching 418 mg L−1 at 48 h in a shake flask (Fig. 3b and Supplementary Fig. 37). Moreover, intermediate products during fermentation accumulated and varied among the different engineered strains (Fig. 3b). These findings suggested that when multiple genes were coexpressed on two different plasmids, the degree of compatibility between plasmids and genes strongly affected the yield of the target product. Overall, strains A1–A6 could produce 4-OMGA-Glc from the simple carbon source glucose; thus, the de novo biosynthetic pathway of 4-OMGA-Glc was reconstructed.

Table 1 Engineered E. coli strains A1–A6

Optimizing the biosynthetic pathway of the bergenin precursor 4-OMGA-Glc

As a large amount of GA accumulates in the culture medium of strain A3 (Fig. 3b), both the efficiency of the OMT-CGT module and the donor-supplying module should be further optimized to increase the yield of 4-OMGA-Glc based on the strain A3.

Semi-rational design guided by structural information is considered as an efficient and economical strategy for protein engineering to improve the catalytic activity of target enzymes20,44,45. Therefore, to further improve the catalytic efficiency of the OMT-CGT module, a strategy of semirational design was employed to increase the activity of AjOMT2 and AjCGT1 (Fig. 4). A structural model of AjOMT2 was generated using sorghum caffeoyl-CoA O-methyltransferase (CCoAOMT, PDB code 5KVA) as the template in complex with SAM46. GA and GA-Glc were docked into AjOMT2, respectively (Fig. 4a and Supplementary Fig. 38). Seven residues (A156, D157, K158, E159, N160, W185, and Y203) located adjacent (4 Å) to the substrate binding pocket of AjOMT2 were found, and six residues, excluding A156, were mutated to alanine. The K158A mutant almost completely lost its activity, revealing that K158 might act as a catalytic base and deprotonate the reactive hydroxyl group of gallic acid, which is similar to the reported catalytic mechanism46 (Supplementary Fig. 39). The catalytic activities of the mutants were further investigated in vivo through combining the wild-type AjCGT1. The strain harboring AjOMT2-Y203A and wild-type AjCGT1 significantly increased the conversion rate of GA to 4-OMGA-Glc (Fig. 4b). Subsequently, site-directed saturation mutagenesis of the Y203 residue was performed, and the Y203T variant exhibited the highest activity, which was 31% higher than that of wild-type AjOMT2 (Fig. 4c).

Fig. 4: Improving the catalytic efficiency of the OMT-CGT module by semirational design and codon optimization.
figure 4

a Protein modeling of AjOMT2 and AjCGT1. The protein structures were modeled using CCoAOMT (5KVA) and SbCGTa (6LG0) as templates, respectively. GA and GA-Glc were docked into AjOMT2, while GA and 4-OMGA were docked into AjCGT1 using AutoDock Vina. The candidate sites, acceptors (GA, GA-Glc, and 4-OMGA), and donors (SAM and UDP-Glc) were shown in blue, yellow, and green, respectively. b, c The in vivo relative catalytic activity of AjOMT2 mutants of alanine scanning and site-saturated mutagenesis of the Y203 site. d The in vitro relative activity of AjCGT1 mutants of alanine scanning. e The in vivo relative activity of AjCGT1 mutants obtained by Hot-spot analysis. f Engineered strains of E. coli harboring different combinations (M1–M3) of AjOMT2, AjOMT2* and AjCGT1, AjCGT1*. g The relative catalytic activity of the engineered strains of M1–M3. h Engineered strains of E. coli harboring different combinations (Y1–Y5) of AjOMT2, AjOMT2opt, AjOMT2*opt and AjCGT1, AjCGT1opt, AjCGT1*opt. i The relative catalytic activity of the engineered strains of Y1–Y5; AjOMT2*, AjOMT2-Y203T; AjCGT1*, AjCGT1-A332F; AjOMT2opt, codon-optimized AjOMT2; AjOMT2*opt, codon-optimized AjOMT2-Y203T; AjCGT1opt, codon-optimized AjCGT1; AjCGT1*opt, codon-optimized AjCGT1-A332F; All data represent the means of three parallel experiments and error bars show standard deviation. Source data are provided as a Source data file.

As none of the mutants exhibited increased catalytic activity according to the strategy of the structural model, molecular docking, and site mutation of AjCGT1, HotSpot Wizard was used to engineer AjCGT1 (Fig. 4d). The HotSpot Wizard is a web server focused on automated prediction of hotspot residues for mutagenesis that are likely to alter enzyme activity47. Then, the candidate sites V328, E329, E331, and A332 were selected and further mutated to V328A, E329A, E331A, and A332E/D/F/K/Q/V/Y, respectively, with the guidance of HotSpot Wizard (Fig. 4e). Among these mutants, AjCGT1-A332F exhibited the highest catalytic activity (138% that of the wild-type AjCGT1) (Fig. 4e).

Next, the AjOMT2-Y203T (designated AjOMT2*) and AjCGT1-A332F (designated AjCGT1*) mutants were coexpressed in engineered E. coli strains (M1–M3) (Fig. 4f). Surprisingly, the catalytic activity of the engineered strain harboring the plasmid pET-AjOMT2*AjCGT1* (M3) significantly decreased by 29% compared with that of the strain harboring wild-type AjOMT2AjCGT1 (E1) (Fig. 4g). However, the combinations of AjOMT2*AjCGT1 (M1) and AjOMT2AjCGT1* (M2) could improve the conversion rate of GA to 4-OMGA-Glc. Among the engineered strains, the M2 strain carrying pET-AjOMT2AjCGT1* exhibited the highest catalytic activity, of which the conversion rate was 38% higher than that of the strain harboring wild-type AjOMT2AjCGT1 (Fig. 4g).

In general, codon optimization is an effective way to improve the protein expression level in E. coli strains48. Therefore, AjOMT2, AjCGT1, AjOMT2* and AjCGT1* were codon-optimized based on the codon preference of E. coli. In total, five co-expression plasmids were designed, constructed and transformed into E. coli BL21 (DE3) to generate five strains, Y1–Y5, respectively (Fig. 4h and Supplementary Table 6). As expected, the catalytic efficiency of strain Y5 harboring AjOMT2opt and AjCGT1*opt (codon-optimized strain M2) increased by 44% (Fig. 4i). However, strain Y4 harboring codon-optimized AjOMT2*opt and AjCGT1opt (codon-optimized strain M1) showed the highest catalytic activity, of which the conversion rate increased by 65% compared with that of strain M2 (Fig. 4i). Hence, strain Y4 harboring pET-AjOMT2*opt-AjCGT1opt was used to construct the next generation of the engineered strain.

In the biosynthetic pathway of 4-OMGA-Glc (4), the donors (SAM and UDP-Glc) play crucial roles, and their adequate supply could help direct carbon flux toward target products49. According to the above results, the accumulation of GA in strain A3 suggested that the supply of endogenous SAM and UDP-Glc was insufficient. Therefore, to further increase the supply of methyl and sugar donors, five key enzymes involved in the biosynthesis of SAM (5′-methylthioadenosine/S-adenosylhomocysteine nucleosidase, mtn; S-ribosylhomocysteine lyase, luxS; methionine adenosyltransferase, metK)49 and UDP-Glc (phosphoglucomutase, pgm; glucose-1-phosphate uridyltransferase, galU)50 were utilized for engineering E. coli (Fig. 5a, b). These five endogenous enzyme-encoding genes individually or in combination were overexpressed together with PobA** and UbiC in strain Y4, generating an additional three engineered strains, D3–D5 (Fig. 5c). Strain D1 contained the plasmids pCDF-PobA**UbiC and pETD-AjOMT2*optAjCGT1opt, and strain D2 was constructed by introducing pACYCDuet-1-blank into strain D1 (Fig. 5c). Strain A3 was also used as a control to compare the effect of this optimization.

Fig. 5: Boosting the SAM and UDP-Glc supplies to improve the yield of 4-OMGA-Glc de novo.
figure 5

a Modular biosynthetic pathway of 4-OMGA-Glc (4) in engineered E. coli. b Biosynthetic pathways of UDP-Glc and SAM as well as the SAM regeneration system in E. coli. c Reconstruction of engineered strains D1–D5 in E. coli BL21. d Evaluation of the engineered strains D1–D5 for titer of 4-OMGA-Glc (4) and metabolic intermediates in shake flask fermentation for 48 h. AjCGT1opt, codon-optimized AjCGT1; AjOMT2*opt, codon-optimized AjOMT2-Y203T mutation. S-adenosyl-L-homocysteine (SAH) can be catalyzed by 5′-methylthioadenosine/S-adenosylhomocysteine nucleosidase (mtn) to produce S-ribosylhomocysteine (SRH), which can be further converted to homocysteine (Hcys) by S-ribosylhomocysteine lyase (luxS). Hcys can be further transformed to methionine (Met), which is finally catalyzed by methionine adenosyltransferase (metK) to form SAM. Phosphoglucomutase (pgm) catalyzes glucose-6-phosphate to form glucose-1-phosphate which is catalyzed by glucose−1-phosphate uridyltransferase (galU) to form UDP-Glc. All data represent the means of three parallel experiments and error bars show standard deviation. Statistical analysis was performed by using the Student′s t test. P value for each comparison from left to right in d): 0.0010 (**), 0.0024 (**), 0.0065 (**), 0.0014 (**), 0.0448 (*), 0.0094 (**), 0.0045 (**), 0.0143 (*), 0.0058 (**). Source data are provided as a Source data file.

Compared with that in strain A3, the yield of 4-OMGA-Glc in strain D1 increased by 34%, reaching 560 mg L−1 after fermentation for 48 h (Fig. 5d). Surprisingly, when a third empty plasmid was introduced into the 4-OMGA-Glc-producing strain D1 to obtain strain D2, the yield of the target product decreased by 26% in 48 h (Fig. 5d). Introducing extra plasmids might increase the metabolic burden and stress to host cells, thereby reducing the accumulation of target products. Therefore, the metabolic fluxes and protein expression levels in biosynthetic pathways containing multiple genes should be accurately regulated to maximize the 4-OMGA-Glc titer. When metK was overexpressed in strain D3, the titer of 4-OMGA-Glc decreased by 6% compared with that in strain D2 after 48 h (Fig. 5d). This result suggested that direct overexpression of metK does not improve the SAM supply, probably due to the complicated regulation of the SAM biosynthetic pathway in E. coli. Reasonably, when mtn and luxS were overexpressed in strain D4, the titer of the product 4-OMGA-Glc was increased by 46% (reaching 610 mg L−1) compared with that in strain A3 after 48 h (Fig. 5d). These findings indicated that constructing a SAM regeneration system to increase SAM availability is a reliable strategy for increasing the yield of 4-OMGA-Glc.

For the sugar donor, the biosynthetic pathway of UDP-Glc commonly occurs in organisms; however, further increasing the UDP-Glc supply might increase the yield of the corresponding glycosides. Therefore, two key enzymes (pgm and galU)50 for its biosynthesis were overexpressed in strain D5. Surprisingly, the titer of the product 4-OMGA-Glc decreased by 27% compared with that of strain A3 after 48 h (Fig. 5d), probably due to their poor fitness. It was speculated that the overexpression of pgm and galU caused carbon flux to UDP-Glc, which might affect the growth of E. coli cells. Future work involving the overexpression of sucrose synthase (SuSy) might be an alternative method for increasing the supply of UDP-Glc45.

Overall, by optimizing the OMT-CGT module, the 4-OMGA-Glc titer of strain D1 increased to 560 mg L−1 at 48 h. With further optimization of the donor-supplying module, the highest amount of 4-OMGA-Glc reached 610 mg L−1 at 48 h in the resultant strain D4, which was 46% higher than that of the starting strain A3.

Fed-batch production of 4-OMGA-Glc in a bioreactor

To evaluate the scale-up potential of D4 for 4-OMGA-Glc production, fed-batch experiments were carried out in a 3-L bioreactor. After 24 h of fermentation, the upstream intermediate products 3,4-DHBA and GA started to be produced and accumulated in maximum yields of 314 mg L−1 and 70 mg L−1, respectively. After that, the yield of the target product 4-OMGA-Glc gradually increased, reaching a maximum titer of 1.49 g L−1 after 60 h of fermentation (Fig. 6). In previous studies, great progress was made in the de novo production of plant-derived phenolic C-glycosides from glucose. The concentrations of puerarin (daidzein 8-C-glucoside), isovitexin (apigenin 6-C-glucoside), and apigenin di-C-arabinoside produced by engineered yeast reached 72.8 mg L−1, 206 mg L−1 and 113.16 mg L−1, respectively16,51,52. A total of 93.9 mg L−1 vitexin (apigenin 8-C-glucoside) was produced by engineered E. coli53. In our work, the titer of C-glycosides was improved to the gram level per liter.

Fig. 6: De novo biosynthesis of 4-OMGA-Glc with a 3-L fed-batch bioreactor.
figure 6

Strain D4 was used for scale-up experiments. Detailed fermentation conditions were shown in “Methods”. All the product 4-OMGA-Glc (4) was conveniently esterified into bergenin (1) by acid treatment and the yield of bergenin reached 1.41 g L−1 (Supplementary Fig. 42). ND: not detected. All data represent the means of three parallel experiments and error bars show standard deviation. Source data are provided as a Source data file.

The desired 4-OMGA-Glc was the main product in the fermentation broth. Although a large amount of GA accumulated during fermentation in shake flasks (Fig. 5d and Supplementary Fig. 40), nearly all the GA in these fed batches was consumed by downstream enzymes (Supplementary Fig. 41). Therefore, after optimization, the biosynthetic genes functionally and efficiently worked together in engineered E. coli, with the highest yield of 4-OMGA-Glc ever reported. Moreover, the engineered strain showed great potential for increasing the production of the target C-glycoside. After fermentation, when the pH of the broth was adjusted to 0.1 with HCl, 4-OMGA-Glc was completely esterified to bergenin (Fig. 6 and Supplementary Fig. 42) with a yield of 1.41 g L−1. Due to the low content of byproducts during fermentation, bergenin was conveniently isolated and purified to meet commercial demand. Therefore, we rebuilt an artificial biosynthetic pathway for bergenin and improved its yield to the gram level per liter, demonstrating great potential for addressing the resource scarcity issue.

In the present work, two key enzymes (AjCGTs and AjOMTs) responsible for the biosynthesis of bergeinin were functionally identified, and the biosynthetic pathway involved was completely elucidated. AjOMT2 and AjOMT3 exhibited highly regioselective 4-O-methylation of GA. The two CGTs (AjCGT1 and AjCGT2) were characterized as CGTs of gallic acids, showing strict regio- and stereospecificity for the formation of 2-C-β-D-glycosides, different from other reported plant CGTs. Subsequently, the de novo artificial biosynthetic pathway of the bergenin direct precursor 4-OMGA-Glc was reconstructed in E. coli. After that, the optimization of the cell factory producing target rare C-glycosides was successively and systemically performed, including the GA-producing module, the OMT-CGT module and the donor-supplying module. The yield of the target product was improved, with the highest reported yield of 4-OMGA-Glc to date. Moreover, 4-OMGA-Glc was conveniently transformed to bergenin after the pH of the culture was adjusted in situ. This work illustrates the potential of a bacterial fermentation system to increase the yield of rare C-glycosides through metabolic engineering and establishes a method for the biochemical synthesis of bergenin. The efficient and green production of bergenin from glucose lays a foundation for the supply of bergenin.