Resurrecting ancestral antibiotics: unveiling the origins of modern lipid II targeting glycopeptides - Nature.com

Abstract

Antibiotics are central to modern medicine, and yet they are mainly the products of intra- and inter-kingdom evolutionary warfare. To understand how nature evolves antibiotics around a common mechanism of action, we investigated the origins of an extremely valuable class of compounds, lipid II targeting glycopeptide antibiotics (GPAs, exemplified by teicoplanin and vancomycin), which are used as a last resort for the treatment of antibiotic-resistant bacterial infections. Using a molecule-centred approach and computational techniques, we first predicted the nonribosomal peptide synthetase assembly line of paleomycin, the ancestral parent of lipid II targeting GPAs. Subsequently, we employed synthetic biology techniques to produce the predicted peptide and validated its antibiotic activity. We revealed the structure of paleomycin, which enabled us to address how nature morphs a peptide antibiotic scaffold through evolution. In doing so, we obtained temporal snapshots of key selection domains in nonribosomal peptide synthesis during the biosynthetic journey from ancestral, teicoplanin-like GPAs to modern GPAs such as vancomycin. Our study demonstrates the synergy of computational techniques and synthetic biology approaches, enabling us to journey back in time, trace the temporal evolution of antibiotics, and revive these ancestral molecules. It also reveals the optimisation strategies nature has applied to evolve modern GPAs, laying the foundation for future efforts to engineer this important class of antimicrobial agents.

Introduction

Natural products form one of the most important sources of medicinal compounds, with modern medicine reliant on antibiotics that often originate from biosynthesis in various microorganisms1. Indeed, thousands of compounds have been isolated from natural sources (with many more predicted) that display enormous structural diversity. These compounds are mainly produced as so-called secondary or specialised metabolites by organisms and represent important adaptive characteristics that have been subjected to natural selection during evolution2,3. Given the importance of biosynthetic processes for accessing complex natural products at scale, understanding how such pathways evolve is crucial if we are to successfully reengineer such assemblies to allow the formation of new, designer compounds with improved properties.

The biosynthesis of natural products is typically encoded by biosynthetic gene clusters (BGCs), which usually include genes for precursor and core biosynthesis, post-core biosynthesis, regulation, resistance, and transport. Genome analysis has shown that BGCs—and hence natural products—evolve through a range of processes including the recombination of specific subclusters, gene conversion, gene duplication and horizontal gene transfer4. However, although some evolutionary models have been proposed2,5, little is understood about the molecular mechanisms of how natural products, arguably the largest and most economically important source of chemical diversity on the planet, have evolved. Recent exciting work to address this has made use of ancient DNA to investigate natural products from the Pleistocene era6, although longer evolutionary timescales are doubtless challenging for such an approach.

In this work, we have sought to understand the evolutionary history of lipid II targeting glycopeptide antibiotics (GPAs), a vital class of nonribosomal peptides used in the clinic for the treatment of resistant bacterial infections and exemplified by vancomycin (Van) and teicoplanin (Tei) (Fig. 1)7,8. Various types of GPAs are known9, which extend beyond the lipid II targeting GPAs under investigation here to type V GPAs such as corbomycin10 and kistamicin11 that possess altered structures and modes of action (Fig. 1). All GPAs contain a multicyclic peptide core structure that is assembled through the combined activity of a nonribosomal peptide synthetase (NRPS) machinery12 and cytochrome P450 monooxygenases, which catalyse a cascade of oxidative crosslinking reactions13. The peptide core of GPAs is largely composed of aromatic amino acids including nonproteinogenic amino acids such as 4-hydroxyphenylglycine (Hpg), 3,5-dihydroxyphenylglycine (Dpg), and β-hydroxytyrosine (Bht). Curiously, one key difference across the biosynthetic pathways for lipid II targeting GPAs is the formation of Bht, which is either obtained by tyrosine (Tyr) oxidation on the NRPS as in the teicoplanin (Tei) pathway or generated offline and directly incorporated as in the vancomycin (Van) pathway. Beyond variations in Bht formation and the core peptide, diversity within the GPA family is expanded yet further through modifications to the post-peptide assembly process7,8.

Fig. 1: Structures of different GPA types.

Structures of GPAs from all representative types (Van, Pek, Avo, Ris, Tei) shown together with the alternative type I-V nomenclature. Compound abbreviation type GPA naming is used in this manuscript except for the type V GPA outgroup. Sugar abbreviations: D-arabinose (ara), D-glucosamine (gls), D-glucose (glc), 2-O-methyl-D-glucose (Me-glc), D-mannose (man), L-rhamnose (rha), L-ristosamine (ria), L-vancosamine (van). FA (fatty acyl), Ac (acetyl).

With the exception of type V GPAs, GPAs function by interrupting bacterial cell wall biosynthesis through the sequestration of the peptidoglycan precursor lipid II7. Whilst lipid II targeting GPAs—such as Tei and Van—share a conserved mechanism of action, they differ in the structures of their peptide cores and the BGCs encoding these antibiotics. Earlier phylogenetic reconciliation indicated that the origins of glycopeptide biosynthesis can be traced back to a timeframe of around 150–400 million years ago14.

Our results show that modern lipid II targeting GPAs have evolved from an ancestor—here termed paleomycin—whose predicted core resembles the more complex structure of Tei, suggesting Van-type GPAs are more recent examples of GPA evolution (Fig. 2). We have reconstituted the predicted ancestral NRPS assembly line encoding the paleomycin core peptide, demonstrated production of an antibiotic bearing the core structure of paleomycin from this NRPS, and identified the roles of assembly line recombination and domain mutation in the generation of modern GPAs. Finally, we have obtained structural snapshots of key selection domains during the evolution of modern lipid II GPAs, providing crucial insights into the general evolution of NRPS-produced peptides.

Fig. 2: The evolution of GPA biosynthesis from the predicted ancestral GPA paleomycin to modern GPAs such as vancomycin.

Possible modifications to paleomycin inferred from ancestral reconstruction of the BGC include chlorination (X1, X2), hydroxylation (R1, R2) and glycosylation (Sugar1: glucose, Sugar2: mannose, right panel). NRPS assembly lines are shown for each compound, with modules (collections of domains able to install one amino acid into the growing peptide, depicted as rounded rectangles) colour coded by the amino acid selected (see key). Domain/enzyme description: A adenylation (orange indicates evolution to Leu selection from Hpg, black indicates evolution to Bht selection from Tyr), C condensation, E epimerisation, TE thioesterase, PCP peptidyl carrier protein, X cytochrome P450 recruitment, COM intermodule communication, Hal halogenase (chlorination, green), Hyd non-heme iron monooxygenase (hydroxylation, brown), Oxy cytochrome P450 (crosslinking, pink). Possible chlorination (X1, X2) during assembly indicated by ±Cl; possible hydroxylation (R1, R2) during assembly indicated by half yellow/white amino acid colouring. Sugar abbreviations: D-glucose (glc), L-vancosamine (van).

Results

The ancestral lipid II targeting GPA paleomycin is predicted to be a complex peptide structurally related to teicoplanin

To understand the evolution and diversification of genes involved in the biosynthesis of Van/Tei-GPAs displaying lipid II targeting (tricyclic (Van/Pek/Avo-type) GPAs; tetracyclic (Ris/Tei-type) GPAs), we generated a guide tree based on 29 complete BGCs (Table S1). In doing so, we deliberately excluded the type V GPAs, whose evolutionary origins have been previously explored by Wright and co-workers14 in work that led to the discovery of the mechanism of autolysin inhibition shown by type V GPAs10. Whilst that study also investigated the origins of lipid II targeting GPAs, we felt that the use of a species-centric approach did not sufficiently reflect the evolution of BGCs4. We adopted a molecule-centric perspective, treating BGCs as distinct and separate entities, which involved constructing a guide tree to serve as an equivalent of a species tree for the BGCs. Our main goal in doing so was to understand the gene gain/loss and the synchronicity of evolutionary events in the history of BGC evolution, with a focus on the genetic content of the BGCs and the simultaneous changes in the molecular structures of the encoded GPAs. To create the guide tree, we utilised aligned and concatenated GPA-NRPS genes from 29 complete GPA BGCs (Supplementary Fig. S2), which we chose because these NRPS genes offer the most accurate reflection of the BGCs' phylogenetic history. This approach allowed us to establish a BGC species tree in a similar manner to traditional species trees that rely on multiple conserved, vertically inherited, concatenated housekeeping genes. In this context, the most conserved congruent core domains within the GPA BGCs were treated as equivalent to housekeeping genes, enabling us to independently map gene gain and loss of tailoring enzymes, regardless of the bacterial species in which these events occurred.
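The concatenation step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the sequences and BGC names are made up, and a simple p-distance (fraction of differing sites) stands in for the phylogenetic inference that produced the actual guide tree. It only shows how per-gene alignments are joined into one row per BGC before any tree is built.

```python
from itertools import combinations


def concatenate(alignments):
    """Join per-gene aligned sequences into one concatenated row per BGC.

    Only BGCs present in every gene alignment are kept.
    """
    names = sorted(set.intersection(*(set(a) for a in alignments)))
    return {n: "".join(a[n] for a in alignments) for n in names}


def p_distance(a, b):
    """Fraction of differing sites, skipping columns with a gap in either row."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(rows):
    """Pairwise p-distances keyed by (name_1, name_2) in sorted order."""
    return {
        (m, n): p_distance(rows[m], rows[n])
        for m, n in combinations(sorted(rows), 2)
    }


# Toy, made-up alignments for two hypothetical NRPS genes in three
# hypothetical BGCs (not data from this study).
gene_a = {"bgc1": "ATGCA", "bgc2": "ATGCT", "bgc3": "AT-CT"}
gene_b = {"bgc1": "GGTAC", "bgc2": "GGTAC", "bgc3": "GCTAA"}

supermatrix = concatenate([gene_a, gene_b])
distances = distance_matrix(supermatrix)
```

A distance matrix of this kind could be passed to any standard tree-building method; the choice of method and substitution model is a separate decision that this sketch does not address.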

We observed that specific clades were mixed within different genera, suggesting the occurrence of horizontal gene transfer (HGT) during GPA evolution. This was clearly visible when comparing the guide tree of GPAs with a species tree of GPA producers (Fig. 3).

Fig. 3: Tanglegram of a species tree of GPA producers versus the gene tree of the respective GPA biosynthesis (NRPS-encoding) genes.

The species tree is a multilocus sequence tree based on concatenated housekeeping genes; Tistlia consotensis USBA 355 is used as the outgroup (OG). The types of GPA encoded are colour coded in the GPA gene tree (corresponding to the colours shown for each representative GPA type in Fig. 1). Grey circles represent type V GPAs used as outgroup in the NRPS tree, black circles represent computationally predicted GPAs where the product has not yet been characterised.

The resulting tanglegram provided evidence for multiple events of HGT into Amycolatopsis and Streptomyces. This process was accompanied by major rearrangements of gene synteny/order and explains the production of different types of GPAs by both genera. With GPA evolution not following species phylogeny, we inspected the main clades of the tree to determine the structure of the encoded GPA. Our analysis revealed that the major clades correlate with the peptide core of the GPA structure they encode (Fig. 4). Furthermore, the major rearrangements coinciding with HGT events do not necessarily result in the production of a new type of GPA. Most curiously, the distribution patterns of different types of GPA within the tree clearly showed that the more complex tetracyclic GPAs (like Tei) are in fact more like ancestral GPAs, while the Van tricyclic GPAs have undergone structural simplification. Ancestral state reconstruction considering the core biosynthesis genes present in GPA BGCs allowed us to predict the genes present in the BGC of the ancestral lipid II targeting GPA (here termed paleomycin) and revealed this putative ancestral GPA to be a tetracyclic peptide containing the same proteinogenic (Tyr) and non-proteinogenic (Bht, Hpg, Dpg) residues found in Tei-type GPAs. Ancestral state reconstruction also suggested that generation of the Bht precursor in paleomycin biosynthesis followed the Tei pathway (hydroxylation of NRPS-bound Tyr by a non-heme iron oxygenase (hydroxylase; Hyd)), and further that halogenation (99.9% likelihood) and glycosylation (99.3% likelihood for glucose at Hpg-4 (position 4 of the peptide); 97.5% likelihood for mannose at Dpg-7) occurred during the biosynthesis of paleomycin. The presence of an N-methyltransferase was less certain (53.8%), and thus this was not included in the predicted structure of paleomycin (Figs. S4–6, S8, S10–11, S17–18, S20–24, S26, S29).
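The decision of which tailoring steps to include in the predicted paleomycin structure can be written as a simple filter over the reported likelihoods. The sketch below uses the four values quoted above; the 95% cutoff is a hypothetical value chosen for illustration only, as no explicit threshold is stated in the text.

```python
# Likelihoods (%) for the ancestral node, as quoted in the text.
reported_likelihoods = {
    "halogenation": 99.9,
    "glucosylation at Hpg-4": 99.3,
    "mannosylation at Dpg-7": 97.5,
    "N-methylation": 53.8,
}


def supported_traits(likelihoods, cutoff):
    """Return the traits whose likelihood meets or exceeds the cutoff."""
    return [trait for trait, value in likelihoods.items() if value >= cutoff]


# Hypothetical cutoff for illustration; the source does not state one.
CUTOFF_PERCENT = 95.0

included = supported_traits(reported_likelihoods, CUTOFF_PERCENT)
excluded = [t for t in reported_likelihoods if t not in included]
```

With this assumed cutoff, the three well-supported modifications are retained and N-methylation is left out, which matches the outcome described in the text.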

Fig. 4: Diversification of the glycopeptide antibiotics (GPAs).

Major diversification events during the evolution of GPA biosynthesis (analysed with ancestral state reconstruction) are indicated on the phylogenetic tree of GPA biosynthetic genes by arrows pinpointing when major structural changes occurred during evolution. The GPA types are indicated (I-IV), with the two clusters of Ris/Tei-type GPAs differing in the mechanism of β-hydroxytyrosine (Bht) incorporation during biosynthesis (online: Tyr hydroxylation by a β-hydroxylase on main NRPS (orange); offline: formation of Bht by hydroxylation of Tyr on a separate minimal NRPS module (white)). Predicted GPA structures indicated for nodes 1, 4/7 and 8. M (module).

Reconstitution of paleomycin biosynthesis yields an active GPA

To validate the structural predictions based on our bioinformatic analyses, we next set out to reconstitute the biosynthesis of the peptide core of paleomycin (Fig. S31). The bioinformatically inferred DNA sequences of the ancestral NRPS genes (27.859 kb, nrpsanc; with identities between 76% and 85% to the NRPS genes of ristomycin (Table S4)) were synthesised in their entirety by ATG:synthetics (Merzhausen, Germany) and cloned into the p3SV vector, allowing integration into the surrogate host chromosome. To increase GPA expression levels, we introduced the strong artificial constitutive promoter Sp44* upstream of the nrpsanc genes, resulting in the plasmid pDI1 (Fig. S32). As a chassis for the expression of nrpsanc, we chose Amycolatopsis japonicum MG417-CF17, the producer of the tetracyclic GPA ristomycin (Ris, also known as ristocetin), as this strain possesses genes encoding key biosynthetic enzymes (peptide crosslinking P450 enzymes) as well as enzymes for the biosynthesis of the non-proteinogenic amino acids Hpg and Dpg15. In addition, A. japonicum MG417-CF17 encodes isoforms of the gatekeepers of the shikimate pathway, 3-deoxy-D-arabino-heptulosonate 7-phosphate (Dahp) synthase and prephenate dehydrogenase (Pdh), which are important to enhance GPA yield and are found in most GPA producers. We first engineered A. japonicum to selectively remove the ristomycin NRPS genes rpsA-D by homologous recombination using the plasmid pGUSA21_RistoKO containing upstream and downstream flanking regions (1.3 and 1.5 kb, respectively) of rpsA-D (Figs. S33–S35). To ensure transcription of the remaining genes of the Ris BGC, a second copy of the StrR family regulator gene bbR15 under the control of the strong constitutive promoter permE* was integrated into the genome of A. japonicum via the ΦC31 att site.

We next addressed differences in the pathways that lead to the generation of Bht for ristomycin production (Van-type) compared to paleomycin (Tei-type). We replaced the "offline" Van-type Bht forming cassette (including the three genes oxyD, rpsE and bhp) with that of a Tei-type non-heme iron oxygenase (hydroxylase) from Nonomuraea gerenzanensis ATCC 39727 (the producer of A40926, the precursor of dalbavancin), which acts "online" directly on the main NRPS (Figs. S36–S38). The absence of a halogenase gene in the Ris cluster was compensated by the introduction of the halogenase gene from N. gerenzanensis ATCC 3972716,17. Finally, we introduced the plasmid pDI1 via intergeneric conjugation into this optimised host to generate A. japonicum DI_nrpsanc. Comparative metabolic analysis revealed the presence of a distinct peak in the culture filtrate of A. japonicum DI_nrpsanc (Figs. S31, S39). The putative GPA was extracted, and tetracyclic Tei-like GPA compounds were analysed by liquid chromatography high resolution tandem mass spectrometry (LC-HR-MS/MS; Fig. 5). Detailed coupled HPLC-MS and MS/MS analysis in combination with molecular networking18 revealed a set of related ristomycin/paleomycin hybrid GPAs built from a common core paleomycin peptide backbone with distinct structural modifications installed by the Ris host-specific machinery (Figs. 5, S40–41). Inhibition assays confirmed the biological activity of these GPA extracts towards Bacillus subtilis (with no activity observed from the negative control, see Fig. S42), demonstrating the biosynthesis of the peptide core of paleomycin by the ancestral NRPS and its successful cyclisation by Oxy enzymes as predicted by bioinformatic analyses.

Fig. 5: GPA products from the integration of synthetic paleomycin NRPS genes into the modified ristomycin GPA (Ris) producer strain Amycolatopsis japonicum.

HR-MS/MS based molecular network analysis18 of the products of the modified GPA producer confirms the biosynthesis of ristomycin/paleomycin hybrid derivatives comprising the anticipated core peptide, halogenation pattern and glycosylation. Sugar abbreviations: D-glucose (glc), L-ristosamine (ria).

The evolution of paleomycin simplifies the GPA scaffold whilst retaining activity

Having revealed the composition of the BGC of paleomycin and the activity of this ancestral GPA, we next investigated the evolutionary pathway towards modern, simplified Van-type GPAs. Ancestral state reconstruction revealed at what stage major changes, including gene gain/loss, gene merger and amino acid exchange, occurred in those BGCs throughout their evolution (Fig. 4). Our analysis showed that significant changes occurred simultaneously in the ancestral node of all Van-type GPAs: two of the seven aromatic amino acids were altered from Hpg-1 and Dpg-3 into proteinogenic aliphatic amino acids Leu-1 and Asn-3 for Van and Ala-1 and Glu-3 for the Van-type GPA pekiskomycin (Pek)19. The loss of Hpg-1 and Dpg-3 prevents the typical F-O-G crosslink between these residues, leading to the loss of the OxyE P450-encoding gene from the BGCs of tricyclic Van-type GPAs. The fusion between module M2 and M3 of the NRPS also occurred at this juncture. Another important development in GPA evolution was the change in the biosynthesis of Bht from the Tei-type to the Van-type offline route of Bht supply (Figs. S5–S6). Reconstruction of the evolutionary history of GPA tailoring enzymes showed that these evolved either through duplication or were acquired via HGT, mostly from other Actinobacteria (SI Ancestral State Reconstruction; Figs. S7, S9, S12–16, S19, S25, S27–28). Indeed, gene loss/gain occurred far more often than modifications leading to alterations in the NRPS backbone, which is consistent with the retention of lipid II targeting activity primarily provided by the tricyclic GPA peptide backbone.

Considering the possible implications for engineering approaches, we delved deeper into the events that led to the alteration in the core GPA peptide sequence towards aliphatic amino acids. NRPS assembly lines are highly complex and dynamic20,21,22,23, making rational engineering challenging24,25. To understand how the adenylation (A)-domains26 found in these NRPS modules (key amino acid selection and activation domains that utilise an amino acid selection pocket that can be bioinformatically predicted27) changed their specificity towards different amino acids, we performed a phylogenetic analysis of the different GPA modules (Fig. S43), revealing that both the amino acid selection A-domain and peptide bond forming condensation (C)-domain within M3 of tricyclic Van-type GPAs do not clade with other M3 domains14. This suggests a different evolutionary origin for these domains, namely the recombination of M3, likely concomitant with the fusion of M2 with M3 to overcome the loss of protein/protein interactions resulting from the recombination event. The change in amino acid specificity in M1, however, was predicted to occur by point mutation rather than recombination. To explore these results, we performed sliding window analyses28 to assess recombination among the GPA NRPS genes (Fig. S30), which clearly showed recombination of C-/A-domains in M3 but not M1 for Van-type GPAs. Further evolution of position 3 of Van-type GPAs is likely (65% probability predicted by ASR) to have occurred from Van to Pek scaffolds, possibly via a second recombination event encompassing only the M3 A-domain. Thus, two different biosynthetic engineering mechanisms have occurred during GPA evolution towards Van-type GPA scaffolds, of which the point mutation strategy for altering A-domain function remained frustratingly opaque (Fig. S44).
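The idea behind a sliding window analysis can be sketched in a few lines of Python. This is a generic illustration and not the specific method of ref. 28: it slides a fixed-width window along two aligned sequences, records the fraction of identical positions, and flags windows where identity drops sharply, which is the kind of signal a recombination breakpoint would leave. The sequences, window size and drop threshold are all made up.

```python
def window_identity(seq_a, seq_b, window, step):
    """Return (start, identity) for each window along two aligned sequences.

    Gap characters are compared like any other symbol in this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    profile = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(x == y for x, y in zip(a, b))
        profile.append((start, matches / window))
    return profile


def identity_drops(profile, drop):
    """Start positions of windows whose identity falls by at least `drop`
    relative to the preceding window."""
    return [
        s2
        for (s1, i1), (s2, i2) in zip(profile, profile[1:])
        if i1 - i2 >= drop
    ]


# Toy, made-up aligned sequences: identical for the first half,
# unrelated in the second half.
query = "AAAAAAAACCCCCCCC"
reference = "AAAAAAAAGGGGGGGG"

profile = window_identity(query, reference, window=4, step=4)
candidates = identity_drops(profile, drop=0.5)
```

In a real analysis the comparison would be made against several candidate parent sequences with overlapping windows, and a drop in identity to one parent paired with a rise in identity to another would be the more convincing evidence for recombination.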

Evolving A-domains as a molecular pathway towards vancomycin production

To explore the process of natural mutation of the A-domain seen in GPA evolution, we assembled an enlarged phylogenetic tree based on the concatenation of the 7 A-domains in each of the 51 NRPSs known or predicted to produce a GPA scaffold (Figs. 6A, B, S45). This larger tree was used to more accurately infer sequence changes that happened at more modern nodes, further removed from paleomycin and closer to extant sequences. The tree topology shows that Van/Pek NRPSs are monophyletic and derived from a single common ancestor. From this tree, we then selected four key ancestral nodes (ANC1, ANC2, ANC3leu and ANC4) (Figs. 6A, S45) that we hypothesised would unveil the mechanism by which the substrate specificity of the A-domain in the first NRPS module (A1) evolved in tricyclic Van-type GPAs. Ancestral A1-domain codon sequences were inferred from this phylogeny, with the average posterior probability of the reconstructed codon sequences in the four ancestral A-domains ranging from 0.88 to 0.91 (Fig. S46).

Fig. 6: Evolution of M1 A-domain in GPAs has proceeded via point mutation.

A Condensed molecular phylogeny of GPA producing NRPSs based on concatenated MSAs of protein and codon sequences of A-domains (note: A-domain from module 6 of the teicoplanin NRPS is coloured as Tyr as this is the AA accepted by this A-domain). Nodes are labelled with bootstrap probability (500 reconstructions). The resurrected ancestral nodes marking the emergence of the Van/Pek scaffold are shown as coloured circles. A-domain specificity-conferring code shows the difference in amino acids aligned to the substrate binding pocket and residue changes are colour-coded to ancestral nodes (ANC1-4). Each clade is annotated with the amino acid substrate of A-domain 1–7 in the assembly line. The fully annotated phylogeny is presented in Fig. S45. B Similarity matrix of A-domains from module (M)1 (top) and M3 (bottom) used in construction of molecular phylogeny. M1 A-domains from the Van/Pek scaffold are boxed in orange and cyan for Leu and Ala selecting domains, respectively. M3 A-domains from the Van/Pek scaffold are boxed in dark red and pink for Asn and Glu selecting domains, respectively. Type V Hpg selecting domains are boxed in black. C A-domain activity of ancestral A-domains ANC1-4 and extant A1tei determined by NADH coupled PPi assay (kcat, min−1). Data are presented as mean values ± SEM, n = 3. D- and L-form of amino acid substrate tested is shown as yellow and pink bars, respectively, with the lower activity measurement shown in the foreground of the stacked bars.

Next, we characterised the molecular pathway of GPA evolution by expressing, isolating, and characterising these four ancestral A-domains in terms of their activity towards the anticipated substrates Hpg, Ala and Leu (both D- and L-forms due to the inclusion of a D-configured residue at position 1 despite the lack of E-domain in these modules)29. All ancestors were co-expressed with the MbtH-like protein Tcp13, as the presence of a comparable gene in all modern GPA clusters analysed to date implies that such proteins were also necessary and present in the ancestral gene clusters. The expressed domains were all active, displaying kcat values between 0.08 and 0.89 min−1 (ref. 30), which are comparable to the Tei A1 domain (A1tei; Fig. 6C). ANC1 exhibited an activation rate of 0.75 min−1 for L-Hpg, with a fourfold preference for the L-form. ANC2 and ANC3 were both selective for Leu, with ANC3 showing 1.7-/3.8-fold higher kcat values for L-Leu/D-Leu. The activity of ANC2 is consistent with this being an intermediate pocket, showing reduced activity that is improved through natural selection, and supports Van-type GPAs as being older than Pek-type GPAs. ANC4 was highly specific for D-Ala, with an activation rate of 0.89 min−1, >15-fold higher than activity towards L-Ala. Whilst unexpected, this preference is consistent with D-Ala being present due to cell wall biosynthesis and remodelling of the cell wall that occurs upon GPA biosynthesis31,32.

To visualise how the ancestral GPA Hpg pocket evolved to select alternate substrates, we turned to structural methods. Since ANC1-3 domains display polydisperse elution profiles in size exclusion chromatography (Fig. S47), we instead turned to the characterisation of an Acore construct of the Hpg accepting A1 domain from Tei biosynthesis (A1tei), which possesses a homologous pocket to ANC1. A1tei was crystallised together with the MbtH-like protein Tcp13 (PDB ID 8GJ4) to 2.70 Å resolution and also with the substrate L-Hpg to 1.64 Å resolution (PDB ID 8GIC; Fig. 7A, Table S6), revealing a structure highly similar to related enzymes from the adenylate forming enzyme family (e.g., comparison to PheA (1AMU)33, RMSD of 1.66 Å). The MbtH-like domain Tcp13 also exhibits the anticipated fold, containing a 3-stranded antiparallel β-sheet and an α-helix adjoining the centre of the sheet34,35. A365 of A1tei is sandwiched between two (W25 and W35) of the three tryptophan residues found in Tcp13. These tryptophan residues, strictly conserved in MbtH-like proteins, are crucial for complex formation34,35, and are located at the end of the second β-strand (W25), in the subsequent loop (W35), and in the C-terminal region behind the first α-helix (W54). The Hpg substrate is coordinated by three H-bonds to the α-amino group, one to the side group of D196 and two to the backbone carbonyl groups of L295 and G289. The aromatic ring of L-Hpg is stabilised by hydrophobic interactions with the sidechain of L295 and the main chain of G264, whilst the phenol moiety of L-Hpg is hydrogen bonded to H237 and to the amide nitrogen of G263 (Fig. 7A). The orientation of the H237 imidazole is maintained through three key interactions, two being hydrophobic interactions between the imidazole C5 and L261/L287. A water mediated interaction between E201 and the imidazole Nπ further contributes to imidazole positioning. This Glu residue is widely conserved among NRPS A-domains and serves to maintain the desired orientation of the loop immediately prior to the key α-amino coordinating acidic residue (Fig. 7A, D196 in A1tei).

Fig. 7: Structures of GPA M1 A-domains during GPA evolution from paleomycin to Van/Pek.

A–D Comparison of substrate binding pockets shown in light blue sticks with residues that have changed between ancestors shown in colour marking ancestral node in phylogeny of Fig. 6 (above). Bound Hpg substrate shown in black for A1tei (which has the same specificity pocket as ANC1) and A1ANC3 (Leu). No substrate bound structure was obtained for A1ANC2 and ANC4, with substrates shown in stippled lines and greyed out to show an approximate position of bound substrate. E–G Superposed and contoured pocket surfaces of A1tei, A1ANC3 and ANC4 compared with A1ANC2, the intermediate pocket, in two views to show how the changes altered the pocket size and shape (A1tei – slate blue, A1ANC2 – brown, A1ANC3 – orange, ANC4 – turquoise).

To understand the effects of the evolutionary mutations seen in the binding pocket of the ANC A-domains, we then mutated the residues of the substrate binding pocket of A1tei to those seen in A1ANC2 and A1ANC3. The structures of these mutant domain Acore constructs were determined to 2.69 Å and 1.88 Å resolution, respectively (PDB IDs 8GJP and 8GKM; Table S6, Supplementary Fig. S48). Comparison of the A1ANC2 pocket with the A1tei/ANC1 Hpg pocket revealed that the amino acid substitutions both prevent Hpg binding and open the pocket to accept proteinogenic amino acids instead of Hpg (Fig. 7B). The substitution H237Y is key in this process, as it removes H-bonding to the 4-hydroxyl group of Hpg and narrows the pocket to exclude the planar Hpg residue (Fig. 7B). The L295V mutation then opens a cavity that allows the binding of proteinogenic amino acids with a 109° angle at the β carbon. These two key substitutions (H237Y and L295V) enabled binding of new substrates and potentiated the binding pocket for subsequent refinement and partition of substrate binding mode. Comparing the A1ANC3 pocket to A1ANC2 shows further refinement of the pocket through additional substitutions (L287M and V295M), which serve to improve van der Waals contacts in the binding pocket for Leu (Fig. 7C, Supplementary Fig. S49). Finally, we crystallised the D-Ala accepting ANC4 domain as an Acore construct to 3.12 Å resolution (PDB ID 8GLC) because of the instability of this pocket graft into A1tei (Fig. 7D). In this structure, the substrate binding pocket accommodating the small D-alanine is mainly formed by the three residues D196 (numbering for comparison to other structures; D190 in crystal structure), A197 (A191) and L287 (L281). The reduction in pocket size is largely attributed to the C296W (W290) substitution, with the increase in bulk displacing L287 (L281) and shifting the residue upwards by ~3 Å, forming the lower surface of the pocket. While Y237 (Y231) is hydrogen bonded to the main chain carbonyl of G263 (G257) in the A1ANC2 and A1ANC3 pocket, this residue is rotated downwards in ANC4. This rotation appears to be facilitated by a hydrogen bond between Y237 (Y231) and the L261Q (Q255) substitution. The new Y237 (Y231) rotamer displaces F200 (F194) and allows for the loop and β-sheet in which G263 (G257) resides to shift downwards, thus further limiting the pocket surface. The ability to interrogate the stepwise changes in the GPA NRPS via mutation of A1 provides important insights into their evolutionary process, showing that the switch from Hpg (ANC1) to Leu (ANC3) did not result in an intermediate pocket (ANC2) capable of activating both residues. Instead, this was a direct selectivity switch that was subsequently refined to provide the modern Van-type GPA A1-domains.

Discussion

Here, we were able to show that all lipid II targeting GPAs evolved from a common ancestor, to predict and produce the peptide core of this common ancestor, and to revive an ancestral antibiotic. In doing so, we were able to follow the evolutionary history of GPA diversification on the molecular level and understand both which changes occurred in the past and when these took place. In this regard, understanding the evolution of GPAs is exemplary since they retain a common mechanism of action whilst displaying significant differences in their core peptide structures. Our results show that the ancestor of all lipid II targeting GPAs, which we term paleomycin, is predicted to possess a complex, tetracyclic heptapeptide core composed entirely of aromatic residues. These results are in agreement with the results of ref. 14, although different approaches were used to examine the evolution of these molecules. Paleomycin therefore resembles a modern Tei-type GPA, implying that tricyclic, Van-type GPAs containing aliphatic residues have evolved from a more complex precursor. In doing so, evolution of the peptide backbone has been driven by the alteration of the peptide producing NRPS assembly line, with effective concurrent alterations in the first 3 modules (M1-3) of the NRPS, the loss of the 4th P450 cyclisation enzyme (OxyE) and replacement of Bht generation on the main NRPS (Tei-type) with an offline mechanism for Bht formation (Van-type). Whilst the exact order of these steps cannot be precisely defined, biochemical experiments suggest that the presence of OxyE, even in an inactive form, is required to effectively crosslink a peptide containing a Dpg-3 residue36, suggesting these changes were tightly coupled. Alteration of Bht incorporation could help overcome a lack of A-domain fidelity in M637, and prevent the synthesis of GPAs lacking the β-OH group, a crucial attachment site for post-NRPS modification in modern GPAs.

Analysis of the changes in the NRPS from a putative tetracyclic GPA to a tricyclic one indicates the adoption of two reengineering strategies in the evolution of these antibiotics. Our results revealed that both recombination of assembly lines and point mutation of A-domains have occurred during GPA evolution. The replacement of the ancestral GPA M3 by an Asn-encoding module via recombination was observed simultaneously with the fusion event involving M2 and M3. This observation is consistent with experimental evidence that the fusion of M2 and M3 within natural GPA assembly lines results in the subsequent loss of peptide extension29. This suggests that GPA evolution overcame the loss of natural inter-module affinity by module fusion. In terms of A1-domain evolution, our results indicate that the alteration of the Hpg binding pocket was driven by three crucial changes that together eliminated Hpg binding whilst enabling the acceptance of Leu. Optimising mutations then provided the route to the modern M1 A-domains that we see in Van-type GPAs, with Ala-activating domains undergoing further major reengineering to finally activate D-Ala. Structural snapshots of this process demonstrated how major changes in substrate can be accommodated in A-domains, and will no doubt prove key to informing targeted engineering strategies to produce new molecules.

Perhaps the most intriguing question is why GPAs have evolved towards a Van-type scaffold given the limited direct influence that this has had on GPA activity. One possibility is that modern GPAs could have been optimised to generate favourable secondary interactions to further interfere with bacterial cell wall biosynthesis, although Tei-type scaffolds appear to be more robust in this regard38. Alternatively, modern GPAs offer a route to improve the titer by tapping into the cellular pool of amino acids rather than relying on the shikimate pathway for the supply of building blocks. Indeed, paleomycin requires seven shikimate pathway-derived building blocks whilst vancomycin/pekiskomycin require only five39. Simplifying the GPA while retaining its specific biological activity might reduce production costs and increase fitness by saving important energetic and molecular resources that can be used for growth and primary metabolism.

In the context of metabolic engineering efforts, this reminds us that, from the perspective of the producer, any step along the entire biosynthetic apparatus can be optimised, and that when engineering assembly lines the suitable provision of precursors must also be addressed to make these assembly lines effective at producing new peptides40. Taken together, our study has revealed paleomycin as the predicted ancestor of all lipid-II targeting GPAs, showing how modern Van-type GPAs have evolved from this Tei-type scaffold through recombination and mutation of the peptide-producing NRPS together with the insertion and deletion of genes encoding precursor generating and peptide modifying enzymes. Our study demonstrates the power of ancestral state reconstruction combined with biochemical and structural approaches to delve into the evolutionary past of antibiotic biosynthesis and understand how changes at the DNA level translate into changes at the enzymatic level to increase peptide diversity. In this way, the evolutionary past of GPAs provides vital insights into the evolutionary mechanisms of secondary metabolism, gives rise to new "ancestral" molecules, and explains how nature's largest chemical diversity unfolded.

Methods

Computational techniques

Ancestral sequence reconstruction of glycopeptide antibiotic (GPA) DNA

Sequences from all GPA BGCs available by May 2018 (from full genomes and the chloroeremomycin plasmid sequences, see Supplementary Table S1 for strains and accession numbers) were used for the ancestral sequence reconstruction. GPA BGCs were identified using antiSMASH 4 (ref. 41). While the actual sequence reconstruction was performed based on nucleotide sequences, the guide tree for ancestral sequence reconstruction was base...
