Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules

Published: July 25, 2013
doi: 10.3791/50476

Summary

We developed computational de novo protein design methods capable of tackling several important areas of protein design. To disseminate these methods we present Protein WISDOM, an online tool for protein design (http://www.proteinwisdom.org). Starting from a structural template, design of monomeric proteins for increased stability and complexes for increased binding affinity can be performed.

Abstract

The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.

To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization-based sequence selection stage that aims to improve stability by minimizing potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
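The sequence selection stage described above can be illustrated with a toy sketch. This is not the Protein WISDOM implementation: the three-position template `ALLOWED`, the pairwise energy table `PAIR_ENERGY`, and the function names are all hypothetical, and exhaustive enumeration stands in for whatever optimization method the tool actually uses. The sketch only shows the idea of scoring candidate sequences by a potential energy and handing a rank-ordered list to the later stages.

```python
from itertools import product

# Hypothetical design template: three positions, each with its allowed residues.
ALLOWED = [("A", "L"), ("K", "E"), ("D", "R")]

# Hypothetical pairwise energies (arbitrary units, lower is better).
# Residue pairs that are not listed contribute 0.
PAIR_ENERGY = {
    ("L", "K"): -1.0,
    ("K", "D"): -2.0,
    ("E", "R"): -2.5,
    ("A", "E"): -0.5,
    ("L", "E"): 0.5,
}


def sequence_energy(seq):
    """Sum the toy energies over sequence-adjacent residue pairs."""
    return sum(PAIR_ENERGY.get((a, b), 0.0) for a, b in zip(seq, seq[1:]))


def rank_sequences(allowed, top_n=3):
    """Enumerate every allowed sequence and return the top_n lowest-energy ones."""
    scored = [("".join(s), sequence_energy(s)) for s in product(*allowed)]
    # Sort by energy first, then alphabetically so ties are deterministic.
    scored.sort(key=lambda item: (item[1], item[0]))
    return scored[:top_n]


if __name__ == "__main__":
    for seq, energy in rank_sequences(ALLOWED):
        print(f"{seq}\t{energy:+.1f}")
```

Enumeration is only feasible for a handful of positions; a realistic design problem needs a proper optimization solver, which is outside the scope of this sketch.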

Introduction

De novo protein design is the identification of protein sequences that will yield a desired tertiary structure with improved properties or function. Since the native fold of a protein is the conformation which lies at the free energy minimum, de novo protein design seeks sequences that will have a free energy minimum in the target fold. This problem was first described by Drexler1 and Pabo2 and was referred to as the “inverse folding problem.” However, unlike the protein folding problem, where a sequence can yield only one folded structure solution, the de novo protein design problem exhibits degeneracy. Many different amino acid sequences can yield the same tertiary structure and function.
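One way to write the design criterion of this paragraph is given below. The notation is an assumption introduced for illustration and does not appear in the original: $s$ is an amino acid sequence, $x$ a conformation, $x^{*}$ the target fold, and $G(s, x)$ the free energy of sequence $s$ in conformation $x$.

```latex
\[
  \mathcal{S}(x^{*}) \;=\; \Bigl\{\, s \;:\; x^{*} = \arg\min_{x} G(s, x) \,\Bigr\}
\]
```

Under the one-structure-per-sequence view stated above, folding assigns each $s$ a single minimizing conformation, whereas the degeneracy of the design problem means that the set $\mathcal{S}(x^{*})$ generally contains many sequences.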

While protein design has traditionally been performed experimentally through rational design and directed evolution, computational methods have more recently been employed to overcome the limited search space inherent in experimental methods. A variety of computational methods have been used, including deterministic methods, stochastic methods, and probabilistic methods.3,4 Early computational methods used fixed-backbone templates to make the problem easier to solve.5-7 With the advent of faster processors, high performance computing, and more efficient algorithms, backbone flexibility has been incorporated by using an ensemble of fixed-backbone templates8-14 or by incorporating true backbone flexibility by expressing the template in terms of ranges of atom-to-atom distances and dihedral angles.15,16

This paper describes in detail Protein WISDOM, an online tool that has been made available to the academic community to utilize our computational de novo protein design framework. This framework has been applied to the design of numerous proteins for therapeutic use, targeting diseases such as HIV, cancer, complement diseases, and other autoimmune disorders. Many of the predicted peptides were experimentally validated, demonstrating the power of the method. Table 1 provides a summary of the different proteins that have been designed, including the size of the protein or peptide, the number of predictions, and experimental validation.

| Protein Design | Protein Length | # of Computational Predictions | # of Experimental Validations | Reference |
|---|---|---|---|---|
| Full sequence design of human beta-defensin-2 | 41 | 340 | | (17) |
| Compstatin inhibitors of human C3 | 13 | 28 | 3/3 | (18, 19) |
| Compstatin analogues that bind to rat C3c | 13 | 5 | | (20) |
| Compstatin analogues with di-serine extension | 15 | 8 | | |
| Stabilizing structure of compstatin analog W4A9 | 13 | 18 | | |
| C3a receptor agonists and antagonists | 77 | 20 | 4/7 | (21) |
| C5a receptor agonists and antagonists | 74 | 61 | 2/61 | |
| HIV-1 gp41 inhibitors | 12 | 6 | 4/5 | (22) |
| HIV-1 gp120 inhibitors | 9 | 14 | | |
| Bak inhibitors of Bcl-xL and Bcl-2 | 16-18 | 10 | 5/5 | (23) |
| Inhibitors of ERK2 | 11 | 25 | | |
| Inhibitors of EZH2 | 21 | 17 | 10/10 | (24) |
| Inhibitors of LSD1 and LSD2 | 16 | 41 | 17/20 | |
| Inhibitors of HLA-DR1 | 13 | 6 | | (25) |
| Inhibitors of PNP | 5 | 13 | | |

Table 1. Summary of designed proteins and peptides using the de novo protein design framework. The # of computational predictions is presented as the number of favorable predictions (i.e. fold specificities above a certain cutoff or approximate binding affinities greater than the native sequence). The # of experimental validations gives two numbers: the first is the number of predictions that were experimentally validated while the second is the total number of predictions that were tested experimentally.
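The "validated/tested" notation of the experimental validation column can be turned into success rates with a few lines of Python. The sketch below is illustrative only: the dictionary copies the entries that Table 1 reports, while the function names and the idea of pooling the counts across systems are assumptions made here, not part of the original analysis.

```python
from fractions import Fraction

# "validated/tested" entries copied from the experimental validation column
# of Table 1; systems without experimental data are left out.
VALIDATIONS = {
    "Compstatin inhibitors of human C3": "3/3",
    "C3a receptor agonists and antagonists": "4/7",
    "C5a receptor agonists and antagonists": "2/61",
    "HIV-1 gp41 inhibitors": "4/5",
    "Bak inhibitors of Bcl-xL and Bcl-2": "5/5",
    "Inhibitors of EZH2": "10/10",
    "Inhibitors of LSD1 and LSD2": "17/20",
}


def parse_entry(entry):
    """Split 'validated/tested' into a pair of integers."""
    validated, tested = (int(part) for part in entry.split("/"))
    return validated, tested


def success_rate(entry):
    """Return the fraction of tested predictions that were validated."""
    validated, tested = parse_entry(entry)
    return Fraction(validated, tested)


def pooled_counts(entries):
    """Sum validated and tested counts over several systems."""
    pairs = [parse_entry(e) for e in entries]
    return sum(v for v, _ in pairs), sum(t for _, t in pairs)


if __name__ == "__main__":
    for system, entry in VALIDATIONS.items():
        print(f"{system}: {float(success_rate(entry)):.0%}")
    print("pooled:", pooled_counts(VALIDATIONS.values()))
```

A pooled count treats very different design problems as one population, so the per-system rates are usually the more informative figures.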

Design of human-beta-defensin-2 (hβD-2) was performed to enhance the peptide’s antimicrobial property.17 For this design, we considered two cases: 1) up to 10 mutations along hβD-2 and 2) full sequence design of all hβD-2 residue positions except the Cysteines (8, 15, 20, 30, 37, and 38). Three different design templates and three different sequence selection models were utilized in the design. High levels of similarity in mutations were observed between the weighted average and distance bin models for both the 10 mutation design and the full sequence design. Additionally, a large number of sequences were found to have more favorable calculated Fold Specificity values than the native sequence.

Complement system inhibitors (of C3, C3a, and C5a) were designed to combat a number of immune diseases such as stroke, heart attack, Alzheimer’s disease, asthma, rheumatoid arthritis, rejection of xenotransplantation, adult respiratory disease, psoriasis, and Crohn’s disease. Three compstatin inhibitors of C3c predicted by the protein design framework plus three rationally designed sequences were experimentally validated to be better binders than the native compstatin.18,19

Further studies examined the loss of activity of compstatin against non-primate C3c and designed a number of candidate rat and mouse C3c inhibitors. Five sequences were shown to have more favorable association free energies with rat C3c than the W4A9 compstatin mutant known to inhibit C3c. This improvement is attributed to the formation of a new salt bridge by Arg1.20 Eight sequences with an N-terminal extension were predicted to be better binders than W4A9 with a di-serine extension. Finally, 18 compstatin sequences were predicted to stabilize the bound conformation of W4A9, providing strong candidates for primate and non-primate C3c inhibitors.

In addition to C3c inhibitors, C3a and C5a receptor agonists and antagonists were designed based upon the structures of C3a and C5a. Seven C3a sequences predicted by the model were experimentally tested. Two of the sequences were potent agonists while two others were partial agonists.21 The two potent agonists showed a 58-fold improvement over a previously discovered “superagonist”. The design of C5a receptor agonists and antagonists provided a set of 61 sequences. All the sequences were synthesized and two were found to be novel C5a agonists.

Fusion inhibitors of HIV-1, the virus that causes AIDS, were designed to prevent HIV-1 from infecting cells. The first design targeted gp41, an envelope glycoprotein of HIV-1. The protein design framework predicted six sequences that were better binders than the native sequence. Four of these predicted sequences were experimentally validated to inhibit HIV-1 with the best sequence having an IC50 as low as 29 μM. This sequence showed a 3-15 fold improvement over the native sequence and had no loss of activity against an Enfuvirtide-resistant virus strain.22 The second design targeted gp120, another envelope glycoprotein of HIV-1. Fourteen sequences were predicted to be binders of gp120 and provide additional potential fusion inhibitors of HIV-1.

Numerous proteins linked to cancer provided promising targets for cancer therapeutics. Bcl-2 and Bcl-xL are anti-apoptotic proteins that prevent cell death. Inhibitors of these two proteins were designed to induce cell death in cancer cells. Ten sequences were predicted to be better binders than the native, and these results captured previous experimental and mutagenesis results.23 Another target protein, ERK2, is involved in signal-transduction cascades that make it a promising target for antiproliferative cancer therapies. Twenty-five sequences were predicted to be inhibitors of ERK2.

Histone methyltransferases and demethylases dynamically control histone methylation, which has been linked to many cancer types including prostate, breast, lymphoma, myeloma, bladder, colon, skin, liver, endometrial, lung, and gastric. The de novo protein design framework identified 17 inhibitors of EZH2 (a lysine methyltransferase), and all ten that were experimentally tested were found to inhibit EZH2.24 The most potent peptide had an IC50 of about 13 μM, was equally effective with elevated enzyme concentrations, and did not compete with the cofactor. These peptides were the first set of inhibitors of EZH2. Fifty-three inhibitors of LSD1 (a demethylase) were predicted by the framework; of the 20 experimentally tested, 17 were inhibitors of LSD1 and 18 were inhibitors of LSD2. The best inhibitors had IC50 values below 1 μM, making them the most potent peptidic inhibitors discovered to date.

The final two protein systems provided targets for treating various autoimmune diseases such as Coeliac disease, diabetes mellitus type 1, systemic lupus erythematosus, Sjögren’s syndrome, Churg-Strauss syndrome, Hashimoto’s thyroiditis, Graves’ disease, idiopathic thrombocytopenic purpura, rheumatoid arthritis, and allergies. Although none of these potential inhibitors has yet been experimentally validated, the framework predicted six sequences that bind to HLA-DR1 and 13 sequences that bind to PNP.

Table 2 summarizes experimentally validated inhibitors and agonists predicted using the de novo protein design framework. The approximate binding affinity metric was used to predict nine of the sequences (inhibitors of human C3c, HIV-1 gp41, EZH2, LSD1, and LSD2), while the fold specificity metric was used to identify four of the sequences (agonists/antagonists of C3aR). These peptides highlight the success of the de novo protein design framework, particularly the added approximate binding affinity metric. The framework is extremely versatile in its applicability. Six different proteins linked to twenty-five different diseases have been successfully designed and experimentally validated.

| Name | IC50 | EC50 | Protein Target | Applicable Diseases |
|---|---|---|---|---|
| SQ027 | 0.94 μM | | human C3c | stroke, heart attack, Alzheimer’s disease, asthma, rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, psoriasis, diabetes type I, Crohn’s disease, pancreatitis, and cystic fibrosis |
| SQ086 | 1.98 μM | | human C3c | |
| SQ059 | 4.73 μM | | human C3c | |
| SQ110-4 | | 15.2 nM | C3aR | |
| SQ060-4 | | 36.4 nM | C3aR | |
| SQ007-5 | 15.4 nM | | C3aR | |
| SQ002-5 | 26.1 nM | | C3aR | |
| SQ435 | 29 – 253 μM | | HIV-1 gp41 | AIDS |
| SQ037 | 13.57 μM | | EZH2 | prostate, breast, lymphoma, myeloma, bladder, colon, skin, liver, endometrial, lung, and gastric cancers |
| SQ011-1 | 0.521 μM | | LSD1 | |
| SQ016-1 | 0.249 μM | | LSD1 | |
| SQ026-1 | 2.51 μM | | LSD2 | |
| SQ015-1 | 1.332 μM | | LSD2 | |

Table 2. Computationally predicted and experimentally validated peptides targeting various diseases.
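Table 2 mixes micromolar and nanomolar values, so comparing potencies requires a common unit. The following sketch converts the single-valued IC50 entries to nanomolar and picks the lowest IC50 per target. The values are copied from Table 2; the function names, the ASCII unit label `uM` (standing in for μM), and the choice to skip the SQ435 range and the EC50-only rows are assumptions made for this illustration.

```python
# Conversion factors to nanomolar; "uM" is an ASCII stand-in for micromolar.
TO_NM = {"nM": 1.0, "uM": 1000.0}

# (name, IC50 value, unit, target) copied from Table 2.
# SQ435 (reported as a range) and the EC50-only rows are omitted.
IC50_ROWS = [
    ("SQ027", 0.94, "uM", "human C3c"),
    ("SQ086", 1.98, "uM", "human C3c"),
    ("SQ059", 4.73, "uM", "human C3c"),
    ("SQ007-5", 15.4, "nM", "C3aR"),
    ("SQ002-5", 26.1, "nM", "C3aR"),
    ("SQ037", 13.57, "uM", "EZH2"),
    ("SQ011-1", 0.521, "uM", "LSD1"),
    ("SQ016-1", 0.249, "uM", "LSD1"),
    ("SQ026-1", 2.51, "uM", "LSD2"),
    ("SQ015-1", 1.332, "uM", "LSD2"),
]


def to_nanomolar(value, unit):
    """Convert a concentration to nanomolar."""
    return value * TO_NM[unit]


def most_potent_by_target(rows):
    """Return {target: (name, IC50 in nM)} for the lowest IC50 per target."""
    best = {}
    for name, value, unit, target in rows:
        ic50_nm = to_nanomolar(value, unit)
        if target not in best or ic50_nm < best[target][1]:
            best[target] = (name, ic50_nm)
    return best


if __name__ == "__main__":
    for target, (name, ic50_nm) in most_potent_by_target(IC50_ROWS).items():
        print(f"{target}: {name} ({ic50_nm:.1f} nM)")
```

The comparison is made within each target only, since IC50 values measured against different proteins and assays are not directly comparable.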

Protocol

Method Overview

The de novo design framework used in Protein WISDOM consists of two stages. The first stage produces a rank-ordered list of amino acid sequences that will fold into a given template structure. The second stage validates these sequences by calculating fold specificity, approximate binding affinity, or both. The former is primarily used when the design is of a single protein, while the latter is used when the design is of a complex (a peptide binding to a target pr…

Representative Results

De Novo Design of Entry Inhibitors for HIV-1

The de novo design framework implemented in Protein WISDOM has been used for the design of inhibitor peptides for several important therapeutic systems (Tables 1 and 2). One system of note is the design of peptides to inhibit HIV-1 entry to the host cell receptor CD4, which is here used as a representative system to demonstrate the practical use of the Protein WISDOM interface. The peptides were designed to target the…

Discussion

The de novo protein design framework consists of two stages, a sequence selection stage and a validation stage. The framework is robust enough to handle rigid and flexible design templates, and can be applied to single protein design or complex protein design. The framework has been successfully applied to numerous protein systems with applications to dozens of diseases. A number of the designs have been experimentally validated, providing the most potent inhibitors or agonists of some proteins discovered to dat…

Disclosures

The authors have nothing to disclose.

Acknowledgements

CAF gratefully acknowledges support from NSF, NIH (R01 GM52032; R24 GM069 736), and the US Environmental Protection Agency, EPA (R 832721-010). A portion of this research was made possible with Government support by DoD, Air Force Office of Scientific Research. JS gratefully acknowledges support from NIH (P50GM071508-06). MLBP gratefully acknowledges support from a National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a. GAK gratefully acknowledges support from a National Science Foundation Graduate Research Fellowship under grant number DGE-1148900.

References

  1. Drexler, K. Molecular engineering: An approach to the development of general capabilities for molecular manipulation. Proc. Natl Acad. Sci. U.S.A. 78, 5275-5278 (1981).
  2. Pabo, C. Molecular technology: Designing proteins and peptides. Nature. 301, 200 (1983).
  3. Floudas, C. A. Research challenges, opportunities and synergism in systems engineering and computational biology. AIChE J. 51, 1872-1884 (2005).
  4. Fung, H. K., Welsh, W. J., Floudas, C. A. Computational de novo peptide and protein design: Rigid templates versus flexible templates. Ind. Eng. Chem. Res. 47, 993-1001 (2008).
  5. Ponder, J., Richards, F. Tertiary templates for proteins. J. Mol. Biol. 193, 775-791 (1987).
  6. Dahiyat, B. I., Mayo, S. L. Protein design automation. Protein Sci. 5, 895-903 (1996).
  7. Dahiyat, B. I., Gordon, D. B., Mayo, S. L. Automated design of the surface positions of protein helices. Protein Sci. 6, 1333-1337 (1997).
  8. Su, A., Mayo, S. L. Coupling backbone flexibility and amino acid sequence selection in protein design. Protein Sci. 6, 1701-1707 (1997).
  9. Desjarlais, J., Handel, T. Side chain and backbone flexibility in protein core design. J. Mol. Biol. 290, 305-318 (1999).
  10. Farinas, E., Regan, L. The de novo design of a rubredoxin-like Fe site. Protein Sci. 7, 1939-1946 (1998).
  11. Harbury, P. B., Plecs, J. J., Tidor, B., Alber, T., Kim, P. S. High-resolution protein design with backbone freedom. Science. 282, 1462-1467 (1998).
  12. Koehl, P., Levitt, M. De novo protein design: I. In search of stability and specificity. J. Mol. Biol. 293, 1161-1181 (1999).
  13. Koehl, P., Levitt, M. De novo protein design. II. Plasticity in sequence space. J. Mol. Biol. 293, 1183-1193 (1999).
  14. Kuhlman, B., Dantas, G., Ireton, G., Varani, G., Stoddard, B., Baker, D. Design of a novel globular protein fold with atomic-level accuracy. Science. 302, 1364-1368 (2003).
  15. Klepeis, J. L., Floudas, C. A. Integrated structural, computational and experimental approach for lead optimization: Design of compstatin variants with improved activity. J. Am. Chem. Soc. 125, 8422-8423 (2003).
  16. Klepeis, J. L., Floudas, C. A., Morikis, D., Tsokos, C. G., Lambris, J. D. Design of peptide analogs with improved activity using a novel de novo protein design approach. Ind. Eng. Chem. Res. 43, 3817-3826 (2004).
  17. Fung, H. K., Floudas, C. A., Taylor, M. S., Zhang, L., Morikis, D. Toward full-sequence de novo protein design with flexible templates for human beta-defensin-2. Biophys. J. 94, 584-599 (2008).
  18. Bellows, M. L., Fung, H. K., Floudas, C. A., López de Victoria, A., Morikis, D. New compstatin variants through two de novo protein design frameworks. Biophys. J. 98, 2337-2346 (2010).
  19. López de Victoria, A., Gorham, R. D. A new generation of potent complement inhibitors of the compstatin family. Chem. Biol. Drug Des. 77, 431-440 (2011).
  20. Tamamis, P., López de Victoria, A. Molecular dynamics in drug design: New generations of compstatin analogs. Chem. Biol. Drug Des. 79, 703-718 (2012).
  21. Bellows-Peterson, M. L., Fung, H. K. De novo peptide design with c3a receptor agonist and antagonist activities: Theoretical predictions and experimental validation. J. Med. Chem. 55, 4159-4168 (2012).
  22. Bellows, M. L., Taylor, M. S. Discovery of entry inhibitors for HIV-1 via a new de novo protein design framework. Biophys. J. 99, 3445-3453 (2010).
  23. Sun, J. J., Abdeljabbar, D. M., Clarke, N. L., Bellows, M. L., Floudas, C. A., Link, A. J. Reconstitution and engineering of apoptotic protein interactions on the bacterial cell surface. J. Mol. Biol. 394, 297-305 (2009).
  24. Smadbeck, J., Bellows-Peterson, M. L. De novo protein design and validation of histone methyltransferase inhibitors. , (2013).
  25. Bellows, M. L., Fung, H. K., Floudas, C. A., Adjiman, C. S., Galindo, A. . Molecular Systems Engineering, Process Systems Engineering. 6, 207-232 (2010).
  26. Rajgaria, R., McAllister, S. R., Floudas, C. A. A novel high resolution Cα-Cα distance dependent force field based on a high quality decoy set. Proteins. 65, 726-741 (2006).
  27. Rajgaria, R., McAllister, S. R., Floudas, C. A. Distance dependent centroid to centroid force fields using high resolution decoys. Proteins. 70, 950-970 (2008).
  28. Fung, H. K., Taylor, M. S., Floudas, C. A. Novel formulations for the sequence selection problem in de novo protein design with flexible templates. Optim. Method. Softw. 22, 51-71 (2007).
  29. Fung, H. K., Rao, S., Floudas, C. A., Prokopyev, O., Pardalos, P. M., Rendl, F. Computational comparison studies of quadratic assignment like formulations for the in silico sequence selection problem in de novo protein design. J. Comb. Optim. 10, 41-60 (2005).
  30. CPLEX. . Using the CPLEX Callable Library. , (1997).
  31. Klepeis, J. L., Floudas, C. A. Free energy calculations for peptides via deterministic global optimization. J. Chem. Phys. 110, 7491-7512 (1999).
  32. Klepeis, J. L., Floudas, C. A., Morikis, D., Lambris, J. D. Predicting peptide structures using NMR data and deterministic global optimization. J. Comput. Chem. 20, 1354-1370 (1999).
  33. Klepeis, J. L., Schafroth, H. D., Westerberg, K. M., Floudas, C. A. Deterministic global optimization and ab initio approaches for the structure prediction of polypeptides, dynamics of protein folding and protein-protein interactions. Adv. Chem. Phys. 120, 265-457 (2002).
  34. Klepeis, J. L., Floudas, C. A. Ab initio prediction of helical segments of polypeptides. J. Comput. Chem. 23, 246-266 (2002).
  35. Klepeis, J. L., Floudas, C. A. Prediction of beta-sheet topology and disulfide bridges in polypeptides. J. Comput. Chem. 24, 191-208 (2003).
  36. Klepeis, J. L., Floudas, C. A. ASTRO-FOLD: A combinatorial and global optimization framework for ab initio prediction of three-dimensional structures of proteins from the amino acid sequence. Biophys. J. 85, 2119-2146 (2003).
  37. Klepeis, J. L., Pieja, M. T., Floudas, C. A. A new class of hybrid global optimization algorithms for peptide structure prediction: Integrated hybrids. Comput. Phys. Commun. 151, 121-140 (2003).
  38. Klepeis, J., Pieja, M., Floudas, C. Hybrid global optimization algorithms for protein structure prediction: Alternating hybrids. Biophys. J. 84, 869-882 (2003).
  39. Klepeis, J. L., Floudas, C. Analysis and prediction of loop segments in protein structures. Comput. Chem. Eng. 29, 423-436 (2005).
  40. Mönnigmann, M., Floudas, C. Protein loop structure prediction with flexible stem geometries. Proteins. 61, 748-762 (2005).
  41. McAllister, S. R., Mickus, B. E., Klepeis, J. L., Floudas, C. A. A novel approach for alpha-helical topology prediction in globular proteins: Generation of interhelical restraints. Proteins. 65, 930-952 (2006).
  42. Floudas, C. A., Fung, H. K., McAllister, S. R., Mönnigmann, M., Rajgaria, R. Advances in protein structure prediction and de novo protein design: A review. Chem. Eng. Sci. 61, 966-988 (2006).
  43. Subramani, A., Wei, Y., Floudas, C. A. ASTRO-FOLD 2.0: An enhanced framework for protein structure prediction. AIChE J. 58, 1619-1637 (2012).
  44. Wei, Y., Thompson, J., Floudas, C. Concord: a consensus method for protein secondary structure prediction via mixed integer linear optimization. P. Roy. Soc. A-Math. Phy. 468, 831-850 (2011).
  45. Subramani, A., Floudas, C. β-sheet topology prediction with high precision and recall for β and mixed α/β proteins. PLoS One. 7, e32461 (2012).
  46. Rajgaria, R., Wei, Y., Floudas, C. A. Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3D structure prediction method ASTRO-FOLD. Proteins. 78, 1825-1846 (2010).
  47. Subramani, A., Floudas, C. A. Structure prediction of loops with fixed and flexible stems. J. Phys. Chem. B. 116, 6670-6682 (2012).
  48. Güntert, P., Mumenthaler, C., Wüthrich, K. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273, 283-298 (1997).
  49. Güntert, P. Automated NMR structure calculation with CYANA. Methods Mol. Biol. 278, 353-378 (2004).
  50. Ponder, J. TINKER, software tools for molecular design. , (1998).
  51. Cornell, W. D., Cieplak, P. A 2nd generation forcefield for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179-5197 (1995).
  52. Lilien, R. H., Stevens, B. W., Anderson, A. C., Donald, B. R. A novel ensemble-based scoring and search algorithm for protein redesign and its application to modify the substrate specificity of the gramicidin synthetase a phenylalanine adenylation enzyme. J. Comput. Biol. 12, 740-761 (2005).
  53. Lee, M. R., Baker, D., Kollman, P. A. 2.1 and 1.8 Å Cα RMSD structure predictions on two small proteins, HP-36 and S15. J. Am. Chem. Soc. 123, 1040-1046 (2001).
  54. Rohl, C. A., Baker, D. De novo determination of protein backbone structure from residual dipolar couplings using rosetta. J. Am. Chem. Soc. 124, 2723-2729 (2002).
  55. Rohl, C. A., Strauss, C. E. M., Misura, K. M. S., Baker, D. Protein structure prediction using rosetta. Methods Enzymol. 383, 66-93 (2004).
  56. DiMaggio, P. A., McAllister, S. R., Floudas, C. A., Feng, X. J., Rabinowitz, J. D., Rabitz, H. A. Biclustering via optimal re-ordering of data matrices in systems biology: Rigorous methods and comparative studies. BMC Bioinformatics. 9, 458 (2008).
  57. DiMaggio, P. A., McAllister, S. R., Floudas, C. A., Feng, X. J., Rabinowitz, J. D., Rabitz, H. A. A network flow model for biclustering via optimal re-ordering of data matrices. J Global Optimization. 47, 343-354 (2010).
  58. Daily, M. D., Masica, D., Sivasubramanian, A., Somarouthu, S., Gray, J. J. CAPRI rounds 3-5 reveal promising successes and future challenges for RosettaDock. Proteins. 60, 181-186 (2005).
  59. Gray, J. J., Moughon, S., et al. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J. Mol. Biol. 331, 281-299 (2003).
  60. Gray, J. J., Moughon, S. E., et al. Protein-protein docking predictions for the CAPRI experiment. Proteins. 52, 118-122 (2003).
  61. Kuhlman, B., Baker, D. Native protein sequences are close to optimal for their structures. Proc. Natl Acad. Sci. U.S.A. 97, 10383-10388 (2000).

Cite this Article
Smadbeck, J., Peterson, M. B., Khoury, G. A., Taylor, M. S., Floudas, C. A. Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules. J. Vis. Exp. (77), e50476, doi:10.3791/50476 (2013).