JOURNAL OF MEDICINAL CHEMISTRY
© Copyright 1994 by the American Chemical Society
Volume 37, Number 8, April 15, 1994

Perspective

Application of the Three-Dimensional Structures of Protein Target Molecules in Structure-Based Drug Design#

Jonathan Greer,*,† John W. Erickson,‡ John J. Baldwin,§ and Michael D. Varney∥

Department of Structural Biology, Pharmaceutical Products Division, Abbott Laboratories, Abbott Park, Illinois 60064, Structural Biochemistry Program, PRI/DynCorp, NCI-Frederick Cancer Research and Development Center, Frederick, Maryland 21702, Department of Medicinal Chemistry, Merck Research Laboratories, West Point, Pennsylvania 19486, and Department of Medicinal Chemistry, Agouron Pharmaceuticals, San Diego, California 92121

Received January 26, 1994

While the chemist of today faces many exciting and stimulating challenges, perhaps the most demanding, promising, and rewarding one is the rational design of novel therapeutic agents for the treatment of human diseases. For many years, the discovery of new drugs has been achieved by taking a lead structure and iterating cycles of new compound synthesis with biological testing of those compounds to derive a structure-activity relationship related to some measure of therapeutic efficacy (Figure 1, black cycle).1 Initial lead structures either were the natural ligand or were discovered in a random screening program of compounds or fermentation beers using in vitro or even in vivo tests.2 Indeed, projects were frequently chosen on the basis of the results of random screening. Sometimes the leads came from literature compounds. Analogs were selected for synthesis during the iteration cycle on the basis of a combination of inspired trial and error and medicinal chemistry experience and intuition.
# This article is based upon an American Chemical Society satellite television presentation entitled "Macromolecular modeling in the discovery of new drugs", broadcast March 6, 1993.
† Department of Structural Biology, Pharmaceutical Products Division, Abbott Laboratories, Abbott Park, IL 60064.
‡ Structural Biochemistry Program, PRI/DynCorp, NCI-Frederick Cancer Research and Development Center, Frederick, MD 21702.
§ Department of Medicinal Chemistry, Merck Research Laboratories, West Point, PA 19486.
∥ Department of Medicinal Chemistry, Agouron Pharmaceuticals, San Diego, CA 92121.
0022-2623/94/1837-1035$04.50/0

In recent years, rational drug design has emerged more widely in the pharmaceutical industry.3 This approach requires selecting a protein target molecule which plays a critical role in a physiologically relevant biological pathway. The chemist typically begins with the natural ligand as the lead and modifies it to produce a compound with the desired properties. The natural ligand or substrate of this protein is manipulated to produce an enzyme inhibitor or an agonist or antagonist for a receptor, depending upon the identified therapeutic need, capitalizing on what is known about the mechanism of action of the protein-ligand complex. In order to allow the chemist to more fruitfully design modifications of the lead structure, it is helpful to have a three-dimensional structure for the bioactive conformation of the ligand as it binds to the receptor or enzyme. Experience has taught that this conformation is not the solution structure, nor is it the crystal structure of the ligand. Rather, it is the conformation of the ligand when it is bound to its receptor or enzyme active site. Knowledge of the bioactive conformation should better permit the chemist to modify analogs constructively to produce novel structures that are potent and specific. A number of methods have been developed to help in the selection and design of better analogs.
Quantitative structure-activity relationships (QSAR),4 pharmacophore or receptor mapping,5,6 and more recent 3D QSAR methods, such as CoMFA,7 have emerged to aid in the discovery of the bioactive conformation and advance the analog design process (Figure 1, blue cycle). Beyond knowledge of the bioactive conformation of the ligand, it would be valuable to understand the detailed interactions of the ligand with its receptor protein by examining the three-dimensional structure of the protein target in complex with the ligand. This would allow the
chemist to preserve the critical interactions with the protein, while modifying the ligand or substrate analog to interact more precisely with the receptor or enzyme and to occupy subsidiary sites, resulting in better potency and specificity. With the structure of the target protein-ligand complex, one can better understand the structure-activity relationships of existing compounds, suggest new analogs to synthesize in current series, and develop novel concepts and ideas for completely new ligand moieties. This methodology is now known as structure-based drug design.

Obtaining the experimental three-dimensional structure of the protein target frequently requires considerable effort to clone the appropriate DNA and to express, purify, and characterize multi-milligram amounts of the protein, followed by crystallization and structure solution.8 For NMR studies, 15N and 13C isotope labeling is frequently crucial to the structure determination.9 Not only is the three-dimensional structure of the protein target desired, but especially the structures of its complexes with the lead ligand compounds that are of interest to the chemist.

Sometimes, when sufficient quantities of the protein are not readily available, the structure of the protein can be derived by homology modeling (Figure 1, green cycle).10,11 The binding of the ligand to the protein can also be examined in this way.12 The large protein structure database13 and the exploding protein sequence databases14-16 are making this approach ever more applicable. While such models are not accurate in the details of their structure, they can give the chemist a good, rapid, albeit approximate, view of the active site and its interactions with the ligand, helping to generate new ideas and designs early in a drug-design project.
Both the ligand-based and protein model-based design strategies can be iterated with chemical synthesis of the designed compounds and biological testing followed by further rounds of ligand or protein model analysis, synthesis, and biological testing (Figure 1, blue and green cycles). However, this process is driven primarily by the experimental data provided by the biological testing. From the structural perspective, one is always extrapolating from the original structural information, and the biological data do not easily permit correcting errors in the structural models.

The existence of an experimental structure for the protein-ligand complex, besides being more accurate, allows one to go beyond examining the details of the binding site and using this information to design new analogs. It allows one to take the designed, synthesized, and assayed compound, return it to the crystal or NMR tube, and redetermine the structure of the complex with this new compound (Figure 1, red cycle). Thus, one can determine experimentally whether the design concept was structurally correct. If the molecule was potent, was it potent for the correct reasons built into the design? If the compound was weaker than expected, how and why did the design concept fail? Best of all, the new structure can be used as the basis for another round of analysis, design, synthesis, and compound testing. This process can be iterated with further rounds of design, synthesis, testing, and so on to ultimately produce potent and specific compounds. This version of the design cycle has proven to be the most powerful implementation of structure-based drug design and is the subject of this review. Close collaboration between the medicinal chemist, molecular modeler/theoretical chemist, pharmacologist, molecular biologist, biochemist, and structural biologist is essential to the optimal utilization of this design cycle.
Three examples are presented here to illustrate how the above ideas are being successfully reduced to practice in a number of important cases, leading to the design and synthesis of novel compounds that are more potent and specific than would otherwise have been achieved. The examples illustrate the value of structural information at different stages of the drug-design process. Structural principles and ideas led to the rapid design of a novel class of very potent and highly specific HIV protease inhibitors for the treatment of AIDS. Several iterations of the structure-based drug-design cycle were utilized to optimize the potency of carbonic anhydrase inhibitors for the treatment of glaucoma. The final example describes the de novo design, followed by iterative cycles of optimization, of a dramatically new series of thymidylate synthase inhibitors which can serve as anticancer agents. The structure-based drug-design efforts in all three cases have led to compounds which are currently in clinical trials.

Design and Structure of Symmetry-Based Inhibitors of HIV-1 Protease

Background. One of the most active areas of drug discovery research in the world today concerns efforts to stem the tide of the AIDS pandemic. The world-wide search for safe and effective therapies for AIDS has prompted an intensive research effort on the structure and biology of the human immunodeficiency virus (HIV-1), the causative agent of this disease.
This research has led to the elucidation of a myriad of specific viral targets for drug discovery and design, with the result that strategies now exist for targeting virtually every aspect of the viral life cycle.17 HIV-1 protease (HIV PR) is a virally-encoded enzyme that cleaves, or processes, viral gag and gag-pol protein precursors during virus assembly and maturation.18 In 1988, it was observed that deletion mutagenesis of the HIV PR gene resulted in the production of noninfectious, immature virus particles.19 This experiment demonstrated that HIV PR performs an essential function in the life cycle of HIV and thus makes this enzyme an important target for the design of specific antiviral agents for AIDS.

Structure-based drug design19 was used to design potent and specific, C2 symmetry-based inhibitors of HIV PR and to optimize the pharmacologic properties of these compounds. This work ultimately led to the first structure-based clinical compound for this important antiviral target. Additional details of this work can be found in several recent reviews. The crystallography of HIV PR is an extremely fast-moving field; there are well over 150 published and unpublished crystal structures of various inhibitor complexes in existence today. Many of these structures, as well as complexes with substrate-based and asymmetric peptidomimetic inhibitors, have also been reviewed.24,25

Structure-Function Considerations for Inhibitor Design. The initial observations that retroviral proteases contain the amino acid triplet Asp-Thr(Ser)-Gly and that their proteolytic activity could be inhibited by pepstatin led to the early proposal that these enzymes were related mechanistically to the aspartic protease family of enzymes.26,27 The latter are bilobal, single-chain enzymes in which each lobe, or domain, contributes an aspartic acid residue to the active site.28 The active site itself is formed
at the interface of the N- and C-domains and exhibits approximate 2-fold symmetry at the protein backbone level.

Figure 1. Structure-based drug design cycle. The basic, traditional drug design cycle (labeled “Basic Cycle” in black) begins with a biological assay that tests for the therapeutic use of the compounds. This test is crucial in that it provides the experimental information (labeled in brown) that fuels the cycle. Lead compounds are either the natural ligand or the result of a random screening program or from the literature (highlighted in gray). Analogs are synthesized and tested. On the basis of the testing results, the chemist, using traditional medicinal chemistry, decides what compounds to prepare next, and these compounds are then tested for activity. This basic cycle continues until a satisfactory compound is produced which can be taken for preclinical studies. In structure-based drug design, a number of steps are added progressively as more information becomes available (shown as blue, green, and red cycles). When no information is available about the structure of the receptor, ligand-based strategies, including QSAR, pharmacophore or receptor mapping, and the more recent 3D QSAR methods, can be applied (blue cycle).4-7 On the basis of the results of these methods, structure analysis and compound design are performed which lead to rationalization of existing SAR, new analog prediction, and de novo lead design. This process is more effective when a model three-dimensional structure can be produced for the receptor protein-ligand complex from known 3D structures and sequences of related proteins using homology modeling methods (green cycle).10,11 When the experimental crystal or NMR structure of the receptor protein-ligand complex can be determined, then the most powerful structural information can be brought to bear (red cycle). (The figure shows the steps necessary to obtain sufficient protein to perform the experimental structural studies, which frequently include cloning and expression of the cDNA or gene, and protein purification and characterization.) The three-dimensional structure of the protein-ligand complex is examined in detail to understand the interactions between ligand and protein. On the basis of an analysis of the structure using a variety of increasingly sophisticated computer-assisted drug design strategies and all medicinal chemical knowledge, and with close collaboration of the molecular modeler/theoretical chemist with the medicinal chemist, a new series of analogs is designed. These compounds are synthesized and then tested biologically. The process is iterated until a satisfactory compound is produced for preclinical testing. When an experimental structure determination is available, the iteration process is most effective. The resulting new analogs are cocrystallized with the protein or reexamined by NMR to redetermine the experimental structure of the complex of the new analogs with the protein to see whether the design concepts were correct or not. This leads to a new round of compound design, synthesis, and testing. This process is iterated, with cycles of structure determination, analysis, design, synthesis, and biological testing, until a compound with the desired properties is produced. This iterative process is the most powerful since two sets of experimental data fuel the discovery process: the biological testing and the experimental structure determination (both labeled in brown).
Since the sequence length of retroviral proteases is typically about one-third that of aspartic proteases, it was proposed that the former enzymes are composed of two identical subunits, each of which contributes a single aspartic acid to the active site.29 Crystal structure studies of Rous sarcoma virus (RSV) protease30 and later of HIV PR by several laboratories31-34 confirmed that the viral enzymes are homodimers. In the case of the tetragonal crystal form of apo-HIV PR, the dimer exhibits exact crystallographic C2 symmetry. The active site is formed by the dimer interface and is composed of equivalent contributions of residues from each subunit. The substrate binding cleft is bounded on one side by the active-site aspartic acids, Asp25 and Asp125, and on the other by a pair of
2-fold related, antiparallel β-hairpin structures, or “flaps” (Figure 2). In the crystal structure of RSV protease, the flaps are disordered. In apo-HIV PR, crystal packing forces maintain the flap in a conformation that is presumably unsuitable for substrate binding.35

Figure 2. Ribbon drawing of the backbone of HIV-1 protease based on the crystal structure of the native enzyme.32 The active-site aspartic acid side chains are drawn in stick fashion; the flaps are the two hairpin structures at the top of the molecule. The 2-fold axis of the enzyme is vertical. Adapted from ref 24.

A detailed structural comparison of the retroviral and cellular aspartic proteases revealed that, in contrast to their limited sequence homology, they display considerable structural homology at the backbone level.36 Fully one-third of the main-chain atoms of RSV PR can be superposed onto the backbone of porcine pepsin to within a 1.5-Å root-mean-square deviation (Figure 3). As expected, most of the structural correspondence is in the active-site region. However, the overall chain topologies of the two families of enzymes are more similar than a simple superposition analysis reveals and are indicative of a distant but definite relationship to a common, ancestral aspartic protease gene.36

The close structural and functional relationships between the retroviral and cellular aspartic proteases, together with knowledge of the HIV PR cleavage site sequences, immediately opened the avenue of substrate-based approaches that had been developed for designing inhibitors of renin, an aspartic protease that has long been an important target for the design of antihypertensive agents.37 Substrate-based inhibitors are essentially peptide substrate analogues in which the scissile peptide bond has been replaced by a noncleavable, transition-state analogue or isostere.
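The 1.5-Å root-mean-square deviation quoted above for the RSV PR/pepsin superposition is a simple average over matched atoms. The following minimal Python sketch shows only that final calculation. It assumes the two coordinate sets have already been superposed and listed in matching order; the function name `rmsd` and the toy coordinates are illustrative and are not taken from the cited work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the two sets are already superposed and listed in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between one matched pair of atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates in angstroms, not taken from any crystal structure.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.5), (1.0, 0.0, 1.5)]
print(rmsd(toy_a, toy_b))
```

In practice the optimal superposition itself has to be found first (for example by a least-squares fit); that step is deliberately left out of this sketch.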
This approach has been used to design numerous, highly potent HIV PR inhibitors.38-41

Symmetry-Based Inhibitor Design. Despite the enormous collective synthetic effort that has been applied to the design of renin inhibitors and, more recently, HIV PR inhibitors, the usefulness of peptidomimetics as drug candidates has been hampered by their generally poor pharmacologic properties, such as oral bioavailability, metabolic stability, and pharmacokinetics.42 The design of symmetry-based inhibitors for HIV PR represents a significant departure from traditional substrate-based approaches, and one in which knowledge of aspartic protease structure and function could be exploited to conceptualize novel structural classes of inhibitors which, it was hoped, might be more easily developed into potential drug candidates for AIDS.43,44 Symmetry-based inhibitors had never been designed a priori for any enzymatic target, although the concept of symmetry-based inhibitors had been discussed for renin dipeptidase45 and prostaglandin receptors.46

Formulation of the design principles for symmetry-based inhibitors actually began in the absence of knowledge of the structures of either RSV or HIV PR. The hypothesis that HIV PR was a dimeric enzyme, composed of two chemically-identical subunits, led to the postulate that the left and right halves of the active site of this enzyme would be structurally identical, or nearly so. This need not necessarily have been the case; there are numerous examples of symmetric, multisubunit enzymes, but few examples of enzyme active sites that are composed of equivalent, symmetry-related subunits.
Similar reasoning led to the construction of a three-dimensional model for HIV PR which embodied exact C2 symmetry. If HIV PR incorporates symmetry into its active-site structure, it was reasoned that compounds that mimic this symmetry might be novel, potent, and specific inhibitors and, furthermore, might be sufficiently non-peptidic in character so as to be pharmacologically superior to the classical peptide-based compounds. The design strategy had two requirements: first, that the inhibitor possess the same C2 symmetry as the enzyme, and second, that the symmetry elements of the inhibitor and enzyme approximately superimpose when the inhibitor is bound in the active site.

Initially, a C2 symmetry-based diamino alcohol was designed in which a pseudo-C2 axis passes through the alcohol carbon atom and bisects the O-C-H angle (Figure 4A). Each side of the diamino alcohol resembles a phenylalanine moiety, which is a common P1 substituent for HIV PR substrates. This compound obviously satisfied the first constraint. In order to determine whether the second requirement would be met by this design, a modeling experiment was performed using the crystal structure of a reduced peptide inhibitor complexed with rhizopuspepsin, a fungal aspartic protease, and the crystal structure of RSV PR, which had recently been determined and, importantly, made available.30 The structurally homologous active-site regions of RSV PR and rhizopuspepsin were superimposed in order to “dock” the rhizopuspepsin-bound inhibitor into the active site of RSV PR (Figure 5A). Examination revealed that there were no close contacts. Next, the C-terminal portion of the inhibitor was deleted beyond the reduced CH2 group, and the N-terminal half was rotated by the enzyme 2-fold axis to produce a pseudo-C2 symmetric inhibitor (Figure 5B).
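The 2-fold rotation used in this modeling step can be sketched in a few lines of Python. This is a geometric illustration only, not the software used in the original work: the function names `rotate_c2` and `make_pseudo_c2`, the axis definition, and the toy coordinates are all assumptions introduced here. A 180° rotation about a line through point a with unit direction u maps a point x to a + 2(u·(x − a))u − (x − a).

```python
import math


def rotate_c2(point, axis_point, axis_dir):
    """Rotate a 3D point by 180 degrees about the line through axis_point along axis_dir."""
    norm = math.sqrt(sum(c * c for c in axis_dir))
    if norm == 0.0:
        raise ValueError("axis direction must be non-zero")
    u = [c / norm for c in axis_dir]                    # unit vector along the 2-fold axis
    d = [p - a for p, a in zip(point, axis_point)]      # point relative to the axis origin
    proj = sum(di * ui for di, ui in zip(d, u))         # component of d along the axis
    # The parallel component is kept; the perpendicular component is inverted.
    return tuple(a + 2.0 * proj * ui - di for a, ui, di in zip(axis_point, u, d))


def make_pseudo_c2(half_coords, axis_point, axis_dir):
    """Return the original half plus its 2-fold rotated copy (hypothetical helper)."""
    rotated = [rotate_c2(p, axis_point, axis_dir) for p in half_coords]
    return list(half_coords) + rotated


# Toy "N-terminal half" with made-up coordinates; the axis is taken as the z axis.
half = [(1.0, 2.0, 3.0), (0.5, -1.0, 0.0)]
print(make_pseudo_c2(half, axis_point=(0.0, 0.0, 0.0), axis_dir=(0.0, 0.0, 1.0)))
```

Applying `rotate_c2` twice returns the starting point, which is a quick sanity check that the operation is a true 2-fold rotation.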
The deviations from ideal geometry for the computer-generated inhibitor were small enough to suggest that the corresponding diamino alcohol might bind favorably in the orientation as modeled in this experiment. The decision to design symmetric inhibitors with N-terminal properties was based on the experience with renin inhibitors, which retain considerable activity after C-terminal truncation.40

Symmetry-Based Diamino Alcohols. The prototype compound 1 (Table 1) was synthesized on the basis of the fact that aromatic amino acid side chains are prevalent in the P1 position of naturally-occurring substrates for HIV PR. This molecule, which closely resembles the central “core” structure of the computer-modeled inhibitor, is pseudosymmetric owing to the secondary OH group on the central carbon atom. Compound 1 was a weak inhibitor of HIV PR (IC50 value >200 μM) and did not exhibit significant anti-HIV activity in vitro.43 Examination of
the substrate binding site region of the modeled structure and reference to the structures of other aspartic protease binding sites suggested that the binding site region of HIV PR should encompass an inhibitor equivalent to at least a hexapeptide in length. These considerations led to the extension of 1 by the symmetric addition of NH2-blocked amino acids.

Figure 3. Structural homology of Cα backbones of porcine pepsin (left) and RSV protease (right). Structurally equivalent segments are in white. Adapted from ref 36.

Figure 4. Design of C2 symmetric inhibitors of HIV-1 protease: (A) placement of the C2 axis through the carbon atom produces the diamino alcohol; (B) placement of the C2 axis through the midpoint of the C-N peptide bond produces the diamino diol. Adapted from ref 44.

The inhibitory potency for a series of diamino alcohols was measured using a fluorogenic assay and ranged from >10000 nM for the core structure, 1, to 3 nM for the bifunctionalized Cbz-Val compound, 7 (Table 1).44 Both the acetylated and unprotected core compounds, 2 and 1, respectively, which contain a benzyl moiety in the P1 position, were ineffective inhibitors. Protection with the bulkier Boc group, 3, resulted in some improvement in potency and suggested a requirement for P2 substituents. Replacement of Boc by Val, 4, gave a 5-fold enhancement, and protection of the free amino groups of 4 by acetylation, 5, resulted in a further 50-fold enhancement. The symmetric addition of acetyl-Val in the P3 position, 6, did not yield any improvement over 5, but substitution of the acetyl group in 5 by Cbz, 7, resulted in a further lowering of the IC50 value. The most potent HIV PR inhibitor of this series, A-74704 (compound 7), exhibited measurable anti-HIV activity in vitro with an IC50 value ≤1 μM.
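As a rough guide to what the fold enhancements quoted above mean energetically, a potency ratio can be converted into an apparent change in binding free energy via ΔΔG = −RT ln(fold). The short Python sketch below does this for the 5-fold and 50-fold steps. It rests on assumptions that are not stated in the text: that IC50 ratios measured in the same assay approximate ratios of inhibition constants, and that the temperature is 298.15 K. The function name `fold_to_ddg` is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_to_ddg(fold, temperature_k=298.15):
    """Apparent free-energy change (kcal/mol) for a fold improvement in potency.

    Negative values mean tighter apparent binding. Assumes the potency ratio
    approximates the ratio of inhibition constants (an assumption, see lead-in).
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return -R_KCAL * temperature_k * math.log(fold)


# Fold enhancements quoted in the text: Boc -> Val (5-fold), then acetylation (50-fold).
for label, fold in [("3 -> 4", 5.0), ("4 -> 5", 50.0), ("3 -> 5 combined", 5.0 * 50.0)]:
    print(f"{label}: {fold:g}-fold, apparent ddG = {fold_to_ddg(fold):.2f} kcal/mol")
```

Because the logarithm turns products into sums, successive fold enhancements add on the free-energy scale, which is why the combined step equals the sum of the two individual steps.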
Compound 7 also demonstrated good specificity for HIV PR over human renin (>10 000:1) and low cellular toxicity (EC50:TC50 = 500:1) and was resistant to proteolytic degradation in a renal cortex homogenate at 37 °C (t1/2 >> 3 h).43 Thus, the idea that symmetry-based inhibitors could be specific, potent, and exhibit non-peptide character was partially realized in 7.

Crystal Structure of Compound 7/HIV PR Complex. To verify that the diamino alcohol inhibitors bound in the predicted symmetric fashion, 7 was cocrystallized with recombinant HIV PR, and the 2.8-Å crystal structure of the complex was solved in the hexagonal space group P61.43 The inhibitor formed a symmetric pattern of hydrogen-bonding interactions with the enzyme (Figure 6) and included a buried water molecule, Wat301, that