In Silico Biology 6, 0053 (2006); ©2006, Bioinformation Systems e.V.  


Detailed comparison of the protein-ligand docking efficiencies of GOLD, a commercial package and ArgusLab, a licensable freeware


Saju Joy, Parvathy S. Nair, Ramkumar Hariharan and M. Radhakrishna Pillai*




Department of Molecular Medicine, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, Kerala, India



* Corresponding author
  Department of Molecular Medicine, Rajiv Gandhi Centre for Biotechnology Thycaud. P.O, Thiruvananthapuram- 695 014 Kerala State, India
  Phone: +91- (0) 471- 2347973,  Fax: +91-(0) 471- 2349303
  Email: mrpillai@vsnl.com





Edited by H. Michael; received July 20, 2006; revised and accepted October 16, 2006; published November 13, 2006



Abstract

Molecular docking and virtual screening based on molecular docking have become an integral part of many modern structure-based drug discovery efforts. Hence, it is useful to evaluate existing docking programs, since such evaluations can assist in the choice of the most suitable docking algorithm for any particular study. The objective of the current study was to evaluate the ability of ArgusLab 4.0, a relatively new molecular modeling package in which molecular docking is implemented, to reproduce crystallographic binding orientations and to compare its accuracy with that of a well-established commercial package, GOLD. The study also aimed to evaluate the effect of the nature of the binding site and ligand properties on docking accuracy. The three-dimensional structures of a carefully chosen set of 75 pharmaceutically relevant protein-ligand complexes were used for the comparative study. The study revealed that the commercial package outperforms the freely available docking engine in almost all the parameters tested. However, the study also revealed that, although lagging behind in accuracy, results from ArgusLab are biologically meaningful. This, taken together with the fact that ArgusLab has an easy-to-use graphical user interface, means that it can be employed as an effective teaching tool to demonstrate molecular docking to beginners in this area.

Keywords: molecular docking, ArgusLab, GOLD, docking accuracy, protein-ligand complex



Introduction

The last few years have witnessed the development and implementation of a range of 'molecular docking' algorithms, based on different search methods [1, 2]. The approach, which has been shown to nicely complement high-throughput screening, has had several recent successes in drug discovery [3, 4].

Molecular docking involves the prediction of ligand (small molecule, in this study) conformation and orientation, sometimes dubbed 'pose', within the active site of the molecular target.

In this context, it is useful to evaluate existing docking programs, since such evaluations can assist in the choice of the most suitable docking algorithm for any particular study. Indeed, several studies evaluating and comparing the accuracies of modern docking programs such as Glide, GOLD, FlexX, ICM and DOCK have been reported in the recent literature [5, 6].

The objective of the current study was to evaluate the ability of ArgusLab 4.0, a relatively new molecular modeling package in which molecular docking is implemented, to reproduce crystallographic binding orientations and to compare its accuracy with that of a well established commercial package, GOLD. The study also aimed to evaluate the effect of the nature of the binding site and ligand properties on docking accuracy.

ArgusLab 4.0, distributed freely for Windows platforms by Planaria Software, has fast become a favorite introductory molecular modeling package among academics, mainly because of its user-friendly interface and intuitive calculation menus [7, 8]. The ArgusDock docking engine, implemented in ArgusLab 4.0, approximates an exhaustive search method, with similarities to DOCK and Glide. Flexible ligand docking is possible with ArgusLab: the ligand is described as a torsion tree, and grids are constructed that overlay the binding site. The ligand's root node (a group of bonded atoms that do not have rotatable bonds) is placed on a search point in the binding site and a set of diverse and energetically favorable rotations is created. For each rotation, torsions are constructed in breadth-first order, and those poses that survive the torsion search are scored. The N lowest-energy poses are retained, and the final set of poses undergoes coarse minimization, re-clustering and ranking.
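The breadth-first ordering of torsions in a torsion tree can be illustrated with a minimal Python sketch. The tree layout, the fragment names and the identifiers `TorsionNode` and `breadth_first_torsions` are hypothetical and chosen only for illustration; this is not ArgusDock's actual implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TorsionNode:
    """A rigid fragment; each child hangs off one rotatable bond (hypothetical layout)."""
    name: str
    children: list = field(default_factory=list)


def breadth_first_torsions(root):
    """Return the (parent, child) rotatable bonds in breadth-first order from the root."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in node.children:
            order.append((node.name, child.name))
            queue.append(child)
    return order


# Toy ligand: a rigid ring (root node) with two substituents, one of which branches further.
ligand = TorsionNode("ring", [
    TorsionNode("amide", [TorsionNode("methyl")]),
    TorsionNode("hydroxyl"),
])
print(breadth_first_torsions(ligand))
```

In this toy tree, both torsions attached directly to the root are visited before the torsion one level further out, which is the sense in which the traversal is breadth-first.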

Genetic Optimisation for Ligand Docking (GOLD) is distributed by the Cambridge Crystallographic Data Centre. GOLD employs a genetic algorithm (GA) to explore the full range of conformational flexibility of the ligand and also the rotational flexibility of selected receptor hydrogens [9, 10]. The mechanism for ligand placement is based on fitting points. Fitting points are added to hydrogen-bonding groups on the protein and the ligand, and the program maps acceptor points in the ligand onto donor points in the protein and vice versa.

GOLD also maps ligand CH groups onto hydrophobic fitting points generated in the protein binding site cavity. It also optimizes flexible ligand dihedrals, ligand ring geometries, dihedrals of protein hydroxyl and amino groups and mappings of the latter. To rank the docked poses, GOLD employs a molecular mechanics-like scoring function that includes terms for hydrogen bonding, a 4-8 intermolecular van der Waals potential and a 6-12 intramolecular van der Waals potential (for the internal energy of the ligand).
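The difference between a 4-8 and a 6-12 van der Waals term can be illustrated with a generic n-m (Mie-type) potential. The sketch below is only an illustration under stated assumptions: the function name `mie_potential`, the well depth `epsilon` and the equilibrium distance `r0` are hypothetical, and GOLD's actual parameterization is not reproduced here.

```python
def mie_potential(r, r0=1.0, epsilon=1.0, n=12, m=6):
    """Generic n-m potential with its minimum of -epsilon at r = r0 (requires n > m > 0)."""
    if not n > m > 0:
        raise ValueError("exponents must satisfy n > m > 0")
    ratio = r0 / r
    return epsilon * (m / (n - m) * ratio ** n - n / (n - m) * ratio ** m)


# Compare the softer 8-4 form with the steeper 12-6 form at a few reduced distances.
for r in (0.8, 1.0, 1.2):
    soft = mie_potential(r, n=8, m=4)
    hard = mie_potential(r, n=12, m=6)
    print(f"r/r0 = {r:.1f}: 8-4 = {soft:+.3f}, 12-6 = {hard:+.3f}")
```

Both forms share the same minimum, but at distances shorter than `r0` the 8-4 form rises less steeply, so small steric clashes between protein and ligand are penalized less severely than they would be with a 12-6 term.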

In the present study, the ability of ArgusLab to reproduce the crystallographic binding orientations of the ligand was evaluated and the results were compared with those obtained with GOLD. The protein-ligand complexes used in the study were derived from a well-validated set of complexes, relevant to drug discovery, that was reported in the recent literature [5]. The accuracy of the ArgusLab docking algorithm was also assessed taking into account key features such as the nature of the binding site and the number of rotatable bonds of the ligand.



Methods

Perola et al. generated a test set of 150 pharmaceutically relevant protein-ligand complexes for the evaluation of docking and scoring tools [5]. The 75 protein-ligand complexes used in our study were derived from that master set after eliminating those complexes that contained water molecules in their binding sites that took part in ligand binding (see Supplementary Table). We employed this initial screening because ArgusLab failed completely to dock a few preliminary test cases that contained water molecules in their ligand-binding cavity. Swiss-PdbViewer (DeepView 3.7) was used to perform many of the following steps [11]. The entire computational work was carried out on a 2.8 GHz Intel Pentium 4 processor.

  1. Protein ('Receptor') input file preparation: For each of the protein-ligand complexes chosen for the study, a 'clean input file' was generated by removing water molecules, ions, ligands and subunits not involved in ligand binding from the original structure file. Hydrogen atoms were then added to the protein and the active site was inspected to make suitable corrections for tautomeric states of histidines, hydroxyl group orientations and protonation states of charged residues. Local minimization was then performed in the presence of restraints to relieve potential bad contacts while maintaining the protein conformation very close to that observed in the crystallographic model. The resulting receptor model was saved to a PDB file (compatible with both ArgusLab and GOLD input file formats).
  2. Ligand input file preparation: For each complex, the ligand input structure was generated with Corina (demonstration license from Molecular Networks GmbH) [12]. The ligand structure was saved to a PDB file.
  3. GOLD docking protocol: The binding pocket of each receptor was defined from the crystallographic coordinates of the ligand (residues within 3.5 Å of the ligand). Dockings were performed in 'Standard default settings' mode: 5 islands, a population size of 100, 100,000 operations, a niche size of 2 and a selection pressure of 1.1.
  4. ArgusLab docking protocol: The binding site was again defined from the coordinates of the ligand in the original PDB file. The ArgusDock exhaustive search docking engine was used, with a grid resolution of 0.40 Å. Docking precision was set to 'high precision' and 'flexible ligand docking' mode was employed for each docking run.
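
The binding-site definition used in step 3 (residues with at least one atom within 3.5 Å of the crystallographic ligand) can be sketched in a few lines of Python. The function name `binding_site_residues`, the input layout and the toy coordinates are hypothetical; neither GOLD nor ArgusLab is claimed to implement the selection in exactly this way.

```python
import math


def binding_site_residues(protein_atoms, ligand_coords, cutoff=3.5):
    """Return sorted residue ids with at least one atom within `cutoff` Å of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical layout).
    ligand_coords: iterable of (x, y, z) tuples for the crystallographic ligand.
    """
    ligand_coords = list(ligand_coords)
    selected = set()
    for residue_id, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            selected.add(residue_id)
    return sorted(selected)


# Toy coordinates in Å, invented for illustration only.
protein_atoms = [
    ("ASP25", (0.0, 0.0, 0.0)),
    ("ASP25", (1.5, 0.0, 0.0)),
    ("GLY27", (3.0, 0.0, 0.0)),
    ("ILE50", (10.0, 0.0, 0.0)),
]
ligand_coords = [(0.0, 3.0, 0.0)]
print(binding_site_residues(protein_atoms, ligand_coords))
```

A residue is selected as soon as any one of its atoms falls inside the cutoff, so a residue with some distant atoms is still included in the pocket.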



Results and discussion

Fig. 1A shows the percentages of top-ranking poses predicted by GOLD and ArgusLab that fall within defined root mean square deviation (RMSD) values of the original crystallographic pose. GOLD outperforms ArgusLab in predicting the ligand pose to within 2 Å of the crystallographic pose. Surprisingly, however, if the docking accuracy cutoff is relaxed to an RMSD of 3 Å between the predicted and original poses, there is very little difference in docking accuracy between ArgusLab and GOLD. This result is significant because this level of accuracy meets the requirements of some research applications and of educational use, for which ArgusLab is primarily designed.
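
The RMSD criterion can be made concrete with a short Python sketch. It assumes that the docked and crystallographic poses list the same heavy atoms in the same order and already share the receptor's coordinate frame (no superposition, no symmetry correction); the function names and all numbers are hypothetical and are not data from this study.

```python
import math


def rmsd(pose, reference):
    """Root mean square deviation between two equally ordered lists of (x, y, z) coordinates."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(pose, reference))
    return math.sqrt(squared / len(pose))


def success_rate(rmsd_values, cutoff):
    """Percentage of top-ranked poses whose RMSD is within `cutoff` Å."""
    return 100.0 * sum(r <= cutoff for r in rmsd_values) / len(rmsd_values)


# Toy example: a two-atom ligand shifted by 2 Å along y.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)]
print(rmsd(docked, crystal))

# Invented top-pose RMSDs (Å) for four complexes, evaluated at two cutoffs.
top_pose_rmsds = [0.8, 1.9, 2.5, 4.0]
print(success_rate(top_pose_rmsds, 2.0), success_rate(top_pose_rmsds, 3.0))
```

Relaxing the cutoff can only keep or raise the success rate, which is why a comparison of two programs should always state the cutoff used.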



Figure 1: Comparison of the protein-ligand docking efficiencies of GOLD (blue bars), a commercial package and ArgusLab (brown bars). (A) Best pose vs. crystallographic pose; (B) top ten poses vs. crystallographic pose; (C) ligand rotors vs. docking accuracy; (D) degree of hydrogen bonding vs. docking accuracy.


Fig. 1B shows the evaluation of the docking algorithms for their sampling accuracy. The top 10 poses predicted by GOLD and ArgusLab in each case were analyzed and compared to the crystallographic pose. The percentage of poses with RMSD within 2 Å of the experimental structure was 75% for GOLD and 34% for ArgusLab. This confirms the results reported by earlier studies: GOLD appears to be highly efficient in terms of sampling. However, under less stringent conditions, i.e., a docking accuracy cutoff of 3 Å, the performance of ArgusLab improves markedly, with 68% of the top ten poses falling within 3 Å of the crystallographic pose. This indicates that the thoroughness of sampling implemented in ArgusLab, although clearly not up to state-of-the-art standards, still gives modest but biologically meaningful results and is adequate for educational demonstrations and similar uses.
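
One common way to quantify sampling accuracy is to ask, for each complex, whether any of the top-ranked poses lies within the RMSD cutoff. The Python sketch below follows that reading, which is an assumption about how such a percentage may be computed rather than a statement of the exact procedure used here; the function name `sampling_success` and the RMSD values are hypothetical.

```python
def sampling_success(rmsds_per_complex, cutoff, top_n=10):
    """Percentage of complexes with at least one of the first `top_n` poses within `cutoff` Å.

    rmsds_per_complex: list of non-empty lists, each holding pose RMSDs in rank order.
    """
    hits = sum(1 for ranked in rmsds_per_complex if min(ranked[:top_n]) <= cutoff)
    return 100.0 * hits / len(rmsds_per_complex)


# Invented RMSDs (Å) of ranked poses for three complexes.
ranked_rmsds = [
    [3.1, 1.2, 5.0],   # a near-native pose exists but is not ranked first
    [2.6, 2.9],        # only moderately accurate poses were sampled
    [4.5, 6.0],        # no near-native pose was sampled
]
print(sampling_success(ranked_rmsds, 2.0), sampling_success(ranked_rmsds, 3.0))
```

Separating this measure from the top-pose success rate helps to distinguish failures of the search (no good pose generated) from failures of the scoring function (a good pose generated but ranked poorly).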

Fig. 1C shows the results of another kind of analysis, namely the effect of a ligand parameter on docking accuracy. It is well known that as the number of rotatable bonds of the ligand increases, docking accuracy falls, since a much larger conformational space has to be sampled. The conformational space sampling methods employed by GOLD and ArgusLab are entirely different. The complexes in the present study were divided into two groups: ligands with 1 to 6 rotors and those with 7 to 12 rotors. For both GOLD and ArgusLab, the docking accuracy falls by almost half as the number of rotatable bonds doubles. In both groups, the accuracy of GOLD is approximately double that of ArgusLab. However, an important consideration here is that docking times in ArgusLab are typically much shorter than those in GOLD, and therefore any future improvements to ArgusLab in this regard can afford to trade some docking speed for accuracy.
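
The grouping by rotor count can be sketched as follows. The two bins match the groups described in the text; the function name `accuracy_by_rotor_group`, the 2 Å default cutoff and the data are hypothetical and do not reproduce the results of this study.

```python
def accuracy_by_rotor_group(results, cutoff=2.0):
    """Percentage of poses within `cutoff` Å for ligands with 1-6 and with 7-12 rotors.

    results: iterable of (number_of_rotatable_bonds, rmsd) pairs, one per complex.
    Ligands outside both ranges are ignored; an empty group yields None.
    """
    groups = {"1-6 rotors": [], "7-12 rotors": []}
    for n_rotors, rmsd_value in results:
        if 1 <= n_rotors <= 6:
            groups["1-6 rotors"].append(rmsd_value)
        elif 7 <= n_rotors <= 12:
            groups["7-12 rotors"].append(rmsd_value)
    return {
        label: (100.0 * sum(r <= cutoff for r in values) / len(values)) if values else None
        for label, values in groups.items()
    }


# Invented (rotor count, top-pose RMSD in Å) pairs for eight complexes.
toy_results = [(2, 0.9), (5, 1.4), (6, 2.8), (3, 1.1),
               (8, 1.7), (10, 3.9), (12, 4.2), (11, 2.5)]
print(accuracy_by_rotor_group(toy_results))
```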

The effect of the degree of hydrogen bonding (DHB) on docking accuracy was evaluated and the results are shown in Fig. 1D. Complexes having a DHB of 0.15 or higher are classified as hydrogen-bond driven, whereas complexes with a DHB below 0.10 are classified as hydrophobic-burial driven. In complexes with DHB values between these two cut-offs, the protein-ligand interaction is stabilized by a combination of these two major forces. Unlike GOLD, which performs best when the DHB value is between 0.10 and 0.15, ArgusLab shows no noticeable change in docking accuracy with the degree of hydrogen bonding at the active site. The motivation for this analysis is that hydrogen bonding and hydrophobic burial are the two major forces that stabilize protein-ligand interactions. Protein-ligand complexes can, in the majority of cases, be divided into those driven by hydrogen bonds and those driven by hydrophobic burial. As can be seen from the docking results, for GOLD the docking accuracy falls drastically as the active site becomes progressively more hydrophobic. ArgusLab, however, appears to be insensitive to this effect.
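
The three-way classification by DHB described above can be written as a small Python function. The thresholds are those given in the text; the function name `classify_by_dhb`, the class labels and the example values are hypothetical, and the DHB value itself is taken as a precomputed input.

```python
def classify_by_dhb(dhb):
    """Classify a complex by its degree of hydrogen bonding (DHB)."""
    if dhb >= 0.15:
        return "hydrogen-bond driven"
    if dhb < 0.10:
        return "hydrophobic-burial driven"
    return "mixed"  # 0.10 <= DHB < 0.15: both forces contribute


# Invented DHB values for illustration.
for value in (0.05, 0.10, 0.12, 0.15, 0.30):
    print(value, classify_by_dhb(value))
```

Note that the boundary values are handled asymmetrically, as in the text: a DHB of exactly 0.15 counts as hydrogen-bond driven, while a DHB of exactly 0.10 falls into the mixed class.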

In conclusion, this study evaluated and compared the docking accuracies of a commercial docking package, GOLD, and a freely available, more general-purpose package, ArgusLab. The study revealed that the commercial package outperforms the freely available docking engine in almost all the parameters tested. However, the study also revealed that, although lagging behind in accuracy, results from ArgusLab are biologically meaningful. This, taken together with the fact that ArgusLab offers an intuitive, easy-to-use graphical user interface, suggests that it can be employed as an effective teaching tool to demonstrate molecular docking to beginners in this area. Further research and development of the various docking and scoring modules of ArgusLab should make this software more popular in educational circles.



Acknowledgement

Ramkumar Hariharan is supported by a research fellowship from the Council of Scientific and Industrial Research (CSIR), Government of India [F. No. 9/553 (14)/2003].




References


  1. Taylor, R. D., Jewsbury, P. J. and Essex, J. W. (2002). A review of protein-small molecule docking methods. J. Comput. Aided Mol. Des. 16, 151-166.

  2. Halperin, I., Ma, B., Wolfson, H. and Nussinov, R. (2002). Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47, 409-443.

  3. Sechi, M., Sannia, L., Carta, F., Palomba, M., Dallocchio, R., Dessi, A., Derudas, M., Zawahir, Z. and Neamati, N. (2005). Design of novel bioisosteres of beta-diketo acid inhibitors of HIV-1 integrase. Antivir. Chem. Chemother. 16, 41-46.

  4. Liu, Z., Huang, C., Fan, K., Wei, P., Chen, H., Liu, S., Pei, J., Shi, L., Li, B., Yang, K., Liu, Y. and Lai, L. (2005). Virtual screening of novel noncovalent inhibitors for SARS-CoV 3C-like proteinase. J. Chem. Inf. Model. 45, 10-17.

  5. Perola, E., Walters, W. P. and Charifson, P. S. (2004). A detailed comparison of current docking and scoring methods on systems of pharmaceutical relevance. Proteins 56, 235-249.

  6. Bursulaya, B. D., Totrov, M., Abagyan, R. and Brooks, C. L. 3rd (2003). Comparative study of several algorithms for flexible ligand docking. J. Comput. Aided Mol. Des. 17, 755-763.

  7. ArgusLab 4.0, Mark A. Thompson, Planaria Software LLC, Seattle, http://www.ArgusLab.com.

  8. Thompson, M. A. (2004). Molecular docking using ArgusLab, an efficient shape-based search algorithm and the AScore scoring function. ACS meeting, Philadelphia, 172, CINF 42, PA.

  9. Jones, G., Willett, P., Glen, R. C., Leach, A. R. and Taylor, R. (1997). Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 267, 727-748.

  10. Verdonk, M. L., Cole, J. C., Hartshorn, M. J., Murray, C. W. and Taylor, R. D. (2003). Improved protein-ligand docking using GOLD. Proteins 52, 609-623.

  11. Guex, N. and Peitsch, M. C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18, 2714-2723.

  12. Gasteiger, J., Rudolph, C. and Sadowski, J. (1990). Automatic generation of 3D-atomic coordinates for organic molecules. Tetrahedron Comp. Method. 3, 537-547.