Automatic selection of near-native protein-ligand conformations using a hierarchical clustering and volunteer computing

Authors:
Trilce Estrada;Roger Armen;Michela Taufer
Affiliations:
University of Delaware, Newark, DE;University of Michigan Ann Arbor, Ann Arbor, MI;University of Delaware, Newark, DE
Venue:
Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
Year:
2010

Abstract

Docking simulations are commonly used to understand drug binding and require searching a large space of protein-ligand conformations. Cloud and volunteer computing enable computationally expensive docking simulations at an unprecedented rate, but at the same time require scientists to deal with larger datasets. When analysing these datasets, a common practice is to reduce the number of candidates to between 10 and 100 conformations based on energy values and then leave the scientists with the tedious task of subjectively selecting a possible near-native ligand. Scientists normally perform this task manually using visual tools. Not only does the manual process still depend on inaccurate energy scoring, but it can also be highly error-prone. The contributions of this paper are twofold. First, we address the problem of extensively searching large spaces of protein-ligand docking conformations, supported by the volunteer computing project Docking@Home (D@H). Second, we address the problem of accurately and automatically selecting near-native ligand conformations from the large number of D@H results by using a probabilistic hierarchical clustering based on ligand geometry. Our method holds up even when we test a search that is not biased by starting from near-native ligand conformations, and it clearly outperforms energy-based scoring methods.
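
The contrast between geometry-based and energy-based selection can be illustrated with a small sketch. This is not the paper's probabilistic hierarchical clustering: it assumes a plain average-linkage agglomerative clustering on pairwise RMSD, a hypothetical 2.0 cutoff, and synthetic toy poses and energies invented purely for illustration. The helper names (`rmsd`, `cluster_poses`, `select_pose`, `perturb`) are likewise hypothetical.

```python
import math
import random


def rmsd(a, b):
    """RMSD between two poses given as equal-length lists of (x, y, z) atoms."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def cluster_poses(poses, cutoff):
    """Average-linkage agglomerative clustering on pairwise RMSD.

    Merging stops once the two closest clusters are farther apart than
    cutoff. Returns a list of clusters, each a list of pose indices.
    """
    dist = [[rmsd(p, q) for q in poses] for p in poses]
    clusters = [[i] for i in range(len(poses))]

    def linkage(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cutoff:
            break
        clusters[i] += clusters[j]  # j > i, so deleting j keeps i valid
        del clusters[j]
    return clusters


def select_pose(poses, cutoff=2.0):
    """Return the index of the medoid of the most populated cluster."""
    largest = max(cluster_poses(poses, cutoff), key=len)
    return min(largest, key=lambda i: sum(rmsd(poses[i], poses[j]) for j in largest))


# --- Synthetic toy data (assumption, not taken from Docking@Home) ---
rng = random.Random(0)
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0),
          (0.0, 1.5, 0.0), (0.0, 0.0, 1.5)]


def perturb(pose, shift, noise=0.3):
    """Translate a pose by shift and add Gaussian jitter to every coordinate."""
    return [tuple(c + s + rng.gauss(0.0, noise) for c, s in zip(atom, shift))
            for atom in pose]


# Poses 0-11 sit near the reference; poses 12-17 are scattered decoys.
poses = [perturb(native, (0.0, 0.0, 0.0)) for _ in range(12)]
poses += [perturb(native, (8.0 * k, -5.0 * k, 3.0 * k)) for k in range(1, 7)]
# Made-up energies: one decoy (index 12) is given the best score on purpose.
energies = [-7.0 + 0.1 * i for i in range(12)] + [-9.5, -6.0, -5.5, -5.0, -4.5, -4.0]

chosen = select_pose(poses, cutoff=2.0)
lowest_energy = energies.index(min(energies))
print("geometry-based choice:", chosen,
      "RMSD to reference:", round(rmsd(poses[chosen], native), 2))
print("energy-based choice:", lowest_energy,
      "RMSD to reference:", round(rmsd(poses[lowest_energy], native), 2))
```

In this toy setup the densest geometric cluster lies near the reference pose, while the lowest-energy pose is a decoy. The sketch mirrors the motivation given in the abstract, but it says nothing about how the actual method performs on real docking results.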