Proceedings of the 2011 companion on High Performance Computing Networking, Storage and Analysis Companion
Computers in Biology and Medicine
Docking simulations are commonly used to understand drug binding and require searching a large space of protein-ligand conformations. Cloud and volunteer computing enable computationally expensive docking simulations at unprecedented rates, but they also require scientists to deal with larger datasets. When analysing these datasets, a common practice is to reduce the number of candidates to between 10 and 100 conformations based on energy values and then leave scientists with the tedious task of subjectively selecting a possible near-native ligand. Scientists normally perform this task manually using visual tools. Not only does the manual process still depend on inaccurate energy scoring, but it can also be highly error-prone. The contributions of this paper are twofold. First, we address the problem of extensively searching large spaces of protein-ligand docking conformations, supported by the volunteer computing project Docking@Home (D@H). Second, we address the problem of accurately and automatically selecting near-native ligand conformations from the large number of D@H results by using a probabilistic hierarchical clustering based on ligand geometry. Our method holds up even when the search is not biased by starting from near-native ligand conformations, and it clearly outperforms energy-based scoring methods.
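To make the idea of geometry-based selection concrete, the following is a minimal sketch only, not the probabilistic hierarchical clustering used in the paper. It swaps in plain average-linkage agglomerative clustering on pairwise RMSD and assumes that the most populated cluster marks the candidate pose. The function names (`rmsd`, `cluster_by_geometry`, `select_candidate`), the `cutoff` parameter, and the toy coordinates are all hypothetical, and every conformation is assumed to list the same atoms in the same order within a shared reference frame.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two conformations given as lists of (x, y, z) atoms.

    Assumes identical atom ordering and a shared reference frame
    (no superposition is performed).
    """
    sq = sum((p - q) ** 2
             for atom_a, atom_b in zip(a, b)
             for p, q in zip(atom_a, atom_b))
    return math.sqrt(sq / len(a))


def cluster_by_geometry(confs, cutoff):
    """Average-linkage agglomerative clustering on pairwise RMSD.

    Merging stops once the closest pair of clusters is farther apart
    than `cutoff` (a hypothetical tuning parameter). Returns clusters
    as sorted lists of conformation indices.
    """
    n = len(confs)
    dist = {(i, j): rmsd(confs[i], confs[j])
            for i, j in combinations(range(n), 2)}

    def d(i, j):
        return dist[(i, j) if i < j else (j, i)]

    def linkage(c1, c2):
        return sum(d(i, j) for i in c1 for j in c2) / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters (x < y by construction).
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > cutoff:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


def select_candidate(confs, cutoff):
    """Return the index of the medoid of the most populated cluster."""
    largest = max(cluster_by_geometry(confs, cutoff), key=len)
    return min(largest,
               key=lambda i: sum(rmsd(confs[i], confs[j]) for j in largest))


# Toy two-atom "ligands": three nearby poses and two outliers.
poses = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
    [(-4.0, 0.0, 0.0), (-3.0, 0.0, 0.0)],
]
print(cluster_by_geometry(poses, cutoff=1.0))
print(select_candidate(poses, cutoff=1.0))
```

In this toy run the three nearby poses fall into one cluster and the pose at index 0 is returned as its medoid. No energy value enters the selection, which is the contrast with energy-based scoring drawn in the abstract.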