Indexing and mining topological patterns for drug discovery

  • Authors:
  • Sayan Ranu; Ambuj K. Singh

  • Affiliations:
  • University of California, Santa Barbara, CA; University of California, Santa Barbara, CA

  • Venue:
  • Proceedings of the 15th International Conference on Extending Database Technology
  • Year:
  • 2012

Abstract

The increased availability of large repositories of chemical compounds has created new challenges and opportunities for applying data-mining and indexing techniques to problems in chemical informatics. The primary goal in analyzing molecular databases is to identify structural patterns that can predict biological activity. Two of the most popular approaches to representing molecular topologies are graphs and 3D geometries. As a result, the problem of indexing and mining structural patterns maps to indexing and mining patterns from graph and 3D geometric databases. In this tutorial, we will first introduce the problem of drug discovery and the critical role that computer science plays in that process. We will then introduce the problem of performing subgraph and similarity searches on large graph databases. Because these problems are NP-hard, a number of heuristics have been designed in recent years, and the tutorial will present an overview of those techniques. Next, we will introduce the problem of mining frequent subgraph patterns, along with some of the limitations of such patterns that sparked interest in mining statistically significant subgraph patterns. After presenting an in-depth survey of techniques for mining significant subgraph patterns, the tutorial will turn to the problem of analyzing the 3D geometric structures of molecules. Finally, we will conclude by presenting two open computer science problems whose solutions could have a significant impact on the field of drug discovery.