Determining Molecular Similarity for Drug Discovery using the Wavelet Riemannian Metric

Authors:
Elinor Velasquez;Emmanuel R. Yera;Rahul Singh
Affiliations:
San Francisco State University, San Francisco, CA;San Francisco State University, San Francisco, CA;San Francisco State University, San Francisco, CA
Venue:
BIBE '06 Proceedings of the Sixth IEEE Symposium on BioInformatics and BioEngineering
Year:
2006

Abstract

Discerning the similarity between two molecules is a challenging problem in drug discovery as well as in molecular biology. The problem matters because the biochemical characteristics of a molecule are closely related to its structure. Molecular similarity is therefore a key notion both in investigations aimed at understanding existing molecules and in guiding the synthesis of new ones. It also plays a central role in structure query-retrieval. This paper presents a wavelet-based Riemannian metric for determining molecular similarity. The proposed metric extends traditional molecular similarity measures through its ability to capture and compare nonlinear molecular descriptors, thus allowing more accurate characterization of the true nature of the factors involved. Furthermore, owing to its metric properties and wavelet nature, this similarity measure supports highly efficient query-retrieval strategies. To compare graph-based molecular representations using the wavelet-based Riemannian metric, the paper uses a two-phase molecular graph matching strategy. In the first phase, an efficient nonlinear graph-matching technique based on the graduated assignment algorithm is used to obtain a preliminary correspondence between molecular graphs in terms of their topological characteristics. Starting from this correspondence, the second phase directly optimizes the proposed metric on arbitrary molecular descriptors using a branch-and-bound search strategy. Various experiments, many in comparative settings, study the retrieval performance of this similarity formulation and underline its efficacy and efficiency.