Towards Index-based Similarity Search for Protein Structure Databases

  • Authors:
  • Orhan Çamoglu; Tamer Kahveci; Ambuj K. Singh

  • Venue:
  • CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
  • Year:
  • 2003

Abstract

We propose two methods for finding similarities in protein structure databases. Our techniques extract feature vectors on triplets of SSEs (Secondary Structure Elements) of proteins. These feature vectors are then indexed using a multidimensional index structure. Our first technique considers the problem of finding proteins similar to a given query protein in a protein dataset. This technique quickly finds promising proteins using the index structure. These proteins are then aligned to the query protein using a popular pairwise alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique considers the problem of joining two protein datasets to find an all-to-all similarity. Experimental results show that our techniques improve the pruning time of VAST 3 to 3.5 times while keeping the sensitivity similar.
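
The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which features are extracted or which index structure is used, so everything concrete here is an assumption made for illustration: each SSE is modeled as a 3D line segment, the hypothetical `triplet_features` function builds a 6-dimensional vector from sorted pairwise midpoint distances and sorted pairwise angles, and a uniform grid hash (the hypothetical `TripletIndex` class) stands in for a real multidimensional index. Candidates are ranked by a simple vote count, which is not the paper's statistical model.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _mid(sse):
    start, end = sse
    return tuple((x + y) / 2 for x, y in zip(start, end))


def _angle(u, v):
    # Angle between two direction vectors; assumes neither has zero length.
    cos = sum(x * y for x, y in zip(u, v)) / (_norm(u) * _norm(v))
    return math.acos(max(-1.0, min(1.0, cos)))


def triplet_features(a, b, c):
    """Assumed 6-D feature vector for a triplet of SSEs (segments).

    Three sorted midpoint distances followed by three sorted angles.
    Both are unchanged by rotating or translating the whole protein.
    """
    pairs = list(combinations((a, b, c), 2))
    dists = sorted(_norm(_sub(_mid(p), _mid(q))) for p, q in pairs)
    angles = sorted(_angle(_sub(p[1], p[0]), _sub(q[1], q[0])) for p, q in pairs)
    return tuple(dists + angles)


class TripletIndex:
    """Uniform grid hash used as a stand-in for a multidimensional index."""

    def __init__(self, dist_cell=2.0, angle_cell=0.5):
        self.dist_cell = dist_cell    # grid cell width for distances
        self.angle_cell = angle_cell  # grid cell width for angles (radians)
        self.cells = defaultdict(set)

    def _key(self, feat):
        d = [math.floor(x / self.dist_cell) for x in feat[:3]]
        a = [math.floor(x / self.angle_cell) for x in feat[3:]]
        return tuple(d + a)

    def add(self, protein_id, sses):
        # Index every SSE triplet of the protein under its grid cell.
        for trip in combinations(sses, 3):
            self.cells[self._key(triplet_features(*trip))].add(protein_id)

    def candidates(self, query_sses, top_k=5):
        # Each query triplet votes for the proteins sharing its grid cell.
        votes = Counter()
        for trip in combinations(query_sses, 3):
            key = self._key(triplet_features(*trip))
            for pid in self.cells.get(key, ()):
                votes[pid] += 1
        return votes.most_common(top_k)
```

In a pipeline of the kind the abstract describes, only the proteins returned by `candidates` would be passed on to a pairwise alignment tool such as VAST. Because this sketch probes a single grid cell, near-matches that fall just across a cell boundary are missed; a range query on a tree-based index would avoid that.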