PSISA: an algorithm for indexing and searching protein structure using suffix arrays

  • Authors:
  • Tarek F. Gharib;Ahmed Salah;Abdel-Badeeh M. Salem

  • Affiliations:
  • Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt;Faculty of Computer and Informatics, Zagazig University, Zagazig, Egypt;Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

  • Venue:
  • ICCOMP'08 Proceedings of the 12th WSEAS international conference on Computers
  • Year:
  • 2008

Abstract

Protein Structure Indexing using Suffix Array (PSISA) is a new technique that retrieves similar proteins based on their structures. Indexing protein structures is one approach to searching for protein similarities. In this paper, we develop our proposed technique based on a novel use of the suffix array. We start by converting a protein structure into a sequence: local feature vectors are extracted, their components are normalized, and the normalized vectors are converted into a sequence. This sequence is indexed using the suffix array structure, which is then used in the search process to retrieve proteins with similar structures. Proteins with high structural similarity are ranked according to their alignment score against the query protein. The experimental results, which are based on the Structural Classification of Proteins (SCOP) dataset, show that our method outperforms existing similar methods in memory utilization, improving memory usage by more than 35%.
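The indexing and search steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature vectors, the normalization, or the symbol alphabet, so the min-max scaling, the letter encoding, and the names `quantize`, `build_suffix_array`, and `find_occurrences` are all assumptions made for illustration. The suffix array is built by naive sorting, which is adequate only for short sequences, and the alignment-based ranking step is omitted.

```python
from typing import List


def quantize(values: List[float], bins: int = 4) -> str:
    """Min-max normalize values to [0, 1] and map each to a letter.

    Hypothetical encoding: the paper's actual features and alphabet
    are not given in the abstract. Expects a non-empty list.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    symbols = []
    for v in values:
        idx = min(int((v - lo) / span * bins), bins - 1)
        symbols.append(chr(ord("A") + idx))
    return "".join(symbols)


def build_suffix_array(text: str) -> List[int]:
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """Binary search for the block of suffixes that start with pattern."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Matching positions, reported in text order.
    return sorted(sa[start:lo])
```

For example, `sa = build_suffix_array(seq)` followed by `find_occurrences(seq, sa, fragment)` returns every position in `seq` where the encoded fragment occurs exactly. Once the index is built, each lookup needs only a logarithmic number of prefix comparisons.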