The use of a conformational alphabet for fast alignment of protein structures

  • Authors: Wei-Mou Zheng
  • Affiliations: Institute of Theoretical Physics, Academia Sinica, Beijing, China
  • Venue: ISBRA'08 Proceedings of the 4th international conference on Bioinformatics research and applications
  • Year: 2008

Abstract

A protein conformational alphabet refers to the discretized states of the three-dimensional segmental structure of protein backbones. Here a letter corresponds to a cluster of combinations of three angles formed by Cα pseudobonds of four contiguous residues, and our alphabet consists of 17 letters obtained by clustering based on the probability distribution of these angles. A substitution matrix called CLESUM has been derived from an alignment database of representative structures to measure both evolutionary and geometrical similarity between any two such letters. A structural fragment is then mapped to a string, and two strings whose CLESUM score is higher than a preset threshold form a similar fragment pair (SFP). The search for SFPs by string comparison is fast. Furthermore, CLESUM scores reflect the importance of SFPs to structure alignment, so the search space can be significantly reduced. A fast tool for pairwise alignment called CLePAPS is developed by collecting as many spatially consistent SFPs as possible. Extending the concept of SFPs to that of similar fragment blocks for multiple structure alignment leads to a fast tool for multiple structure alignment called BLOMAPS. Both CLePAPS and BLOMAPS are tested on ensembles of various structures. They are reliable, and about two or three orders of magnitude faster than some well-known algorithms.
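
The two ideas in the abstract that lend themselves to a small illustration are the three angles computed from four contiguous Cα atoms and the threshold-based search for similar fragment pairs. The Python sketch below is only an illustration under stated assumptions, not the CLePAPS implementation: it assumes the three angles are two pseudobond bend angles and one torsion angle, it scores ungapped fixed-width windows, and it uses a made-up three-letter scoring table (`toy_score`) in place of the real 17-letter CLESUM matrix, whose values are not given here. The function names, the window width, and the threshold are likewise hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _angle(u, v):
    # Angle between two vectors, with the cosine clamped against rounding error.
    c = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.acos(max(-1.0, min(1.0, c)))

def pseudobond_angles(p0, p1, p2, p3):
    """Return (bend1, torsion, bend2) in radians for four consecutive C-alpha
    positions. Assumption: these are the three angles the abstract refers to."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    bend1 = _angle(_sub(p0, p1), b2)   # angle at p1
    bend2 = _angle(_sub(p1, p2), b3)   # angle at p2
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # Signed torsion about the middle pseudobond b2.
    torsion = math.atan2(math.sqrt(_dot(b2, b2)) * _dot(b1, n2), _dot(n1, n2))
    return bend1, torsion, bend2

def find_sfps(s1, s2, score, width, threshold):
    """Return (i, j, total) for every pair of ungapped windows of length
    `width` whose summed letter scores exceed `threshold`."""
    hits = []
    for i in range(len(s1) - width + 1):
        for j in range(len(s2) - width + 1):
            total = sum(score[(s1[i + k], s2[j + k])] for k in range(width))
            if total > threshold:
                hits.append((i, j, total))
    return hits

# Hypothetical toy scoring table (NOT CLESUM): +2 for identical letters, -1 otherwise.
toy_score = {(a, b): (2 if a == b else -1) for a in "ABC" for b in "ABC"}

if __name__ == "__main__":
    print(pseudobond_angles((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))
    print(find_sfps("ABCAB", "CABCA", toy_score, width=3, threshold=5))
```

In this sketch the exhaustive window comparison costs time proportional to the product of the two string lengths times the window width. The abstract's remark that CLESUM scores reflect the importance of SFPs suggests that a real tool would rank or filter hits by score before checking spatial consistency, a step that is not shown here.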