The Relation between Indel Length and Functional Divergence: A Formal Study

  • Authors:
  • Raheleh Salari;Alexander Schönhuth;Fereydoun Hormozdiari;Artem Cherkasov;S. Cenk Sahinalp

  • Affiliations:
  • School of Computing Science, Simon Fraser University, BC, Canada V5A 1S6 (these authors contributed equally);School of Computing Science, Simon Fraser University, BC, Canada V5A 1S6 (these authors contributed equally);School of Computing Science, Simon Fraser University, BC, Canada V5A 1S6;Division of Infectious Diseases, Faculty of Medicine, University of British Columbia, BC, Canada V5Z 3J5;School of Computing Science, Simon Fraser University, BC, Canada V5A 1S6

  • Venue:
  • WABI '08 Proceedings of the 8th international workshop on Algorithms in Bioinformatics
  • Year:
  • 2008

Abstract

Although insertions and deletions (indels) are a common type of evolutionary sequence variation, their origins and functional consequences are not yet comprehensively understood. There is evidence that, on the one hand, classical alignment procedures only roughly reflect the evolutionary processes and, on the other hand, that indels cause structural changes in the proteins' surfaces. We first demonstrate how to identify alignment gaps that have been introduced by evolution to a statistically significant degree, by means of a novel, sound statistical framework based on pair hidden Markov models (HMMs). Second, we examine paralogous protein pairs in E. coli, obtained by computing classical global alignments. Distinguishing between indel and non-indel pairs according to our novel statistics revealed that, despite having the same sequence identity, indel pairs are significantly less functionally similar than non-indel pairs, as measured by recently suggested GO-based functional distances. This suggests that indels cause more severe functional changes than other types of sequence variation and that indel statistics should additionally be taken into account when assessing functional similarity between paralogous protein pairs.
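
The group comparison described in the abstract (indel pairs versus non-indel pairs, scored by a functional distance) can be illustrated with a generic one-sided permutation test. This is only a sketch under stated assumptions, not the statistical procedure used in the paper, which the abstract does not specify. The function name `permutation_test` and its parameters are hypothetical, the inputs are assumed to be functional distances for pairs already matched on sequence identity, and larger distances are assumed to mean less functional similarity.

```python
import random
from statistics import mean


def permutation_test(indel_dists, non_indel_dists, n_perm=10000, seed=0):
    """One-sided permutation test on the difference of group means.

    Hypothetical helper, not taken from the paper. Asks whether the mean
    functional distance of indel pairs exceeds that of non-indel pairs
    by more than random relabelling of the pairs would explain.
    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(indel_dists) - mean(non_indel_dists)

    pooled = list(indel_dists) + list(non_indel_dists)
    k = len(indel_dists)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

A small p-value would indicate that the indel group has a larger mean functional distance than label shuffling produces by chance. The sketch does not itself control for sequence identity, so that matching would have to be done before the distances are passed in.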