Modeling sequence evolution with kernel methods

  • Authors:
  • Margherita Bresco;Marco Turchi;Tijl Bie;Nello Cristianini

  • Affiliations:
  • Department of Mathematics and Informatics, University of Salerno, Salerno, Italy;Department of Information Engineering, University of Siena, Siena, Italy;ECS, ISIS Research Group, University of Southampton, Southampton, UK;Department of Statistics, University of California, Davis, USA

  • Venue:
  • Computational Optimization and Applications
  • Year:
  • 2007

Abstract

We model the evolution of biological and linguistic sequences by comparing their statistical properties. This comparison is performed by means of efficiently computable kernel functions that take two sequences as input and return a measure of the statistical similarity between them. We show how the use of such kernels allows us to reconstruct the phylogenetic tree of primates based on the mitochondrial DNA (mtDNA) of existing animals, and the phylogenetic tree of Indo-European and other languages based on sample documents from existing languages.

Kernel methods provide a convenient framework for many pattern analysis tasks, and recent advances have focused on efficient methods for sequence comparison and analysis. While a large toolbox of algorithms has been developed to analyze data by using kernels, in this paper we demonstrate their use in combination with standard phylogenetic reconstruction algorithms and visualization methods.
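To illustrate the kind of computation described above, here is a minimal sketch of one common sequence kernel, the k-mer spectrum kernel, together with the distance it induces. The abstract does not say which kernels the paper uses, so the choice of kernel, the function names, the value k=3, and the toy sequences are all assumptions made for illustration. A pairwise distance table of this kind is what a standard tree-building algorithm would typically take as input.

```python
from collections import Counter
from math import sqrt

def spectrum(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s, t, k=3):
    """Inner product of the k-mer count vectors of s and t."""
    cs, ct = spectrum(s, k), spectrum(t, k)
    return sum(n * ct[kmer] for kmer, n in cs.items())

def normalized_kernel(s, t, k=3):
    """Cosine-normalized kernel: every sequence has self-similarity 1.

    Assumes both sequences contain at least k symbols.
    """
    return spectrum_kernel(s, t, k) / sqrt(
        spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k)
    )

def kernel_distance(s, t, k=3):
    """Feature-space distance induced by the normalized kernel."""
    # max() guards against tiny negative values from rounding.
    return sqrt(max(0.0, 2.0 - 2.0 * normalized_kernel(s, t, k)))

# Toy sequences, made up for illustration (not data from the paper).
seqs = {
    "a": "ACGTACGTACGGTACC",
    "b": "ACGTACGTACGGTTCC",
    "c": "TTGACCAGTTGACCAA",
}
names = sorted(seqs)

# Pairwise distance table keyed by (name, name).
dist = {(p, q): kernel_distance(seqs[p], seqs[q]) for p in names for q in names}
```

In this toy table, sequences "a" and "b" differ in a single symbol and end up much closer to each other than either is to "c", which is the behaviour a tree reconstruction step relies on.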