Learning multiple evolutionary pathways from cross-sectional data

  • Authors:
  • Niko Beerenwinkel;Jörg Rahnenführer;Martin Däumer;Daniel Hoffmann;Rolf Kaiser;Joachim Selbig;Thomas Lengauer

  • Affiliations:
  • Max Planck Institute for Informatics, Saarbrücken, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany;University of Cologne, Köln, Germany;Center of Advanced European Studies and Research, Bonn, Germany;University of Cologne, Köln, Germany;Max Planck Institute of Molecular Plant Physiology, Golm, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany

  • Venue:
  • RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
  • Year:
  • 2004

Abstract

We introduce a mixture model of trees to describe evolutionary processes that are characterized by the accumulation of permanent genetic changes. The basic building block of the model is a directed weighted tree that generates a probability distribution on the set of all patterns of genetic events. We present an EM-like algorithm for learning a mixture model of K trees and show how to determine K with a maximum likelihood approach. As a case study we consider the accumulation of mutations in the HIV-1 reverse transcriptase that are associated with drug resistance. The fitted model is statistically validated as a density estimator and the stability of the model topology is analyzed. We obtain a generative probabilistic model for the development of drug resistance in HIV that agrees with biological knowledge. Further applications and extensions of the model are discussed.
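
The abstract does not spell out how a weighted tree turns into a distribution over patterns, so the following is only a minimal sketch under one common reading of such tree models: each edge weight is taken to be the conditional probability that an event occurs given that its parent event has occurred, and an event cannot occur without its parent. All names (`tree_pattern_probability`, `mixture_probability`, the events `A`, `B`, `C`) and all numeric weights are hypothetical illustrations, not values or code from the paper, and the EM-like learning step is not shown.

```python
from itertools import product


def tree_pattern_probability(pattern, parent, weight):
    """Probability of a binary event pattern under one weighted tree.

    pattern: dict mapping event -> 0 or 1 (event absent / present)
    parent:  dict mapping event -> parent event, or None if the parent is the root
    weight:  dict mapping event -> assumed conditional probability of the event
             given that its parent has occurred
    """
    prob = 1.0
    for event, present in pattern.items():
        par = parent[event]
        # The root (no events yet) is treated as always present.
        parent_present = True if par is None else bool(pattern[par])
        if not parent_present:
            if present:
                # Assumption: an event without its predecessor is impossible.
                return 0.0
            continue  # absent with certainty, contributes a factor of 1
        prob *= weight[event] if present else 1.0 - weight[event]
    return prob


def mixture_probability(pattern, components):
    """Mixture of K trees; components is a list of (alpha, parent, weight)."""
    return sum(
        alpha * tree_pattern_probability(pattern, parent, weight)
        for alpha, parent, weight in components
    )


events = ["A", "B", "C"]

# Hypothetical pathway tree: root -> A -> B, and root -> C.
chain_parent = {"A": None, "B": "A", "C": None}
chain_weight = {"A": 0.5, "B": 0.4, "C": 0.2}

# Hypothetical star tree: every event hangs directly off the root.
star_parent = {"A": None, "B": None, "C": None}
star_weight = {"A": 0.3, "B": 0.3, "C": 0.3}

# Mixture weights (illustrative) must sum to 1.
components = [
    (0.2, star_parent, star_weight),
    (0.8, chain_parent, chain_weight),
]

all_patterns = [dict(zip(events, bits)) for bits in product([0, 1], repeat=len(events))]
total = sum(mixture_probability(p, components) for p in all_patterns)

# B without A is impossible in the pathway tree but possible in the mixture.
b_only = {"A": 0, "B": 1, "C": 0}
p_chain = tree_pattern_probability(b_only, chain_parent, chain_weight)
p_mix = mixture_probability(b_only, components)
```

In this sketch the star-shaped component gives every pattern a nonzero probability, which is one way a mixture can account for observations that a single pathway tree would rule out.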