A nucleotide substitution model with nearest-neighbour interactions

  • Authors:
  • Gerton Lunter;Jotun Hein

  • Affiliations:
  • Bioinformatics group, Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK;Bioinformatics group, Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Quantified Score

Hi-index 3.84

Abstract

Motivation: It is well known that neighbouring nucleotides in DNA sequences do not mutate independently of each other. In this paper, we introduce a context-dependent substitution model and derive an algorithm to calculate the likelihood of sequences evolving under this model. We use this algorithm to estimate neighbour-dependent substitution rates, as well as rates for dinucleotide substitutions, using a Bayesian sampling procedure. The model is irreversible, giving an arrow to time and allowing the position of the root between a pair of sequences to be inferred without using out-groups.

Results: We applied the model to aligned human-mouse non-coding data. Clear neighbour dependencies were observed, including 17-18-fold increased CpG to TpG/CpA rates compared with other substitutions. Root inference positioned the root halfway between the mouse and human tips, suggesting an approximately clock-like behaviour of the irreversible part of the substitution process.
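To make the idea of a neighbour-dependent rate concrete, the toy sketch below looks up a substitution rate from the flanking nucleotides. It is not the authors' model or likelihood algorithm: the function names, the uniform base rate of 1.0, and the single elevation factor of 17 (taken from the lower end of the 17-18-fold figure in the abstract) are all illustrative assumptions.

```python
NUCLEOTIDES = "ACGT"


def substitution_rate(left, old, new, right, base_rate=1.0, cpg_factor=17.0):
    """Toy context-dependent rate for the substitution old -> new.

    `left` and `right` are the neighbouring nucleotides (None at a sequence end).
    Assumption: every substitution has rate `base_rate`, except the two CpG
    decay routes, which are multiplied by `cpg_factor`.
    """
    if old == new:
        raise ValueError("old and new nucleotide must differ")
    # CpG -> TpG: the C mutates to T while its right neighbour is G.
    if old == "C" and new == "T" and right == "G":
        return base_rate * cpg_factor
    # CpG -> CpA: the G mutates to A while its left neighbour is C.
    if old == "G" and new == "A" and left == "C":
        return base_rate * cpg_factor
    return base_rate


def total_rate(seq, base_rate=1.0, cpg_factor=17.0):
    """Sum the rates of all single-nucleotide substitutions out of `seq`."""
    total = 0.0
    for i, old in enumerate(seq):
        left = seq[i - 1] if i > 0 else None
        right = seq[i + 1] if i + 1 < len(seq) else None
        for new in NUCLEOTIDES:
            if new != old:
                total += substitution_rate(left, old, new, right,
                                           base_rate, cpg_factor)
    return total


if __name__ == "__main__":
    print(total_rate("ACGT"))  # sequence containing one CpG
    print(total_rate("AAAA"))  # sequence without CpG
```

In this toy, the rate of leaving a CpG (for example CpG to TpG) is much higher than the rate of the reverse change, so the rates depend on the direction of time; whether a full model is irreversible depends on all its rates together, which this sketch does not attempt to capture.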