Bayesian hidden Markov model for DNA sequence segmentation: A prior sensitivity analysis

  • Authors:
  • Darfiana Nur;David Allingham;Judith Rousseau;Kerrie L. Mengersen;Ross McVinish

  • Affiliations:
  • School of Mathematical and Physical Sciences, University of Newcastle, Callaghan, NSW 2308, Australia;ARC Centre of Excellence for Complex Dynamic Systems and Control, University of Newcastle, Callaghan, NSW 2308, Australia;CEREMADE, Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, Paris 75016, France;School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD 4001, Australia;School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD 4001, Australia

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2009

Quantified Score

Hi-index 0.03

Abstract

The sensitivity to the specification of the prior in a hidden Markov model describing homogeneous segments of DNA sequences is considered. An intron from the chimpanzee α-fetoprotein gene, which plays an important role in embryonic development in mammals, is analysed. Three main aims are considered: (i) to assess the sensitivity to prior specification in Bayesian hidden Markov models for DNA sequence segmentation; (ii) to examine the impact of replacing the standard Dirichlet prior with a mixture Dirichlet prior; and (iii) to propose and illustrate a more comprehensive approach to sensitivity analysis, using importance sampling. The results show that (i) the posterior estimates obtained under a Bayesian hidden Markov model are indeed sensitive to the specification of the prior distributions; (ii) compared with the standard Dirichlet prior, the mixture Dirichlet prior is more flexible, less sensitive to the choice of hyperparameters and less constraining in the analysis, thus improving posterior estimates; and (iii) importance sampling is computationally feasible, fast and effective in allowing a richer sensitivity analysis.
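
The idea behind aim (iii) can be illustrated with a minimal sketch. It does not reproduce the paper's hidden Markov model or its priors. It assumes a toy setting: a single multinomial over the four nucleotides with a Dirichlet prior, hypothetical counts, and hypothetical hyperparameters. Posterior draws obtained under a base prior are reweighted by the ratio of an alternative prior to the base prior, so that posterior summaries under the alternative prior can be estimated without rerunning the sampler. All function and variable names below are illustrative.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def log_dirichlet_kernel(theta, alpha):
    """Log Dirichlet density up to its normalising constant."""
    return sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))


def reweighted_mean(samples, base_alpha, alt_alpha):
    """Self-normalised importance sampling estimate of the posterior mean
    under the alternative prior, using draws from the base-prior posterior.
    Normalising constants cancel because the weights are self-normalised."""
    log_w = [
        log_dirichlet_kernel(th, alt_alpha) - log_dirichlet_kernel(th, base_alpha)
        for th in samples
    ]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    k = len(samples[0])
    mean = [sum(wi * th[j] for wi, th in zip(w, samples)) / total for j in range(k)]
    # Effective sample size: a rough diagnostic of how well the base
    # posterior covers the alternative posterior.
    ess = total ** 2 / sum(wi * wi for wi in w)
    return mean, ess


rng = random.Random(0)
counts = [6, 2, 1, 1]              # hypothetical A, C, G, T counts
base_alpha = [1.0, 1.0, 1.0, 1.0]  # hypothetical base prior
alt_alpha = [2.0, 2.0, 2.0, 2.0]   # hypothetical alternative prior

# In this conjugate toy model the base posterior is Dirichlet(base_alpha + counts),
# so it can be sampled directly; in an HMM these draws would come from MCMC.
posterior_alpha = [a + c for a, c in zip(base_alpha, counts)]
samples = [sample_dirichlet(posterior_alpha, rng) for _ in range(20000)]

is_mean, ess = reweighted_mean(samples, base_alpha, alt_alpha)

# Exact posterior mean under the alternative prior, available here only
# because the toy model is conjugate; used to check the estimate.
denom = sum(alt_alpha) + sum(counts)
exact_mean = [(a + c) / denom for a, c in zip(alt_alpha, counts)]
```

Comparing `is_mean` with `exact_mean` shows how closely the reweighted draws track the alternative posterior, and a low `ess` relative to the number of draws would signal that the alternative prior lies too far from the base prior for reweighting to be reliable.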