DNA sequencing may be considered a two-stage process: the generation of noisy data indicative of the DNA sequence using advanced chemical techniques, and the interpretation of those data. We present an algorithm for interpretation, or "base calling", which accurately models the underlying process and incorporates most of the available prior information in a mathematically tractable and minimally ad hoc manner. Our algorithm is framed within a fully Bayesian probabilistic framework, which allows the random nature of the generative process to be represented, and uses a Reversible Jump Metropolis-Hastings algorithm together with the Gibbs sampler to traverse the variable-dimension parameter space. The techniques used to construct our algorithm have only recently become feasible for such applications, owing to their inherent computational requirements.
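To illustrate how a reversible jump Metropolis-Hastings move and a Gibbs update can be combined to explore a parameter space whose dimension changes, the following is a minimal toy sketch. It is not the base-calling model described above. It assumes a deliberately simple, hypothetical setting: observations with unit variance come either from model 1 (mean fixed at zero, no free parameter) or from model 2 (unknown mean `mu` with a N(0, `tau2`) prior), with equal prior probability on the two models. The names `rjmcmc`, `log_lik`, `log_norm_pdf`, and `tau2` are illustrative only.

```python
import math
import random


def log_norm_pdf(x, mean, var):
    """Log density of a normal distribution N(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_lik(y, mu):
    """Log likelihood of data y under N(mu, 1)."""
    return sum(log_norm_pdf(v, mu, 1.0) for v in y)


def rjmcmc(y, n_iter=5000, tau2=10.0, seed=0):
    """Toy reversible jump sampler over two models (assumed setup).

    Model 1: y_i ~ N(0, 1), no free parameter.
    Model 2: y_i ~ N(mu, 1), mu ~ N(0, tau2).
    Returns the fraction of sweeps spent in model 2 and the mu samples.
    """
    rng = random.Random(seed)
    n = len(y)
    # Conjugate full conditional of mu in model 2, used by the Gibbs step.
    post_var = 1.0 / (n + 1.0 / tau2)
    post_mean = post_var * sum(y)

    k, mu = 1, 0.0
    in_model2 = 0
    mu_samples = []
    for _ in range(n_iter):
        # Trans-dimensional move, attempted every sweep. Model priors and
        # move probabilities are equal, and the map u -> mu is the identity
        # (Jacobian 1), so those factors drop out of the ratio.
        if k == 1:
            # Birth: propose mu from q = N(0, tau2), here equal to the prior.
            mu_new = rng.gauss(0.0, math.sqrt(tau2))
            log_prior = log_norm_pdf(mu_new, 0.0, tau2)
            log_q = log_norm_pdf(mu_new, 0.0, tau2)
            log_alpha = (log_lik(y, mu_new) + log_prior) - (log_lik(y, 0.0) + log_q)
            if rng.random() < math.exp(min(0.0, log_alpha)):
                k, mu = 2, mu_new
        else:
            # Death: the exact reverse of the birth move.
            log_prior = log_norm_pdf(mu, 0.0, tau2)
            log_q = log_norm_pdf(mu, 0.0, tau2)
            log_alpha = (log_lik(y, 0.0) + log_q) - (log_lik(y, mu) + log_prior)
            if rng.random() < math.exp(min(0.0, log_alpha)):
                k, mu = 1, 0.0

        # Fixed-dimension move: Gibbs update of mu within model 2.
        if k == 2:
            mu = rng.gauss(post_mean, math.sqrt(post_var))
            mu_samples.append(mu)
            in_model2 += 1

    return in_model2 / n_iter, mu_samples


if __name__ == "__main__":
    shifted = [1.5, 2.5] * 15  # data centred on 2: model 2 should dominate
    frac, samples = rjmcmc(shifted)
    print("fraction of sweeps in model 2:", frac)
    print("mean of sampled mu:", sum(samples) / len(samples))
```

In a realistic base-calling setting the jump moves would add or remove components of a far richer signal model, so the proposal distribution and the Jacobian term would generally not cancel as conveniently as they do in this toy case.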