Accurate identification of a DNA sequence depends on the ability to precisely track the time-varying signal baseline in all parts of the electrophoretic trace. We propose a statistical learning formulation of the signal background estimation problem that can be solved using an Expectation-Maximization (EM) type algorithm. We also present an alternative method, based on a recursive histogram computation, for estimating the background level of a signal in small windows. Both background estimation algorithms can be combined with regression methods to track the slow and fast baseline changes that occur in different regions of a DNA chromatogram. Accurate baseline tracking improves cluster separation and thus reduces classification errors when the Bayesian EM (BEM) base-calling system developed in our group (Pereira et al., Discrete Applied Mathematics, 2000) is used to decide how many bases are "hidden" in each base-call event pattern extracted from the chromatogram.
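To make the idea of window-based background estimation concrete, the following is a minimal sketch and not the algorithm of the paper: the abstract does not specify the recursive histogram computation or the regression method, so this example assumes a plain (non-recursive) histogram per window, takes the mean of the most populated bin as the background level, and joins the window levels by piecewise-linear interpolation. The function names `window_background` and `track_baseline`, the window length, and the bin count are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def window_background(samples: Sequence[float], bins: int = 16) -> float:
    """Background level of one window (illustrative assumption, not the
    paper's recursive method): mean of the samples that fall in the most
    populated histogram bin. Ties are resolved toward the lower bin."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return float(lo)  # flat window: every sample is background
    width = (hi - lo) / bins
    members: List[List[float]] = [[] for _ in range(bins)]
    for s in samples:
        # Clamp so the maximum sample lands in the last bin.
        idx = min(int((s - lo) / width), bins - 1)
        members[idx].append(s)
    best = max(range(bins), key=lambda k: len(members[k]))
    return sum(members[best]) / len(members[best])


def track_baseline(trace: Sequence[float], window: int = 50,
                   bins: int = 16) -> List[float]:
    """Piecewise-linear baseline through the per-window background levels.
    Linear interpolation stands in for the unspecified regression step."""
    n = len(trace)
    centres: List[float] = []
    levels: List[float] = []
    for start in range(0, n, window):
        chunk = trace[start:start + window]
        centres.append(start + (len(chunk) - 1) / 2)
        levels.append(window_background(chunk, bins))

    baseline: List[float] = []
    j = 0  # index of the window centre to the left of sample i
    for i in range(n):
        while j + 1 < len(centres) - 1 and i > centres[j + 1]:
            j += 1
        if len(centres) == 1 or i <= centres[0]:
            baseline.append(levels[0])    # hold the first level at the left edge
        elif i >= centres[-1]:
            baseline.append(levels[-1])   # hold the last level at the right edge
        else:
            t = (i - centres[j]) / (centres[j + 1] - centres[j])
            baseline.append(levels[j] + t * (levels[j + 1] - levels[j]))
    return baseline
```

A baseline-corrected trace would then be obtained by subtracting `track_baseline(trace)` from `trace` sample by sample. The mode-based estimate relies on the assumption that, within a small window, most samples belong to the background and not to peaks; windows dominated by peaks would need a different estimator, such as the EM-type formulation mentioned above.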