Signal background estimation and baseline correction algorithms for accurate DNA sequencing

  • Authors:
  • Lucio Andrade;Elias S. Manolakos

  • Affiliations:
  • Electrical and Computer Engineering Department, Communications and Digital Signal Processing (CDSP), Center for Research and Graduate Studies, Northeastern University, Boston, MA;Electrical and Computer Engineering Department, Communications and Digital Signal Processing (CDSP), Center for Research and Graduate Studies, Northeastern University, Boston, MA

  • Venue:
  • Journal of VLSI Signal Processing Systems - Special issue on signal processing and neural networks for bioinformatics
  • Year:
  • 2003

Abstract

Accurate identification of a DNA sequence depends on the ability to precisely track the time-varying signal baseline in all parts of the electrophoretic trace. We propose a statistical learning formulation of the signal background estimation problem that can be solved using an Expectation-Maximization (EM) type algorithm. We also present an alternative method, based on a recursive histogram computation, for estimating the background level of a signal in small windows. Both background estimation algorithms introduced here can be combined with regression methods to track the slow and fast baseline changes that occur in different regions of a DNA chromatogram. Accurate baseline tracking improves cluster separation and thus helps reduce classification errors when the Bayesian EM (BEM) base-calling system developed in our group (Pereira et al., Discrete Applied Mathematics, 2000) is used to decide how many bases are "hidden" in every base-call event pattern extracted from the chromatogram.
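
The abstract does not give the details of either algorithm, so the following is only an illustrative sketch of the general idea behind windowed histogram background estimation followed by regression, not the authors' method. It assumes that background samples dominate each window, so the most populated histogram bin approximates the local background level; the histogram is updated recursively as the window slides, and a polynomial is then fitted to the window estimates. The function names, the window length, the bin count, the polynomial degree, and the synthetic trace are all hypothetical choices made for this example.

```python
import numpy as np


def sliding_histogram_background(signal, window=51, n_bins=32):
    """Return (window centre positions, background level per window).

    Assumption: the background level of a window is the centre of its most
    populated histogram bin (the histogram mode).
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    if window < 1 or window > n:
        raise ValueError("window must be between 1 and len(signal)")
    n_windows = n - window + 1
    positions = np.arange(n_windows) + (window - 1) / 2.0
    lo, hi = x.min(), x.max()
    if hi == lo:  # constant signal: every window has the same level
        return positions, np.full(n_windows, lo)

    # Assign every sample to a fixed-width bin once, up front.
    width = (hi - lo) / n_bins
    bins = np.minimum(((x - lo) / width).astype(int), n_bins - 1)
    centers = lo + (np.arange(n_bins) + 0.5) * width

    # Histogram of the first window, then a recursive update per step:
    # remove the sample that leaves the window, add the one that enters.
    counts = np.bincount(bins[:window], minlength=n_bins)
    levels = np.empty(n_windows)
    levels[0] = centers[np.argmax(counts)]
    for i in range(1, n_windows):
        counts[bins[i - 1]] -= 1
        counts[bins[i + window - 1]] += 1
        levels[i] = centers[np.argmax(counts)]
    return positions, levels


def fit_baseline(signal, window=51, n_bins=32, degree=1):
    """Fit a polynomial through the window estimates and evaluate it at every sample."""
    positions, levels = sliding_histogram_background(signal, window, n_bins)
    coeffs = np.polyfit(positions, levels, degree)
    return np.polyval(coeffs, np.arange(len(signal)))


# Synthetic trace (hypothetical data): slow linear drift plus narrow peaks.
t = np.arange(400)
true_baseline = 10.0 + 0.01 * t
peaks = sum(50.0 * np.exp(-0.5 * ((t - c) / 2.0) ** 2) for c in range(20, 400, 40))
trace = true_baseline + peaks

# Baseline-corrected trace: peaks now sit on an approximately flat background.
corrected = trace - fit_baseline(trace)
```

In this sketch the per-window estimates are quantized to bin centres, and the regression step smooths that quantization out. A low polynomial degree suits slow drift; tracking faster baseline changes would call for a shorter window or a more flexible regression model, at the cost of more sensitivity to the peaks themselves.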