Detecting and aligning peaks in mass spectrometry data with applications to MALDI

  • Authors:
  • Weichuan Yu;Baolin Wu;Ning Lin;Kathy Stone;Kenneth Williams;Hongyu Zhao

  • Affiliations:
  • Department of Molecular Biophysics and Biochemistry, Yale University, Suite 503, 300 George Street, New Haven, CT 06520, USA;Division of Biostatistics, School of Public Health, University of Minnesota, A442 Mayo Building, 420 Delaware St. SE, Minneapolis, MN 55455, USA;Department of Electrical Engineering, Yale University, BML 331, 310 Cedar Street, New Haven, CT 06520-8042, USA;Keck Laboratory, Yale University, 300 George Street, G001, New Haven, CT 06520, USA;Keck Laboratory, Yale University, 300 George Street, G005, New Haven, CT 06520, USA;Department of Epidemiology and Public Health, Yale University, 60 College Street, LEPH 200, New Haven, CT 06520, USA

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2006

Abstract

In this paper, we address the peak detection and alignment problem in the analysis of mass spectrometry data. To deal with the peak redundancy present in MALDI data acquired in reflectron mode, we propose using an amplitude modulation technique in peak detection. The alignment of two peak sets is formulated as a non-rigid registration problem and is solved using a robust point matching (RPM) approach. To align multiple peak sets, we first use a super set method to find a common peak set among all peak sets, which serves as a standard, and then align all peak sets to the standard sequentially using the robust point matching approach (i.e., we align only one peak set to the standard at a time, thus reducing the multiple peak set alignment problem to a simpler two peak set alignment problem). Experimental results from a study of an ovarian cancer data set show that the quantitative cross-correlation coefficients among technical replicates increase after peak alignment. Additional comparisons demonstrate that our method performs similarly to the hierarchical clustering method, although the two methods are implemented differently.
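
The sequential "align each peak set to a standard" scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tolerance parameter `tol`, and the example m/z values are all hypothetical. The standard is built here by naively pooling and merging nearby peaks, which only stands in for the paper's super set method, and the pairwise step uses plain nearest-neighbour matching within a tolerance rather than robust point matching.

```python
from bisect import bisect_left


def build_standard(peak_sets, tol):
    """Pool all m/z values and merge neighbours closer than tol.

    Assumption: a naive stand-in for the paper's super set method.
    Returns the sorted list of cluster means.
    """
    pooled = sorted(mz for peaks in peak_sets for mz in peaks)
    clusters = []
    for mz in pooled:
        # Chain a peak onto the current cluster if it is close to its last member.
        if clusters and mz - clusters[-1][-1] <= tol:
            clusters[-1].append(mz)
        else:
            clusters.append([mz])
    return [sum(c) / len(c) for c in clusters]


def align_to_standard(peaks, standard, tol):
    """Map each peak to the nearest standard peak, or None if none is within tol.

    Assumption: nearest-neighbour matching replaces robust point matching here.
    """
    if not standard:
        return [None] * len(peaks)
    aligned = []
    for mz in peaks:
        i = bisect_left(standard, mz)
        # Only the standard peaks just below and just above mz can be nearest.
        candidates = standard[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda s: abs(s - mz))
        aligned.append(nearest if abs(nearest - mz) <= tol else None)
    return aligned


# Hypothetical peak lists (m/z values) from three technical replicates.
peak_sets = [
    [1000.0, 1500.2],
    [1000.4, 1499.8],
    [1000.2, 2000.0],
]
standard = build_standard(peak_sets, tol=0.5)

# Sequential alignment: one peak set against the standard at a time.
aligned_sets = [align_to_standard(p, standard, tol=0.5) for p in peak_sets]
```

Because every peak set is matched against the same standard independently, the multi-set problem reduces to repeated two-set problems, which is the structural point the abstract makes. A non-rigid method such as RPM would additionally estimate a smooth m/z warp instead of relying on a fixed tolerance.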