Effective peak alignment for mass spectrometry data analysis using two-phase clustering approach

Authors:
Yu-Cheng Liu;Lien-Chin Chen;Chi-Wei Liu;Vincent S. Tseng
Affiliations:
Department of Computer Science and Information Engineering, National Cheng-Kung University, Tainan 701, Taiwan, ROC;Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, ROC;Department of Computer Science and Information Engineering, National Cheng-Kung University, Tainan, 701, Taiwan, ROC;Department of Computer Science and Information Engineering, Institute of Medical Informatics, National Cheng-Kung University, Tainan, 701, Taiwan, ROC
Venue:
International Journal of Data Mining and Bioinformatics
Year:
2014

Abstract

In recent years, mass spectrometry data analysis has become an important technique for protein identification. Mass spectrometry technologies have emerged as useful tools for biomarker discovery through the study of protein profiles in various biological specimens. In mining mass spectrometry datasets, peak alignment is a critical preprocessing step that affects the quality of the analysis results. However, existing peak alignment methods are sensitive to noise peaks across mass spectrometry samples. In this paper, we propose a novel algorithm named Two-Phase Clustering for peak Alignment (TPC-Align) to align mass spectrometry peaks across samples in the preprocessing phase. The TPC-Align algorithm sequentially considers the distribution of intensity values and the locations of mass-to-charge ratio (m/z) values of peaks between samples. Moreover, TPC-Align can also report a list of significantly differential peaks between samples, which serve as candidate biomarkers for further biological study. In experimental evaluations, the proposed method was compared with the current peak alignment approach based on one-dimensional hierarchical clustering, and the results show that TPC-Align outperforms the traditional method on a real dataset.
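
The abstract does not give the details of TPC-Align, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It contrasts a one-dimensional baseline (single-linkage clustering on m/z, which in one dimension reduces to splitting sorted values at large gaps) with a hypothetical two-phase variant that first groups peaks by intensity and then clusters each group by m/z. The function names, the gap thresholds, the use of raw intensity gaps, and the toy peak list are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# A peak is (sample_id, mz, intensity). This layout is an assumption.
Peak = Tuple[int, float, float]


def gap_clusters(values: Sequence[float], max_gap: float) -> List[List[int]]:
    """Single-linkage clustering in 1D, cut at max_gap.

    Sort the values and start a new cluster whenever the gap to the
    previous sorted value exceeds max_gap. Returns lists of indices.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters: List[List[int]] = []
    for i in order:
        if clusters and values[i] - values[clusters[-1][-1]] <= max_gap:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


def align_one_phase(peaks: Sequence[Peak], mz_gap: float) -> List[List[Peak]]:
    """Baseline: cluster all peaks on m/z only."""
    groups = gap_clusters([p[1] for p in peaks], mz_gap)
    return [[peaks[i] for i in g] for g in groups]


def align_two_phase(
    peaks: Sequence[Peak], intensity_gap: float, mz_gap: float
) -> List[List[Peak]]:
    """Hypothetical two-phase variant (not the published TPC-Align).

    Phase 1 groups peaks by intensity; phase 2 clusters each
    intensity group by m/z.
    """
    aligned: List[List[Peak]] = []
    for group in gap_clusters([p[2] for p in peaks], intensity_gap):
        members = [peaks[i] for i in group]
        aligned.extend(align_one_phase(members, mz_gap))
    return aligned


# Toy data: two real peaks seen in both samples, plus one weak noise
# peak whose m/z lies between them.
peaks: List[Peak] = [
    (0, 1000.0, 50.0),
    (1, 1000.1, 52.0),
    (0, 1000.9, 48.0),
    (1, 1001.0, 51.0),
    (1, 1000.5, 3.0),  # low-intensity noise peak
]

one_phase = align_one_phase(peaks, mz_gap=0.5)
two_phase = align_two_phase(peaks, intensity_gap=10.0, mz_gap=0.5)
print(len(one_phase), len(two_phase))
```

On this toy input the noise peak chains the two real peaks into a single m/z cluster under the baseline, while the intensity-first pass isolates it so that the two real peaks are aligned separately. That is the kind of noise sensitivity the abstract attributes to existing methods, though a real implementation would need a principled way to model the intensity distribution and choose thresholds.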