A Clustering Based Hybrid System for Mass Spectrometry Data Analysis

Authors:
Pengyi Yang;Zili Zhang
Affiliations:
Intelligent Software and Software Engineering Laboratory, Faculty of Computer and Information Science, Southwest University, Chongqing, China 400715;Intelligent Software and Software Engineering Laboratory, Faculty of Computer and Information Science, Southwest University, Chongqing, China 400715 and School of Engineering and Information Techn ...
Venue:
PRIB '08 Proceedings of the Third IAPR International Conference on Pattern Recognition in Bioinformatics
Year:
2008

Citing 12

C4.5: programs for machine learning
Selection of relevant features and examples in machine learning (Artificial Intelligence - Special issue on relevance)
Wrappers for feature subset selection (Artificial Intelligence - Special issue on relevance)
Improving classification of microarray data using prototype-based feature selection (ACM SIGKDD Explorations Newsletter)
Application of the GA/KNN method to SELDI proteomics data (Bioinformatics)
HykGene: a hybrid approach for selecting marker genes for phenotype classification using microarray gene expression data (Bioinformatics)
Feature extraction and quantification for mass spectrometry in biomedical applications using the mean spectrum (Bioinformatics)
Ovarian cancer identification based on dimensionality reduction for high-throughput mass spectrometry data (Bioinformatics)
Proteomic mass spectra classification using decision tree based ensemble methods (Bioinformatics)
A review of feature selection techniques in bioinformatics (Bioinformatics)
A Hybrid Approach to Selecting Susceptible Single Nucleotide Polymorphisms for Complex Disease Analysis (BMEI '08 Proceedings of the 2008 International Conference on BioMedical Engineering and Informatics - Volume 01)
Hybrid methods to select informative gene sets in microarray data classification (AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence)

Cited 2

Multiagent Framework for Bio-data Mining (RSKT '09 Proceedings of the 4th International Conference on Rough Sets and Knowledge Technology)
A clustering based hybrid system for biomarker selection and sample classification of mass spectrometry data (Neurocomputing)

Abstract

Recently, much attention has been given to mass spectrometry (MS) based disease classification, diagnosis, and protein-based biomarker identification. As with microarray-based investigations, proteomic data generated by such high-throughput experiments often have a high feature-to-sample ratio. Moreover, biological information and patterns are mixed with data noise, redundancy, and outliers. Thus, the development of algorithms and procedures for the analysis and interpretation of such data is of paramount importance. In this paper, we propose a hybrid system for analyzing such high-dimensional data. The proposed method uses a feature extraction and selection procedure based on the k-means clustering algorithm to bridge filter and wrapper selection methods. Potentially informative mass/charge (m/z) markers selected by filters are passed to the k-means clustering algorithm to reduce correlation and redundancy, and a multi-objective Genetic Algorithm selector is then employed to identify discriminative m/z markers among those generated by the k-means clustering step. Experimental results obtained with the proposed method indicate that it is suitable for m/z biomarker selection and MS-based sample classification.
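
The three stages named in the abstract (filter, clustering, Genetic Algorithm wrapper) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the filter statistic, the number of clusters, how cluster output is turned into features, the classifier, or the GA objectives. The code below assumes a t-like filter score, one best-ranked representative per cluster, a leave-one-out nearest-centroid classifier, and a single scalarised fitness (accuracy minus a small subset-size penalty) standing in for a true multi-objective GA. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def filter_rank(X, y, top):
    """Filter step (assumed): indices of the `top` features by |t-like score|, best first."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(0) / len(a) + b.var(0) / len(b) + 1e-12)
    score = np.abs(a.mean(0) - b.mean(0)) / se
    return np.argsort(score)[::-1][:top]

def kmeans_features(X, k, rng, iters=20):
    """Cluster standardised feature profiles (columns of X) with Lloyd's k-means."""
    F = X.T
    F = (F - F.mean(1, keepdims=True)) / (F.std(1, keepdims=True) + 1e-12)
    centers = F[rng.choice(len(F), size=k, replace=False)]
    for _ in range(iters):
        dist = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            if np.any(labels == j):          # skip clusters that became empty
                centers[j] = F[labels == j].mean(0)
    return labels

def representatives(labels):
    """One feature per cluster: its first member, i.e. the best-ranked by the filter."""
    return np.array([int(np.flatnonzero(labels == j)[0]) for j in np.unique(labels)])

def fitness(mask, X, y):
    """Scalarised stand-in for two objectives: LOO accuracy and small subset size."""
    if not mask.any():
        return 0.0
    Z, idx, correct = X[:, mask], np.arange(len(y)), 0
    for i in idx:
        keep = idx != i
        c0 = Z[keep & (y == 0)].mean(0)
        c1 = Z[keep & (y == 1)].mean(0)
        pred = int(np.linalg.norm(Z[i] - c1) < np.linalg.norm(Z[i] - c0))
        correct += int(pred == y[i])
    return correct / len(y) - 0.01 * mask.sum() / mask.size

def ga_select(X, y, rng, pop=20, gens=15, p_mut=0.1):
    """Wrapper step: elitist GA over boolean feature masks."""
    n = X.shape[1]
    P = rng.random((pop, n)) < 0.5
    for _ in range(gens):
        fit = np.array([fitness(m, X, y) for m in P])
        elite = P[np.argsort(fit)[::-1][: pop // 2]]
        kids = []
        for _k in range(pop - len(elite)):
            p1, p2 = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n) if n > 1 else 0
            child = np.concatenate([p1[:cut], p2[cut:]])   # one-point crossover
            child ^= rng.random(n) < p_mut                  # bit-flip mutation
            kids.append(child)
        P = np.vstack([elite, np.array(kids)])
    fit = np.array([fitness(m, X, y) for m in P])
    return P[fit.argmax()]

# Synthetic stand-in for MS data: 40 samples, 200 m/z bins, 10 of them shifted.
rng = np.random.default_rng(0)
n, p = 40, 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :10] += 2.0

top = filter_rank(X, y, top=30)                        # filter
labels = kmeans_features(X[:, top], k=8, rng=rng)      # redundancy reduction
reps = top[representatives(labels)]                    # one marker per cluster
mask = ga_select(X[:, reps], y, rng)                   # wrapper
selected = reps[mask]
print(sorted(selected.tolist()))
```

The clustering step is what shrinks the search space for the wrapper: the GA here searches over at most eight cluster representatives rather than thirty filtered bins or two hundred raw ones. A Pareto-based selector (for example, NSGA-II style ranking) would replace the scalarised `fitness` if accuracy and subset size are to be treated as separate objectives.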