Classification using partial least squares with penalized logistic regression

Authors:
Gersende Fort;Sophie Lambert-Lacroix
Affiliations:
CNRS/LMC-IMAG BP 53, 38041 Grenoble cedex 9, France;CNRS/LMC-IMAG BP 53, 38041 Grenoble cedex 9, France
Venue:
Bioinformatics
Year:
2005

Citing 0
Cited 18

Local likelihood regression in generalized linear single-index models with applications to microarray data

Computational Statistics & Data Analysis
Consensus analysis of multiple classifiers using non-repetitive variables: Diagnostic application to microarray gene expression data

Computational Biology and Chemistry
Extracting gene regulation information for cancer classification

Pattern Recognition
Constructing the gene regulation-level representation of microarray data for cancer classification

Journal of Biomedical Informatics
Estimation of the conditional risk in classification: The swapping method

Computational Statistics & Data Analysis
Biological pathways as features for microarray data classification

Proceedings of the 2nd international workshop on Data and text mining in bioinformatics
A neural network-based biomarker association information extraction approach for cancer classification

Journal of Biomedical Informatics
Ant Colony Optimisation Classification for Gene Expression Data Analysis

RSFDGrC '09 Proceedings of the 12th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing
Transcriptional gene regulatory network reconstruction through cross platform gene network fusion

PRIB'07 Proceedings of the 2nd IAPR international conference on Pattern recognition in bioinformatics
Data mining of gene expression data by fuzzy and hybrid fuzzy methods

IEEE Transactions on Information Technology in Biomedicine
Matched Gene Selection and Committee Classifier for Molecular Classification of Heterogeneous Diseases

The Journal of Machine Learning Research
Regularized logistic regression without a penalty term: An application to cancer classification with microarray data

Expert Systems with Applications: An International Journal
Design of fuzzy expert system for microarray data classification using a novel Genetic Swarm Algorithm

Expert Systems with Applications: An International Journal
Missing value imputation framework for microarray significant gene selection and class prediction

BioDM'06 Proceedings of the 2006 international conference on Data Mining for Biomedical Applications
Collateral missing value estimation: robust missing value estimation for consequent microarray data processing

AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Tissue classification using gene expression data and artificial neural network ensembles

ICIC'06 Proceedings of the 2006 international conference on Computational Intelligence and Bioinformatics - Volume Part III
Selecting significant genes by randomization test for cancer classification using gene expression data

Journal of Biomedical Informatics
Nonnegative Least-Squares Methods for the Classification of High-Dimensional Biological Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)


Abstract

Motivation: One important aspect of data mining of microarray data is discovering the molecular variation among cancers. In microarray studies, the number n of samples is small relative to the number p of genes per sample (usually in the thousands). Standard statistical classification methods are known to be efficient (i.e. in the present case, to yield successful classifiers) mainly when n is (far) larger than p. This naturally calls for combining a dimension reduction procedure with the classification procedure.

Results: In this paper, the question of classification in such a high-dimensional setting is addressed. We view the classification problem as a regression problem with few observations and many predictor variables. We propose a new method combining partial least squares (PLS) and Ridge penalized logistic regression. We review the existing methods based on PLS and/or penalized likelihood techniques, outline the cases in which they are of interest, and explain theoretically why they sometimes behave poorly. Our procedure is compared with these other classifiers. The predictive performance of the resulting classification rule is illustrated on three data sets: Leukemia, Colon and Prostate.

Availability: Software that implements the procedures and the data sources on which this paper focuses are freely available at http://www-lmc.imag.fr/SMS/membres/Gersende_Fort,Sophie_Lambert.html

Contact: sophie.lambert@imag.fr
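To make the idea of pairing PLS with ridge-penalized logistic regression concrete, the following is a minimal sketch in plain Python. It is a simplified two-stage stand-in and not the authors' procedure, whose details the abstract does not give: PLS1 (NIPALS with deflation) is run on the 0/1 class labels to reduce the p genes to a few score variables, and a logistic regression with a ridge penalty on the slopes is then fitted to those scores by gradient descent. All function names, the toy data generator and the parameter values (`n_components`, `lam`, `n_iter`) are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_pls1(X, y, n_components):
    """PLS1 via NIPALS with deflation; returns column means, weights, loadings."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]  # centred predictors
    y_bar = sum(y) / n
    f = [yi - y_bar for yi in y]  # centred response
    weights, loadings = [], []
    for _ in range(n_components):
        cols = list(zip(*E))
        w = [dot(col, f) for col in cols]  # direction of maximal covariance with f
        norm = math.sqrt(dot(w, w))
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in E]  # scores of the training samples
        tt = dot(t, t)
        load = [dot(col, t) / tt for col in cols]
        # Deflate predictors and response so the next component is orthogonal.
        E = [[e - ti * lj for e, lj in zip(row, load)] for row, ti in zip(E, t)]
        c = dot(f, t) / tt
        f = [fi - c * ti for fi, ti in zip(f, t)]
        weights.append(w)
        loadings.append(load)
    return means, weights, loadings


def pls_scores(x, means, weights, loadings):
    """Project one sample onto the fitted PLS components."""
    e = [xj - m for xj, m in zip(x, means)]
    scores = []
    for w, load in zip(weights, loadings):
        t = dot(e, w)
        e = [ej - t * lj for ej, lj in zip(e, load)]
        scores.append(t)
    return scores


def fit_ridge_logistic(T, y, lam, n_iter=2000):
    """Minimise mean negative log-likelihood + (lam / 2) * ||beta||^2.

    Plain gradient descent; the intercept b0 is not penalised.
    """
    n, k = len(T), len(T[0])
    mean_sq = sum(dot(t, t) for t in T) / n
    step = 1.0 / (0.25 * (1.0 + mean_sq) + lam)  # 1 / (bound on the curvature)
    b0, beta = 0.0, [0.0] * k
    for _ in range(n_iter):
        resid = [sigmoid(b0 + dot(beta, t)) - yi for t, yi in zip(T, y)]
        grad = [sum(r * t[j] for r, t in zip(resid, T)) / n + lam * beta[j]
                for j in range(k)]
        b0 -= step * sum(resid) / n
        beta = [bj - step * gj for bj, gj in zip(beta, grad)]
    return b0, beta


def fit_pls_ridge_logistic(X, y, n_components=2, lam=1.0):
    pls = fit_pls1(X, y, n_components)
    T = [pls_scores(x, *pls) for x in X]
    return pls, fit_ridge_logistic(T, y, lam)


def predict_proba(model, x):
    pls, (b0, beta) = model
    return sigmoid(b0 + dot(beta, pls_scores(x, *pls)))


def make_toy_data(n, p, n_signal, shift, seed):
    """Hypothetical synthetic data: class 1 is shifted in the first n_signal genes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(0.0, 1.0) + (shift if yi == 1 and j < n_signal else 0.0)
          for j in range(p)] for yi in y]
    return X, y
```

A typical use would be to fit on a training split with `fit_pls_ridge_logistic(X_train, y_train)` and classify a new sample as class 1 when `predict_proba(model, x)` exceeds 0.5. In practice the number of components and the penalty `lam` would be chosen by cross-validation, which this sketch omits.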