Simultaneous classification and relevant feature identification in high-dimensional spaces: application to molecular profiling data

Authors:
C. Bhattacharyya;L. R. Grate;A. Rizki;D. Radisky;F. J. Molina;M. I. Jordan;M. J. Bissell;I. S. Mian
Affiliations:
Division of Computer Science, University of California Berkeley, Berkeley, CA and Department of CSA, Indian Institute of Science, Bangalore 560012, India;Lawrence Berkeley National Laboratory, Life Sciences Division, Berkeley, CA;Lawrence Berkeley National Laboratory, Life Sciences Division, Berkeley, CA;Lawrence Berkeley National Laboratory, Life Sciences Division, Berkeley, CA;Lawrence Berkeley National Laboratory, Life Sciences Division, Berkeley, CA and Department of Mathematics, University of California Santa Cruz, Santa Cruz, CA;Division of Computer Science, University of California Berkeley, Berkeley, CA and Department of Statistics, University of California Berkeley, Berkeley, CA;Lawrence Berkeley National Laboratory, Life Sciences Division, Berkeley, CA;Lawrence Berkeley National Laboratory, Life Sciences Division, Berkeley, CA
Venue:
Signal Processing - Special issue: Genomic signal processing
Year:
2003

Abstract

Molecular profiling technologies monitor many thousands of transcripts, proteins, metabolites or other species concurrently in a biological sample of interest. Given such high-dimensional data for different types of samples, classification methods aim to assign specimens to known categories. Relevant feature identification methods seek to define a subset of molecules that differentiate the samples. This work describes LIKNON, a specific implementation of a statistical approach for creating a classifier and identifying a small number of relevant features simultaneously. Given two-class data, LIKNON estimates a sparse linear classifier by exploiting the simple and well-known property that minimising an L1 norm (via linear programming) yields a sparse hyperplane. It performs well when used for retrospective analysis of three cancer biology profiling data sets: (i) small, round, blue cell tumour transcript profiles from tumour biopsies and cell lines, (ii) sporadic breast carcinoma transcript profiles from patients with distant metastases