Feature Selection by Transfer Learning with Linear Regularized Models

Authors:
Thibault Helleputte;Pierre Dupont
Affiliations:
Computing Science and Engineering Dept., University of Louvain, Louvain-la-Neuve, Belgium B-1348 and Machine Learning Group, University of Louvain;Computing Science and Engineering Dept., University of Louvain, Louvain-la-Neuve, Belgium B-1348 and Machine Learning Group, University of Louvain
Venue:
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Year:
2009



Abstract

This paper presents a novel feature selection method for the classification of high-dimensional data, such as those produced by microarrays. It uses partial supervision to smoothly favor the selection of some dimensions (genes) on a new dataset to be classified. The dimensions to be favored are selected beforehand from similar datasets in large microarray databases, so the method performs inductive transfer learning at the feature level. The technique relies on a feature selection method embedded within the estimation of a regularized linear model. A practical approximation of this technique reduces to linear SVM learning with iterative input rescaling. The scaling factors depend on the dimensions selected from the related datasets. The final selection may depart from these dimensions whenever necessary to optimize the classification objective. Experiments on several microarray datasets show that the proposed method improves both the stability of the selected gene lists with respect to sampling variation and the classification performance.
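
The idea of linear SVM learning with iterative input rescaling can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no update rule, so the multiplicative update `z <- z * |w| * prior`, the function names `fit_linear_svm` and `select_with_prior`, the prior values (1.0 for favored dimensions, 0.3 for the others), and the synthetic data are all assumptions made for illustration. A squared-hinge linear model trained by plain gradient descent stands in for a real SVM solver.

```python
import numpy as np


def fit_linear_svm(X, y, C=1.0, lr=0.01, epochs=300):
    """Stand-in linear SVM: L2 penalty plus squared hinge loss, no bias term,
    trained by full-batch gradient descent (an assumption, not a real solver)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Margin violations max(0, 1 - y * <w, x>) for every sample.
        viol = np.maximum(0.0, 1.0 - y * (X @ w))
        # Gradient of 0.5 * ||w||^2 + (C / n) * sum(viol ** 2).
        grad = w - (2.0 * C / n) * (X.T @ (viol * y))
        w -= lr * grad
    return w


def select_with_prior(X, y, prior, n_iter=5, n_keep=5):
    """Iteratively rescale the inputs; `prior` holds hypothetical per-feature
    factors that are larger for dimensions selected on related datasets."""
    z = np.ones(X.shape[1])  # current per-feature scaling factors
    for _ in range(n_iter):
        w = fit_linear_svm(X * z, y)
        # Assumed update: features with small weights or a low prior shrink.
        z = z * np.abs(w) * prior
        top = z.max()
        if top == 0.0:
            break
        z = z / top  # keep the largest factor at 1 for numerical stability
    ranking = np.argsort(-z)  # features ordered by decreasing scaling factor
    return ranking[:n_keep], z


# Synthetic illustration: 60 samples, 50 features, the first 5 informative.
rng = np.random.default_rng(0)
n, d = 60, 50
y = np.where(rng.random(n) < 0.5, -1.0, 1.0)
X = rng.normal(size=(n, d))
X[:, :5] += 1.5 * y[:, None]

prior = np.full(d, 0.3)  # dimensions not favored by the related datasets
prior[:5] = 1.0          # dimensions assumed selected on related datasets

selected, z = select_with_prior(X, y, prior)
print(sorted(selected.tolist()))
```

Because the prior only rescales the inputs rather than fixing the selection, a favored dimension with a weight near zero still shrinks across iterations. This mirrors the statement that the final selection may depart from the transferred dimensions when the classification objective requires it.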