A sparse version of the ridge logistic regression for large-scale text categorization

Authors:
Sujeevan Aseervatham; Anestis Antoniadis; Eric Gaussier; Michel Burlet; Yves Denneulin
Affiliations:
LIG - Université Joseph Fourier, 385, rue de la Bibliothèque, BP 53, F-38041 Grenoble Cedex 9, France; LJK - Université Joseph Fourier, BP 53, F-38041 Grenoble Cedex 9, France; LIG - Université Joseph Fourier, 385, rue de la Bibliothèque, BP 53, F-38041 Grenoble Cedex 9, France; Lab. Leibniz-Université Joseph Fourier, 46 Avenue Félix Viallet, F-38031 Grenoble Cedex 1, France; LIG - ENSIMAG, 51 avenue Jean Kuntzmann, F-38330 Montbonnot Saint Martin, France
Venue:
Pattern Recognition Letters
Year:
2011

Abstract

Ridge logistic regression has been used successfully in text categorization, where it has been shown to match the performance of the Support Vector Machine while having the main advantage of computing a probability value rather than a score. However, the dense solution produced by the ridge penalty makes it impractical for large-scale categorization. LASSO regularization, on the other hand, produces sparse solutions, but it is outperformed by ridge when the number of features exceeds the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method that approximates the ridge solution with a sparse one. The method first computes the ridge solution and then performs feature selection. Experimental evaluations show that the method yields a solution that is a good trade-off between the ridge and LASSO solutions.
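
The two-stage idea described in the abstract (fit a ridge solution, then select features) can be illustrated with a minimal sketch. The abstract does not say how the features are selected, so the rule used here, keeping the k largest-magnitude ridge weights, is only an assumed stand-in and not necessarily the authors' procedure. The function names `fit_ridge_logistic` and `sparsify_top_k`, the plain gradient-descent solver, the toy data, and all parameter values are likewise hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_ridge_logistic(X, y, lam=1.0, lr=0.1, n_iter=500):
    """Minimise mean log-loss + (lam / (2 n)) * ||w||^2 by gradient descent.

    y must contain labels in {0, 1}. The solver is a simple stand-in,
    not the optimiser used in the paper.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ w)                      # predicted probabilities
        grad = X.T @ (p - y) / n + lam * w / n  # log-loss gradient + ridge term
        w -= lr * grad
    return w

def sparsify_top_k(w, k):
    """Assumed selection rule: keep the k largest-magnitude weights, zero the rest."""
    w_sparse = np.zeros_like(w)
    if k <= 0:
        return w_sparse
    keep = np.argsort(np.abs(w))[-k:]           # indices of the k largest |w_j|
    w_sparse[keep] = w[keep]
    return w_sparse

# Toy data: 200 observations, 50 features, only the first 5 are informative.
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.normal(size=(n, d))
true_w = np.zeros(d)
true_w[:5] = [2.0, -2.0, 1.5, -1.5, 1.0]
y = (sigmoid(X @ true_w) > rng.uniform(size=n)).astype(float)

w_ridge = fit_ridge_logistic(X, y, lam=1.0)     # dense ridge solution
w_sparse = sparsify_top_k(w_ridge, k=10)        # sparse approximation of it
probs = sigmoid(X @ w_sparse)                   # still yields probabilities
```

The sparse vector keeps the probabilistic output of logistic regression while storing at most `k` non-zero weights per category, which is the kind of trade-off between ridge and LASSO that the abstract describes.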