Boosting RVM Classifiers for Large Data Sets

  • Authors:
  • Catarina Silva, Bernardete Ribeiro, Andrew H. Sung

  • Affiliations:
  • Catarina Silva: School of Technology and Management, Polytechnic Institute of Leiria, Portugal, and Dep. Informatics Eng., Center Informatics and Systems, Univ. of Coimbra, Portugal
  • Bernardete Ribeiro: Dep. Informatics Eng., Center Informatics and Systems, Univ. of Coimbra, Portugal
  • Andrew H. Sung: Dep. Comp. Science, Inst. Complex Additive Sys. Analysis, New Mexico Tech, USA

  • Venue:
  • ICANNGA '07 Proceedings of the 8th international conference on Adaptive and Natural Computing Algorithms, Part II
  • Year:
  • 2007

Abstract

Relevance Vector Machines (RVM) extend Support Vector Machines (SVM) by providing a probabilistic interpretation, by building sparser models with fewer basis functions (i.e., relevance vectors or prototypes), and by realizing Bayesian learning through priors placed over the parameters (i.e., hyperparameters). However, RVM algorithms do not scale up to large data sets. To overcome this problem, we propose an RVM boosting algorithm and demonstrate its potential on a text mining application. The idea is to build weak classifiers on small working sets and then improve overall accuracy with a boosting technique for document classification. The proposed algorithm is able to incorporate all the available training data; when combined with sampling techniques for choosing the working set, the boosted learning machine attains high accuracy. Experiments on the Reuters benchmark show accuracy competitive with state-of-the-art SVMs, while the sparser solution found allows real-time implementations.
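
The abstract does not give pseudocode, but the general idea it describes (weak learners trained on working sets sampled from the full data, then combined by boosting) can be sketched roughly as below. This is a minimal illustration, not the authors' method: the `boost_with_sampling` name, the parameters, and the use of scikit-learn's `LogisticRegression` as a stand-in for an RVM base learner are all assumptions made for the sake of a runnable example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # stand-in for an RVM weak learner

def boost_with_sampling(X, y, n_rounds=10, working_set_size=500, seed=None):
    """AdaBoost-style boosting where each weak learner is fit on a small
    working set sampled from the full data according to the current weights,
    so that all training examples can eventually contribute."""
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)               # distribution over ALL training examples
    learners, alphas = [], []
    for _ in range(n_rounds):
        # choose the working set by weighted sampling from the full data
        idx = rng.choice(n, size=min(working_set_size, n), replace=True, p=w)
        clf = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        pred = clf.predict(X)
        err = np.dot(w, pred != y)         # weighted error on the full training set
        if err >= 0.5:                     # no better than chance: stop boosting
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-10))
        learners.append(clf)
        alphas.append(alpha)
        # re-weight: emphasise examples the current learner gets wrong
        w *= np.exp(alpha * np.where(pred != y, 1.0, -1.0))
        w /= w.sum()
    return learners, alphas

def predict(learners, alphas, X):
    # weighted vote over {-1, +1} encoded predictions (labels assumed in {0, 1})
    votes = sum(a * np.where(c.predict(X) == 1, 1.0, -1.0)
                for c, a in zip(learners, alphas))
    return np.where(votes >= 0, 1, 0)
```

In this sketch the weighted sampling is what keeps each weak learner cheap to train (a small working set) while the boosting weights, maintained over the entire training set, ensure that every example still influences which points are drawn in later rounds.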