Text classification based on data partitioning and parameter varying ensembles

Authors:
Yan-Shi Dong;Ke-Song Han
Affiliations:
Shanghai Jiao Tong University;China Research Center
Venue:
Proceedings of the 2005 ACM symposium on Applied computing
Year:
2005



Abstract

Support vector machines (SVMs) are among the best text classifiers to date. Meanwhile, ensembles of classifiers have proven effective in many domains, so ensembles of SVM classifiers are expected to achieve better performance. This paper proposes two types of SVM ensembles, data partitioning ensembles and heterogeneous ensembles, and evaluates them experimentally on three well-accepted collections. The major conclusions are that disjunct partitioning ensembles with stacking can achieve the best performance, and that parameter varying ensembles are effective while having the advantage of being deterministic.
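
The two ensemble constructions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the toy two-dimensional data, the function names, the regularization strength `lam` as the varied parameter, and the base learner, which is a small bias-free linear SVM trained by deterministic subgradient descent on the hinge loss. The stacker is also simplified: it is trained on member scores over the training set itself rather than on held-out predictions.

```python
def train_linear_svm(samples, lam, epochs=20):
    """Bias-free linear SVM via subgradient descent on the hinge loss.

    samples: list of (feature_vector, label) pairs, label in {-1, +1}.
    lam: L2 regularization strength. Samples are visited in a fixed order,
    so training is fully deterministic.
    """
    w = [0.0] * len(samples[0][0])
    t = 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]  # regularization shrink
            if margin < 1.0:  # margin violated: apply the hinge subgradient
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def score(w, x):
    """Signed decision value of a linear model."""
    return sum(wi * xi for wi, xi in zip(w, x))


def partition_ensemble(samples, k, lam):
    """One member per disjoint subset of the training data."""
    parts = [samples[i::k] for i in range(k)]  # k non-overlapping subsets
    return [train_linear_svm(part, lam) for part in parts]


def parameter_varying_ensemble(samples, lams):
    """One member per parameter value, each trained on all the data."""
    return [train_linear_svm(samples, lam) for lam in lams]


def train_stacker(members, samples, lam):
    """Meta-level SVM whose input features are the members' scores."""
    meta = [([score(w, x) for w in members], y) for x, y in samples]
    return train_linear_svm(meta, lam)


def predict_stacked(members, meta_w, x):
    z = [score(w, x) for w in members]
    return 1 if score(meta_w, z) >= 0 else -1


def predict_vote(members, x):
    votes = sum(1 if score(w, x) >= 0 else -1 for w in members)
    return 1 if votes >= 0 else -1


# Hypothetical toy data standing in for document feature vectors.
train = [([2.0, 1.0], 1), ([1.0, 2.0], 1), ([2.0, 2.0], 1), ([3.0, 1.0], 1),
         ([-2.0, -1.0], -1), ([-1.0, -2.0], -1), ([-2.0, -2.0], -1), ([-1.0, -3.0], -1)]

part_members = partition_ensemble(train, k=2, lam=0.1)
meta_w = train_stacker(part_members, train, lam=0.1)
pv_members = parameter_varying_ensemble(train, lams=[0.01, 0.1, 1.0])
```

In this sketch neither construction involves random sampling, so repeated runs give identical members. Majority voting is used for the parameter varying members only to keep the example short; the same stacker could be applied to them as well.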