Hybrid classifiers based on semantic data subspaces for two-level text categorization

Authors:
Nandita Tripathi;Michael Oakes;Stefan Wermter
Affiliations:
University of Sunderland, Sunderland, UK;University of Sunderland, Sunderland, UK;University of Hamburg, Hamburg, Germany
Venue:
International Journal of Hybrid Intelligent Systems
Year:
2013

Abstract

Many organizations now keep their data in multi-level category hierarchies for easier management. One example is the Reuters Corpus, in which news items are categorized in a hierarchy of up to five levels. The volume and diversity of documents in such hierarchies also grow daily, which makes it difficult for a traditional classifier to handle multi-level categorization of such a varied document space efficiently. In this paper, we present hybrid classifiers involving various two-classifier and four-classifier combinations for two-level text categorization. We show that the classification accuracy of a hybrid combination is better than that of every corresponding single classifier. The constituent classifiers of the hybrid combination operate on different subspaces obtained by semantic separation of the data. Our experiments show that dividing a document space into different semantic subspaces increases the efficiency of such hybrid classifier combinations. We further show that hierarchies with a larger number of categories at the first level benefit more from this general hybrid architecture.
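The general shape of such a two-level hybrid can be illustrated with a minimal sketch. The abstract does not say which classifiers are combined or how the semantic separation is performed, so everything below is an assumption made for illustration: the first-level category is taken to define the semantic subspace, a small naive Bayes model routes documents to a subspace, and a different (hypothetical) classifier handles the second level within each subspace. The class names, the toy documents, and the category labels are invented and do not come from the paper.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            s = math.log(self.class_counts[label] / total)
            for w in tokenize(doc):
                if w in self.vocab:  # ignore words never seen in training
                    s += math.log((counts[w] + 1) / denom)
            return s

        return max(sorted(self.class_counts), key=score)


class OverlapClassifier:
    """Picks the class whose training vocabulary overlaps the document most."""

    def fit(self, docs, labels):
        self.vocab_by_class = defaultdict(set)
        for doc, label in zip(docs, labels):
            self.vocab_by_class[label].update(tokenize(doc))
        return self

    def predict(self, doc):
        tokens = set(tokenize(doc))
        return max(sorted(self.vocab_by_class),
                   key=lambda c: len(tokens & self.vocab_by_class[c]))


class HybridTwoLevel:
    """Routes a document to a subspace, then applies that subspace's classifier."""

    def __init__(self, router, factories):
        self.router = router        # predicts the first-level category
        self.factories = factories  # first-level label -> classifier constructor

    def fit(self, docs, level1, level2):
        self.router.fit(docs, level1)
        groups = defaultdict(lambda: ([], []))
        for doc, l1, l2 in zip(docs, level1, level2):
            groups[l1][0].append(doc)
            groups[l1][1].append(l2)
        # Each second-level classifier is trained only on its own subspace.
        self.experts = {l1: self.factories[l1]().fit(ds, ls)
                        for l1, (ds, ls) in groups.items()}
        return self

    def predict(self, doc):
        l1 = self.router.predict(doc)
        return l1, self.experts[l1].predict(doc)


# Invented toy corpus: (text, first-level label, second-level label).
data = [
    ("stock market shares fell", "economy", "markets"),
    ("shares index rally stock", "economy", "markets"),
    ("export tariff trade deal", "economy", "trade"),
    ("import tariff trade talks", "economy", "trade"),
    ("football goal striker match", "sports", "football"),
    ("football league goal win", "sports", "football"),
    ("tennis serve set match", "sports", "tennis"),
    ("tennis open serve final", "sports", "tennis"),
]
docs, level1, level2 = zip(*data)

model = HybridTwoLevel(
    router=NaiveBayes(),
    factories={"economy": NaiveBayes, "sports": OverlapClassifier},
).fit(docs, level1, level2)

print(model.predict("stock shares rally"))
print(model.predict("tennis serve final"))
```

Training each second-level classifier only on the documents of its own subspace is the point of the design: each constituent model sees a narrower vocabulary and fewer competing categories than a single classifier trained over the whole hierarchy would. Whether this matches the combinations evaluated in the paper cannot be determined from the abstract alone.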