A fast subspace text categorization method using parallel classifiers

Authors:
Nandita Tripathi;Michael Oakes;Stefan Wermter
Affiliations:
Department of Computing, Engineering and Technology, University of Sunderland, Sunderland, United Kingdom;Department of Computing, Engineering and Technology, University of Sunderland, Sunderland, United Kingdom;Institute for Knowledge Technology, Department of Computer Science, University of Hamburg, Hamburg, Germany
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
Year:
2012

Abstract

The number of electronic documents available to us is growing daily, so methods that speed up document search and reduce classifier training times are increasingly important. The available data is frequently divided into several broad domains, each with many sub-category levels. Each of these domains constitutes a subspace that can be processed separately. In this paper, separate classifiers of the same type are trained on different subspaces, and a test vector is assigned to a subspace using a fast, novel method of subspace detection. This parallel classifier architecture was tested with a wide variety of basic classifiers, and its performance was compared with that of a single basic classifier trained on the full data space. The improvement obtained with subspace learning was accompanied by a very significant reduction in training times for all types of classifiers used.
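
The architecture described in the abstract (one classifier per domain subspace, plus a detection step that routes each test vector to a subspace) can be illustrated with a minimal sketch. The abstract does not specify the subspace detection method or the basic classifiers, so everything here is an assumption: a nearest-centroid rule stands in for both the detector and the basic classifier, and the names `NearestCentroid`, `ParallelSubspaceClassifier`, and the toy data are hypothetical. It is not the authors' method.

```python
from collections import defaultdict
from math import sqrt


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestCentroid:
    """Stand-in 'basic classifier' (assumption); any classifier type could be used."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for vec, label in zip(X, y):
            groups[label].append(vec)
        self.centroids = {label: centroid(vs) for label, vs in groups.items()}
        return self

    def predict(self, vec):
        return min(self.centroids, key=lambda lab: distance(vec, self.centroids[lab]))


class ParallelSubspaceClassifier:
    """One classifier of the same type per domain, plus a subspace detector."""

    def fit(self, X, domains, labels):
        by_domain = defaultdict(lambda: ([], []))
        for vec, dom, lab in zip(X, domains, labels):
            by_domain[dom][0].append(vec)
            by_domain[dom][1].append(lab)
        # Hypothetical detector: nearest domain centroid (NOT the paper's method).
        self.detector = NearestCentroid().fit(X, domains)
        # Each expert sees only its own domain's training data, which is
        # why every individual training set is smaller than the full one.
        self.experts = {
            dom: NearestCentroid().fit(vecs, labs)
            for dom, (vecs, labs) in by_domain.items()
        }
        return self

    def predict(self, vec):
        dom = self.detector.predict(vec)             # step 1: detect the subspace
        return dom, self.experts[dom].predict(vec)   # step 2: classify within it


# Toy term-count vectors: two domains, two sub-categories each.
X = [
    [3, 1, 0, 0], [4, 1, 0, 0],   # sport / football
    [1, 3, 0, 0], [1, 4, 0, 0],   # sport / tennis
    [0, 0, 3, 1], [0, 0, 4, 1],   # finance / stocks
    [0, 0, 1, 3], [0, 0, 1, 4],   # finance / bonds
]
domains = ["sport"] * 4 + ["finance"] * 4
labels = ["football"] * 2 + ["tennis"] * 2 + ["stocks"] * 2 + ["bonds"] * 2

model = ParallelSubspaceClassifier().fit(X, domains, labels)
print(model.predict([0, 0, 1, 5]))
print(model.predict([5, 1, 0, 0]))
```

Because the per-domain classifiers are trained independently, they could also be trained concurrently; the sketch trains them one after another only to stay short.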