A fast subspace text categorization method using parallel classifiers

  • Authors:
  • Nandita Tripathi; Michael Oakes; Stefan Wermter

  • Affiliations:
  • Department of Computing, Engineering and Technology, University of Sunderland, Sunderland, United Kingdom; Department of Computing, Engineering and Technology, University of Sunderland, Sunderland, United Kingdom; Institute for Knowledge Technology, Department of Computer Science, University of Hamburg, Hamburg, Germany

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
  • Year:
  • 2012

Abstract

The number of electronic documents available to us increases every day. It is therefore important to examine methods that speed up document search and reduce classifier training times. The available data is frequently divided into several broad domains, each with many sub-category levels. Each of these domains constitutes a subspace that can be processed separately. In this paper, separate classifiers of the same type are trained on different subspaces, and a test vector is assigned to a subspace using a fast, novel method of subspace detection. This parallel classifier architecture was tested with a wide variety of basic classifiers, and its performance was compared with that of a single basic classifier trained on the full data space. Subspace learning improved classification performance, and this improvement was accompanied by a very significant reduction in training times for all types of classifiers used.
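The architecture described in the abstract can be illustrated with a minimal sketch, under stated assumptions. The abstract does not specify the subspace detection method or the base classifiers, so everything named here is hypothetical: the `SubspaceEnsemble` class, the `feature_blocks` layout (each domain owning a set of feature indices), the detection rule (choose the domain whose feature block has the largest mean value in the test vector), and the nearest-centroid stand-in for the base classifier. It shows only the general shape: one classifier of the same type per subspace, and a cheap routing step for test vectors.

```python
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Stand-in base classifier (an assumption, not the paper's choice)."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for vec, label in zip(X, y):
            groups[label].append(vec)
        self.centroids = {lab: centroid(vs) for lab, vs in groups.items()}
        return self

    def predict(self, vec):
        return min(self.centroids, key=lambda c: sq_dist(vec, self.centroids[c]))


class SubspaceEnsemble:
    """One base classifier per domain, each trained only on its own subspace."""

    def __init__(self, feature_blocks):
        # Hypothetical layout: {domain: [feature indices owned by that domain]}
        self.feature_blocks = feature_blocks
        self.models = {}

    def _project(self, vec, domain):
        return [vec[i] for i in self.feature_blocks[domain]]

    def fit(self, X, domains, labels):
        by_domain = defaultdict(lambda: ([], []))
        for vec, dom, lab in zip(X, domains, labels):
            by_domain[dom][0].append(self._project(vec, dom))
            by_domain[dom][1].append(lab)
        # Each model sees only its own domain's data, so these fits are
        # independent of one another and could be run concurrently.
        for dom, (Xd, yd) in by_domain.items():
            self.models[dom] = NearestCentroid().fit(Xd, yd)
        return self

    def detect_subspace(self, vec):
        # Assumed detection rule: largest mean value over the domain's block.
        def score(dom):
            block = self._project(vec, dom)
            return sum(block) / len(block)
        return max(self.models, key=score)

    def predict(self, vec):
        dom = self.detect_subspace(vec)
        return dom, self.models[dom].predict(self._project(vec, dom))


# Toy data: two domains, two sub-categories each (illustrative values only).
blocks = {"sports": [0, 1], "finance": [2, 3]}
X = [[0.9, 0.1, 0.0, 0.0], [0.1, 0.9, 0.0, 0.0],
     [0.0, 0.0, 0.8, 0.2], [0.0, 0.0, 0.2, 0.8]]
domains = ["sports", "sports", "finance", "finance"]
labels = ["football", "tennis", "stocks", "bonds"]

model = SubspaceEnsemble(blocks).fit(X, domains, labels)
result = model.predict([0.7, 0.2, 0.05, 0.0])
print(result)
```

In this sketch each per-domain model is fitted on fewer examples and fewer dimensions than a single classifier over the full data space would see, which is one plausible source of the training-time reduction the abstract reports.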