ACTION: automatic classification for full-text documents

  • Authors:
  • Jacqueline W. T. Wong; W. K. Kan; Gilbert Young

  • Affiliations:
  • Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

  • Venue:
  • ACM SIGIR Forum
  • Year:
  • 1996

Abstract

An important step in building up the document database of a full-text retrieval system is to classify each document under one or more classes according to the topical domains that the document discusses. This is commonly referred to as classification. Automatic classification attempts to replace human classifiers by using computers to automate this process. Automatic classification has two major components: (1) the classification scheme, which defines the available classes under which a document can be classified and their inter-relationships; and (2) the classification algorithm, which defines the rules and procedures for assigning one or more classes defined in the classification scheme to a document.

In this paper, we present an automatic classification approach called ACTION. The design goal of ACTION is to achieve an appropriate balance between specificity and exhaustivity, which are important metrics for assessing an automatic classification approach. The key idea of ACTION is a scheme for measuring the significance of each keyword in a given document. The scheme takes into account not only the occurrence frequency of a keyword, but also the logical relationships between the available classes.
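The abstract does not give the formula ACTION uses, so the following Python sketch is only an illustration of the general idea of combining keyword frequency with relationships between classes. It is not the authors' algorithm. The class hierarchy `PARENT`, the lexicon `KEYWORD_CLASS`, the `decay` factor, and the `threshold` are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical classification scheme: child class -> parent class (None = root).
PARENT = {
    "science": None,
    "computing": "science",
    "databases": "computing",
    "networks": "computing",
}

# Hypothetical lexicon mapping each keyword to its most specific class.
KEYWORD_CLASS = {
    "sql": "databases",
    "index": "databases",
    "router": "networks",
    "algorithm": "computing",
}


def class_scores(tokens, decay=0.5):
    """Score classes by keyword frequency, passing a decayed share to ancestors.

    The decay rule is an assumption for illustration only.
    """
    counts = Counter(t for t in tokens if t in KEYWORD_CLASS)
    scores = {cls: 0.0 for cls in PARENT}
    for keyword, freq in counts.items():
        cls, weight = KEYWORD_CLASS[keyword], float(freq)
        # Walk up the hierarchy, halving (by default) the weight at each level.
        while cls is not None:
            scores[cls] += weight
            cls, weight = PARENT[cls], weight * decay
    return scores


def classify(tokens, threshold=2.0):
    """Return the classes whose score reaches the (hypothetical) threshold."""
    scores = class_scores(tokens)
    return sorted(cls for cls, score in scores.items() if score >= threshold)


doc = "sql index sql router algorithm".split()
scores = class_scores(doc)
labels = classify(doc)
```

In this toy setup, a higher `threshold` assigns fewer classes to a document and a lower one assigns more, which loosely mirrors the trade-off between specificity and exhaustivity that the abstract mentions.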