Interactive high-quality text classification

  • Authors:
  • Rey-Long Liu

  • Affiliations:
  • Department of Medical Informatics, Tzu Chi University, Hualien, Taiwan, ROC

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Automatic text classification (TC) is essential for information sharing and management. Its ideal goals are to achieve high-quality TC: (1) accepting almost all documents that should be accepted (i.e., high recall) and (2) rejecting almost all documents that should be rejected (i.e., high precision). Unfortunately, the ideal goals are rarely achieved, making automatic TC not suitable for those applications in which a classifier's erroneous decision may incur high cost and/or serious problems. One way to pursue the ideal is to consult users to confirm the classifier's decisions so that potential errors may be corrected. However, its main challenge lies on the control of the number of confirmations, which may incur heavy cognitive load on the users. We thus develop an intelligent and classifier-independent confirmation strategy ICCOM. Empirical evaluation shows that ICCOM may help various kinds of classifiers to achieve very high precision and recall by conducting fewer confirmations. The contributions are significant to the archiving and recommendation of critical information, since identification of possible TC errors (those that require confirmation) is the key to process information more properly.