Evaluation of automatically identified index terms for browsing electronic documents

  • Authors:
  • Nina Wacholder;Judith L. Klavans;David K. Evans

  • Affiliations:
  • Columbia University;Columbia University;Columbia University

  • Venue:
  • ANLC '00 Proceedings of the sixth conference on Applied natural language processing
  • Year:
  • 2000

Abstract

We present an evaluation of domain-independent natural language tools for use in the identification of significant concepts in documents. Using qualitative evaluation, we compare three shallow processing methods for extracting index terms, i.e., terms that can be used to model the content of documents. We focus on two criteria: quality and coverage. In terms of quality alone, our results show that technical term (TT) extraction [Justeson and Katz 1995] receives the highest rating. However, in terms of a combined quality and coverage metric, the Head Sorting (HS) method, described in [Wacholder 1998], outperforms both other methods, keyword (KW) and TT.