Automatic glossary extraction: beyond terminology identification

  • Authors:
  • Youngja Park; Roy J. Byrd; Branimir K. Boguraev

  • Affiliations:
  • IBM Thomas J. Watson Research Center, Yorktown Heights, NY; IBM Thomas J. Watson Research Center, Yorktown Heights, NY; IBM Thomas J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • COLING '02: Proceedings of the 19th International Conference on Computational Linguistics, Volume 1
  • Year:
  • 2002

Abstract

This paper describes a method for automatically extracting domain-specific glossaries from large document collections. We show that, compared with current text analysis methods for extracting technical terminology from text, our extracted glossaries more successfully support applications requiring knowledge of domain concepts. After presenting our methods, we illustrate the output of GlossEx, our glossary extraction tool, and present an informal evaluation of its performance.