The C-value/NC-value Method of Automatic Recognition for Multi-Word Terms

  • Authors:
  • Katerina T. Frantzi; Sophia Ananiadou; Jun-ichi Tsujii

  • Venue:
  • ECDL '98 Proceedings of the Second European Conference on Research and Advanced Technology for Digital Libraries
  • Year:
  • 1998

Abstract

Technical terms (henceforth simply called terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora. The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term. The second part, NC-value, provides: 1) a method for the extraction of term context words (words that tend to appear with terms), and 2) the incorporation of information from term context words into the extraction of terms.
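
The sensitivity to nested terms can be illustrated with a minimal Python sketch. The abstract itself gives no formula; the scoring below follows the C-value definition as it is commonly stated in later descriptions of the method (log2 of the term length times its frequency, discounted by the mean frequency of the longer candidates that contain it), and it is an assumption that this matches the version used in this paper. The function names `is_nested` and `c_value` and the toy frequency counts are hypothetical, and the linguistic filtering of candidates is not shown.

```python
import math


def is_nested(short, long):
    """Return True if `short` occurs as a contiguous word sequence inside the longer `long`."""
    n, m = len(short), len(long)
    if n >= m:
        return False
    return any(long[i:i + n] == short for i in range(m - n + 1))


def c_value(freqs):
    """Score candidate terms (tuples of words) given their corpus frequencies.

    Assumed formula: log2(length) * (f(a) - mean frequency of the longer
    candidates containing a); the discount is zero for non-nested terms.
    """
    scores = {}
    for term, f in freqs.items():
        # Frequencies of all longer candidates in which `term` is nested.
        containers = [f_b for other, f_b in freqs.items() if is_nested(term, other)]
        discount = sum(containers) / len(containers) if containers else 0.0
        scores[term] = math.log2(len(term)) * (f - discount)
    return scores


# Toy counts, invented purely for illustration.
freqs = {
    ("soft", "contact", "lens"): 5,
    ("contact", "lens"): 8,
    ("basal", "cell", "carcinoma"): 3,
}
scores = c_value(freqs)
print(scores[("contact", "lens")])  # 3.0: 8 occurrences minus the 5 inside the longer term
```

Under this assumed formula, "contact lens" is credited only for the occurrences not accounted for by the longer candidate "soft contact lens", which is the effect the abstract describes as sensitivity to nested terms.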