Automatic selection of class labels from a thesaurus for an effective semantic tagging of corpora

  • Authors:
  • Alessandro Cucchiarelli; Paola Velardi

  • Affiliations:
  • Università di Ancona; Università di Roma 'La Sapienza'

  • Venue:
  • ANLC '97 Proceedings of the fifth conference on Applied natural language processing
  • Year:
  • 1997

Abstract

It is widely accepted that tagging text with semantic information would improve the quality of lexical learning in corpus-based NLP methods. However, available on-line taxonomies are rather entangled and introduce an unnecessary level of ambiguity. The noise produced by the excessive number of tags often outweighs the advantage of semantic tagging. In this paper we propose an automatic method to select from WordNet a subset of domain-appropriate categories that effectively reduce the overambiguity of WordNet and help to identify and categorise relevant language patterns in a more compact way. The method is evaluated against a manually tagged corpus, SEMCOR.