The Design and Implementation of an Electronic Lexical Knowledge Base

  • Authors:
  • Mario Jarmasz; Stan Szpakowicz

  • Venue:
  • AI '01 Proceedings of the 14th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2001

Abstract

Thesauri have always been a useful resource for natural language processing. WordNet, a kind of thesaurus, has proven invaluable in computational linguistics. We present the various applications of Roget's Thesaurus in this field and discuss the advantages of its structure. We evaluate the merits of the 1987 edition of Penguin's Roget's Thesaurus of English Words and Phrases as an NLP resource by designing and implementing an electronic lexical knowledge base from its material. We perform an extensive qualitative and quantitative comparison of Roget's and WordNet, contrasting the ontologies and the semantic relations of the two thesauri. We discuss the Java design of the lexical knowledge base and its potential applications. We also propose a framework for measuring similarity between concepts and for annotating Roget's semantic links with WordNet labels.