A topological embedding of the lexicon for semantic distance computation

  • Authors:
  • N. Davis; C. Giraud-Carrier; D. Jensen

  • Affiliations:
  • Department of Computer Science, Brigham Young University, Provo, UT 84602, USA, e-mail: cgc@cs.byu.edu; Department of Computer Science, Brigham Young University, Provo, UT 84602, USA, e-mail: cgc@cs.byu.edu; Kj Nova, Inc., Provo, UT 84601, USA

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2010

Abstract

We show how a quantitative context may be established for what is essentially qualitative in nature by topologically embedding a lexicon (here, WordNet) in a complete metric space. This novel transformation establishes a natural connection between the order relation in the lexicon (e.g., hyponymy) and the notion of distance in the metric space, giving rise to effective word-level and document-level lexical semantic distance measures. We provide a formal account of the topological transformation and demonstrate the value of our metrics on several experiments involving information retrieval and document clustering tasks.