Compact encodings for all local path information in web taxonomies with application to WordNet

  • Authors:
  • Svetlana Strunjaš-Yoshikawa;Fred S. Annexstein;Kenneth A. Berman

  • Affiliations:
  • Department of ECE and Computer Science, University of Cincinnati, Cincinnati, OH;Department of ECE and Computer Science, University of Cincinnati, Cincinnati, OH;Department of ECE and Computer Science, University of Cincinnati, Cincinnati, OH

  • Venue:
  • SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
  • Year:
  • 2006

Abstract

We consider the problem of finding a compact labelling for large, rooted web taxonomies that can be used to encode all local path information for each taxonomy element. This research is motivated by the problem of developing standards for taxonomic data, and addresses the data-intensive problem of evaluating semantic similarities between taxonomic elements. Evaluating such similarities often requires processing large sets of common ancestors shared by pairs of elements. We propose a new class of compact labelling schemes, designed for directed acyclic graphs and tailored for applications to large web taxonomies. Our labelling schemes significantly reduce the complexity of evaluating similarities among taxonomy elements by allowing inferences to be drawn from the labels alone, without searching the data structure. We provide an analysis of the label lengths for the proposed schemes based on structural properties of the taxonomy. Finally, we provide supporting empirical evidence for the quality of these schemes by evaluating their performance on the WordNet taxonomy.
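
To make the idea of answering ancestor queries "from the labels alone" concrete, the following is a minimal sketch and not the scheme proposed in the paper. It assumes a naive, non-compact labelling in which each node's label is simply the set containing the node and all of its ancestors. The taxonomy `PARENTS`, and the helper names `build_labels` and `common_ancestors`, are hypothetical and chosen only for illustration.

```python
# Hypothetical toy taxonomy (child -> list of parents); a rooted DAG,
# not taken from WordNet. Nodes may have more than one parent.
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "pet": ["entity"],
    "dog": ["animal", "pet"],
    "cat": ["animal", "pet"],
    "wolf": ["animal"],
}


def build_labels(parents):
    """Label each node with the frozenset of itself and all its ancestors.

    This is a deliberately naive labelling (assumption for illustration):
    labels can grow as large as the ancestor set, which is what a compact
    scheme would aim to avoid.
    """
    labels = {}

    def label(node):
        # Memoised recursion over the parent links of the DAG.
        if node not in labels:
            acc = {node}
            for parent in parents[node]:
                acc |= label(parent)
            labels[node] = frozenset(acc)
        return labels[node]

    for node in parents:
        label(node)
    return labels


def common_ancestors(label_a, label_b):
    """Compute the common ancestors from two labels, without touching the graph."""
    return label_a & label_b


LABELS = build_labels(PARENTS)
print(sorted(common_ancestors(LABELS["dog"], LABELS["cat"])))
print(sorted(common_ancestors(LABELS["dog"], LABELS["wolf"])))
```

Once the labels are built, each query is a single set intersection and never traverses the taxonomy. The cost of this naive version is label size, which is the quantity the abstract says the proposed schemes keep compact.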