Labeling Nodes of Automatically Generated Taxonomy for Multi-type Relational Datasets

  • Authors: Tao Li; Sarabjot S. Anand

  • Affiliations: Department of Computer Science, University of Warwick, Coventry, United Kingdom (both authors)

  • Venue: DaWaK '08: Proceedings of the 10th International Conference on Data Warehousing and Knowledge Discovery
  • Year: 2008

Abstract

Automatic Taxonomy Generation organizes a large dataset into a hierarchical structure to facilitate navigation and browsing. To summarize the content of each node and to reflect how it differs from its siblings, meaningful labels need to be assigned to all nodes within a derived taxonomy. Existing research focuses only on labeling taxonomies that are built from corpora of textual documents. In this paper we address the problem of labeling taxonomies built for multi-type relational datasets. A novel measure is proposed that uses information-theoretic techniques to quantitatively evaluate the homogeneity of each node and the heterogeneity of its sibling nodes; the labels of taxonomic nodes are determined based on this measure. We perform experiments on a real dataset to demonstrate the effectiveness of our method.
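
The abstract does not give the proposed measure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each node is represented as a non-empty list of categorical attribute values, and it scores a candidate label by its pointwise contribution to the Kullback-Leibler divergence between the node and its parent (the node merged with its siblings). The function names `label_scores` and `best_label` and the example data are hypothetical.

```python
import math
from collections import Counter


def label_scores(node_values, sibling_values):
    """Score each value in a node as a candidate label.

    Assumption (not from the paper): the score is the value's pointwise
    contribution to KL(node || parent), where the parent distribution is
    the node pooled with all of its siblings. A value scores high when it
    is frequent inside the node and rare among the siblings.
    """
    node = Counter(node_values)
    # Pool the node with its siblings to approximate the parent node.
    parent = node + Counter(v for sib in sibling_values for v in sib)
    node_total = sum(node.values())
    parent_total = sum(parent.values())

    scores = {}
    for value, count in node.items():
        p_node = count / node_total
        p_parent = parent[value] / parent_total
        scores[value] = p_node * math.log2(p_node / p_parent)
    return scores


def best_label(node_values, sibling_values):
    """Return the highest-scoring value; node_values must be non-empty."""
    scores = label_scores(node_values, sibling_values)
    return max(scores, key=scores.get)


# Hypothetical example: a node of movies described by genre.
node = ["comedy", "comedy", "comedy", "drama"]
siblings = [["drama", "drama", "thriller"], ["drama", "thriller", "thriller"]]
print(best_label(node, siblings))
```

In this toy example "comedy" dominates the node and never appears in the siblings, so it receives a positive score, while "drama" is more common in the siblings than in the node and receives a negative one. A measure for multi-type relational data would need to combine several attribute and object types, which this sketch does not attempt.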