Expanding the taxonomies of bibliographic archives with persistent long-term themes

  • Authors: Rene Schult; Myra Spiliopoulou
  • Affiliations: Otto-von-Guericke-University Magdeburg (both authors)
  • Venue: Proceedings of the 2006 ACM symposium on Applied computing
  • Year: 2006

Abstract

As document collections accumulate over time, some of the subjects discussed in them go out of fashion, while new ones emerge. In this paper, we address the challenge of finding such emerging and persistent "themes", i.e., subjects that live long enough to be incorporated into a taxonomy or ontology describing the document collection. Our method is based on similarity-based clustering and cluster label construction, and focuses on identifying cluster labels that "survive" changes in the constitution of the underlying population of documents, including changes in the feature space of dominant words. We conducted a set of promising experiments on the identification of themes that manifested themselves in the ACM library within the last decade.
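The idea of cluster labels that "survive" across time periods can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is a minimal illustration under several assumptions: documents are plain word lists, clustering is a greedy single pass over Jaccard similarity, a label is the set of words found in the most documents of a cluster, and a label survives into the next period when some label there overlaps it sufficiently. All function names, thresholds, and the toy data are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two word collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_documents(docs, threshold=0.3):
    """Greedy single-pass clustering (an assumption, chosen for brevity):
    a document joins the first cluster whose seed document is similar
    enough, otherwise it starts a new cluster."""
    clusters = []
    for doc in docs:
        for cluster in clusters:
            if jaccard(doc, cluster[0]) >= threshold:
                cluster.append(doc)
                break
        else:
            clusters.append([doc])
    return clusters


def cluster_label(cluster, size=2):
    """Label = the `size` words that occur in the most documents of the cluster."""
    counts = Counter(word for doc in cluster for word in set(doc))
    # Descending document frequency, ties broken alphabetically for determinism.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return frozenset(ranked[:size])


def label_lifetimes(periods, overlap=0.5):
    """For every label in every period, count over how many consecutive
    periods (including its own) a sufficiently similar label exists."""
    labels = [[cluster_label(c) for c in cluster_documents(docs)]
              for docs in periods]
    lifetimes = {}
    for start, period_labels in enumerate(labels):
        for label in period_labels:
            current, length = label, 1
            for later in labels[start + 1:]:
                match = next((l for l in later
                              if jaccard(current, l) >= overlap), None)
                if match is None:
                    break
                # Follow the matched label so gradual vocabulary drift is tolerated.
                current, length = match, length + 1
            lifetimes[(start, label)] = length
    return lifetimes


# Toy data (hypothetical): three time periods of tokenized documents.
periods = [
    [["stream", "mining", "data"], ["stream", "mining", "window"],
     ["xml", "query", "index"]],
    [["stream", "mining", "drift"], ["stream", "mining", "data"],
     ["grid", "computing", "jobs"]],
    [["stream", "mining", "sensor"], ["stream", "mining", "data"],
     ["grid", "computing", "cloud"], ["grid", "computing", "scheduler"]],
]
lifetimes = label_lifetimes(periods)
```

In this toy data the label {mining, stream} is matched in all three periods, so under the assumptions above it would be a candidate persistent theme, whereas {index, query} appears only in the first period. A realistic pipeline would need proper text preprocessing and a more robust clustering method than this single-pass heuristic.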