Linking archives using document enrichment and term selection

  • Authors:
  • Marc Bron; Bouke Huurnink; Maarten de Rijke

  • Affiliations:
  • ISLA, University of Amsterdam, Amsterdam; ISLA, University of Amsterdam, Amsterdam; ISLA, University of Amsterdam, Amsterdam

  • Venue:
  • TPDL'11 Proceedings of the 15th international conference on Theory and practice of digital libraries: research and advanced technology for digital libraries
  • Year:
  • 2011

Abstract

News, multimedia and cultural heritage archives are increasingly offering opportunities to create connections between their collections. We consider the task of linking archives: connecting an item in one archive to one or more items in other, often complementary archives. We focus on a specific instance of the task: linking items with a rich textual representation in a news archive to items with sparse annotations in a multimedia archive, where items should be linked if they describe the same or a related event. We find that the difference in textual richness of annotations presents a challenge and investigate two approaches: (i) to enrich sparsely annotated items with textually rich content; and (ii) to reduce rich news archive items using term selection. We demonstrate the positive impact of both approaches on linking to same events and linking to related events.
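As a rough illustration of approach (ii), the sketch below reduces a textually rich news item to a handful of high-weight terms and then ranks sparsely annotated items by how many of those terms they share. The abstract does not state which term selection or matching method the authors use, so the TF-IDF weighting, the overlap-based ranking, the function names (`tokenize`, `select_terms`, `rank_targets`) and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real systems do more).
    tokens = (t.strip(PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]


def select_terms(doc_tokens, background_docs, k=5):
    """Keep the k terms of a rich document with the highest TF-IDF weight.

    TF-IDF is a stand-in here, not necessarily the paper's selection method.
    """
    n_docs = len(background_docs)
    df = Counter()
    for tokens in background_docs:
        df.update(set(tokens))  # document frequency: count each term once per doc
    tf = Counter(doc_tokens)
    scores = {
        term: count * math.log((n_docs + 1) / (df[term] + 1))
        for term, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def rank_targets(query_terms, targets):
    """Rank sparsely annotated items by the number of shared query terms."""
    query = set(query_terms)
    scored = [
        (len(query & set(tokenize(text))), item_id)
        for item_id, text in targets.items()
    ]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [item_id for score, item_id in ordered if score > 0]


# Hypothetical toy data: three news items and three sparse annotations.
news = [
    "The queen opened the new flood barrier in Zeeland today. "
    "The flood barrier protects the coast.",
    "The queen visited a school today.",
    "Parliament debated the budget today.",
]
annotations = {
    "video-1": "flood barrier opening",
    "video-2": "queen school visit",
    "video-3": "budget debate",
}

background = [tokenize(doc) for doc in news]
terms = select_terms(background[0], background, k=3)
links = rank_targets(terms, annotations)
print(terms, links)
```

Reducing the news item to a few selected terms keeps frequent but uninformative words such as "the" and "today" from dominating the match against short annotations, which is the intuition behind reducing rich items before linking.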