Metadata inference for document retrieval in a distributed repository

  • Authors:
  • P. Rigaux; N. Spyratos

  • Affiliations:
  • Laboratoire de Recherche en Informatique, Université Paris-Sud Orsay, France;Laboratoire de Recherche en Informatique, Université Paris-Sud Orsay, France

  • Venue:
  • ASIAN'04 Proceedings of the 9th Asian Computing Science conference on Advances in Computer Science: dedicated to Jean-Louis Lassez on the Occasion of His 5th Cycle Birthday
  • Year:
  • 2004

Abstract

This paper describes a simple data model for the composition and metadata management of documents in a distributed setting. We assume that each document resides in the local repository of its provider, so that all providers' repositories, taken collectively, can be thought of as a single database of documents spread over the network. Providers willing to share their documents with other providers in the network must register them with a coordinator, or mediator, and providers searching for documents that match their needs must address their queries to the mediator. Registering (or unregistering) a document, formulating a query to the mediator, and answering a query by the mediator all rely on document content annotation. Content annotation depends on the nature of the document: if the document is atomic, an annotation provided explicitly by the author is sufficient, whereas if the document is composite, the author annotation should be augmented by an implied annotation, i.e., an annotation inferred from the annotations of the document's components. The main contributions of this paper are: (1) appropriate definitions of document annotations; (2) an algorithm for the automatic computation of implied annotations; and (3) a definition of the main services that the mediator should support.
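The distinction between atomic and composite documents, and the register, unregister, and query services of the mediator, can be illustrated with a minimal Python sketch. The abstract does not say how implied annotations are computed, so the sketch simply takes the union of the components' annotations as a placeholder rule; this is an assumption, not the paper's algorithm. All names (`Document`, `Mediator`, `annotation`, `implied_annotation`) are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A document with an author-provided annotation and optional components."""
    doc_id: str
    author_annotation: frozenset[str]
    parts: tuple[Document, ...] = ()

    def is_atomic(self) -> bool:
        # A document with no components is atomic.
        return not self.parts


def implied_annotation(doc: Document) -> frozenset[str]:
    """Annotation inferred from the components of a composite document.

    ASSUMPTION: a plain union of the components' annotations stands in
    for the inference algorithm, which the abstract does not describe.
    """
    terms: set[str] = set()
    for part in doc.parts:
        terms |= annotation(part)
    return frozenset(terms)


def annotation(doc: Document) -> frozenset[str]:
    """Author annotation, augmented by the implied one for composites."""
    if doc.is_atomic():
        return doc.author_annotation
    return doc.author_annotation | implied_annotation(doc)


class Mediator:
    """Hypothetical coordinator that stores annotations, not documents."""

    def __init__(self) -> None:
        self._registry: dict[str, frozenset[str]] = {}

    def register(self, doc: Document) -> None:
        # The document itself stays in its provider's repository.
        self._registry[doc.doc_id] = annotation(doc)

    def unregister(self, doc_id: str) -> None:
        self._registry.pop(doc_id, None)

    def query(self, terms: set[str]) -> list[str]:
        # Return the ids of documents whose annotation covers every query term.
        wanted = frozenset(terms)
        return sorted(d for d, a in self._registry.items() if wanted <= a)
```

For example, registering two atomic documents annotated `databases` and `algorithms`, plus a composite annotated `teaching` that contains both, would make the composite match a query for `databases` under this placeholder rule, even though its author never used that term.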