Personalised Indexing and Retrieval of Heterogeneous Structured Documents

  • Authors: Gloria Bordogna; Gabriella Pasi
  • Affiliations: CNR-IDPA, Bergamo, Italy 24129; CNR-ITC, Milano, Italy 20133
  • Venue: Information Retrieval
  • Year: 2005

Abstract

This paper considers the problem of indexing heterogeneous structured documents and of retrieving semi-structured documents. We propose a flexible paradigm both for indexing such documents and for formulating user queries that specify soft constraints on the documents' structure and content. At the indexing level, we propose a model that achieves flexibility by constructing personalised document representations based on users' views of the documents. Users specify their preferences over the document sections that they judge to carry the most interesting information, and they linguistically quantify the number of sections that determines the global potential interest of a document. At the query level, we propose a flexible query language for expressing soft selection conditions on both the structure and the content of documents.
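
The indexing idea can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: that each section yields a term significance score in [0, 1], that user preferences are numeric importance weights per section, and that a linguistic quantifier is modelled as a monotone function Q on [0, 1] and applied through a quantifier-guided ordered weighted average (OWA) with importances, in the style of Yager. The names `personalised_weight`, `most`, `at_least_one` and `average`, and the sample scores, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

Quantifier = Callable[[float], float]


def at_least_one(r: float) -> float:
    """Assumed quantifier: a single satisfied section is enough."""
    return 1.0 if r > 0 else 0.0


def average(r: float) -> float:
    """Assumed quantifier: identity, giving an importance-weighted mean."""
    return r


def most(r: float) -> float:
    """Assumed quantifier for 'most': Q(r) = r**2 (an illustrative choice)."""
    return r * r


def personalised_weight(
    section_scores: Dict[str, float],
    importance: Dict[str, float],
    quantifier: Quantifier,
) -> float:
    """Aggregate per-section scores of one term into one document-level weight.

    Sections are ranked by decreasing score; each one receives the weight
    Q(cumulative importance share) - Q(previous cumulative share).
    """
    ranked = sorted(section_scores, key=section_scores.get, reverse=True)
    total = sum(importance[s] for s in ranked)
    if total == 0:
        return 0.0

    weight = 0.0
    cumulative = 0.0
    previous_q = quantifier(0.0)
    for section in ranked:
        cumulative += importance[section]
        current_q = quantifier(cumulative / total)
        weight += (current_q - previous_q) * section_scores[section]
        previous_q = current_q
    return weight


# Hypothetical significance of one index term in three sections of a document.
scores = {"title": 1.0, "abstract": 0.5, "body": 0.2}
# Hypothetical user view: all three sections are equally interesting.
equal = {"title": 1.0, "abstract": 1.0, "body": 1.0}

w_any = personalised_weight(scores, equal, at_least_one)
w_avg = personalised_weight(scores, equal, average)
w_most = personalised_weight(scores, equal, most)
```

Under these assumptions, the same document receives a different index weight for the same term depending on the user's quantifier: "at least one" behaves like a maximum over sections, while "most" gives more weight to the lower-ranked section scores and so yields a smaller value than the plain average.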