XML-Structured documents: retrievable units and inheritance

  • Authors:
  • Stephen Robertson; Wei Lu; Andrew MacFarlane

  • Affiliations:
  • Microsoft Research, Cambridge, UK; Center for Studies of Information Resources, School of Information Management, Wuhan University, China; Centre for Interactive Systems Research, Department of Information Science, City University, London, UK

  • Venue:
  • FQAS'06 Proceedings of the 7th international conference on Flexible Query Answering Systems
  • Year:
  • 2006

Abstract

We consider the retrieval of XML-structured documents, and of passages from such documents, defined as elements of the XML structure. These are considered from the point of view of passage retrieval, as a form of document retrieval. A retrievable unit (an element chosen as defining suitable passages for retrieval) is a textual document in its own right, but may inherit information from the other parts of the same document. Again, this inheritance is defined in terms of the XML structure. All retrievable units are mapped onto a common field structure, and the ranking function is a standard document retrieval function with a suitable field weighting. A small experiment to demonstrate the idea, using INEX data, is described.
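
The idea in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the authors' implementation: the tag names (`sec`, `title`, `p`), the three field names, the field weights, and the BM25-style saturation without length normalization. The abstract itself says only that a standard document retrieval function with field weighting is used. Each section element is treated as a retrievable unit, mapped onto a common field structure, and given the enclosing document's title as an inherited field.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical field weights; the abstract does not state any values.
FIELD_WEIGHTS = {"inherited_title": 2.0, "own_title": 3.0, "body": 1.0}


def tokenize(text):
    """Lowercase whitespace tokenizer, kept deliberately simple."""
    return text.lower().split()


def build_units(xml_string, unit_tag="sec", title_tag="title"):
    """Map every <unit_tag> element onto the common field structure.

    Each unit keeps its own title and paragraph text, and inherits the
    title of the enclosing document (the root element).
    """
    root = ET.fromstring(xml_string)
    doc_title = root.findtext(title_tag, default="")
    units = []
    for elem in root.iter(unit_tag):
        units.append({
            "inherited_title": doc_title,
            "own_title": elem.findtext(title_tag, default=""),
            "body": " ".join(p.text or "" for p in elem.iter("p")),
        })
    return units


def weighted_tf(term, unit):
    """Combine per-field term frequencies using the field weights."""
    return sum(
        weight * Counter(tokenize(unit[field]))[term]
        for field, weight in FIELD_WEIGHTS.items()
    )


def score_units(query_terms, units, k1=1.2):
    """Score each unit with a BM25-style function over weighted tf.

    Length normalization is omitted to keep the sketch short.
    """
    n_units = len(units)
    scores = []
    for unit in units:
        total = 0.0
        for term in query_terms:
            df = sum(1 for u in units if weighted_tf(term, u) > 0)
            idf = math.log((n_units - df + 0.5) / (df + 0.5) + 1.0)
            wtf = weighted_tf(term, unit)
            total += idf * wtf / (k1 + wtf)
        scores.append(total)
    return scores
```

In this sketch, a query term that occurs only in the document title still contributes to the score of every section, because each unit carries the title as an inherited field. A term that occurs in one section's own title or body contributes only to that section.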