Using topic shifts for focussed access to XML repositories

  • Authors:
  • Elham Ashoori;Mounia Lalmas

  • Affiliations:
  • Queen Mary, University of London, London, UK;Queen Mary, University of London, London, UK

  • Venue:
  • ECIR'07 Proceedings of the 29th European conference on IR research
  • Year:
  • 2007

Abstract

In focussed XML retrieval, a retrieval unit is an XML element that not only contains information relevant to a user query but is also specific to that query. INEX defines a relevant element to be at the right level of granularity if it is exhaustive and specific to the user's request, i.e., it fully discusses the topic requested in the user's query and no other topics. The exhaustivity and specificity dimensions are both expressed in terms of the "quantity" of topics discussed within each element. We therefore propose to use the number of topic shifts in an XML element to express the "quantity" of topics discussed in that element, as a means of capturing specificity. We experimented with a number of element-specific smoothing methods within the language modelling framework. These methods adjust the amount of smoothing applied to each XML element according to its number of topic shifts. Using the number of topic shifts combined with element length improves retrieval effectiveness, indicating that the number of topic shifts is useful evidence in focussed XML retrieval.
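
The abstract does not state the smoothing formulas, so the following is only a minimal sketch of what element-specific smoothing could look like, not the authors' method. It assumes Dirichlet-smoothed query likelihood and assumes that the smoothing parameter grows linearly with the number of topic shifts; the function name `element_score`, the parameter `base_mu`, and the scaling rule `base_mu * (1 + topic_shifts)` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def element_score(query_terms, element_terms, collection_prob,
                  topic_shifts, base_mu=100.0):
    """Log query likelihood of an XML element under Dirichlet smoothing.

    Assumption (not from the paper): the smoothing parameter is scaled
    by (1 + topic_shifts), so elements that cover more topics are pulled
    more strongly towards the collection model.
    """
    mu = base_mu * (1 + topic_shifts)   # hypothetical element-specific rule
    tf = Counter(element_terms)         # term frequencies in the element
    length = len(element_terms)         # element length in terms

    score = 0.0
    for term in query_terms:
        # Small floor probability for terms missing from the collection model.
        p_coll = collection_prob.get(term, 1e-6)
        p_term = (tf[term] + mu * p_coll) / (length + mu)
        score += math.log(p_term)
    return score


if __name__ == "__main__":
    terms = ["xml", "retrieval", "xml", "element"]
    coll = {"xml": 0.01}
    print(element_score(["xml"], terms, coll, topic_shifts=0))
    print(element_score(["xml"], terms, coll, topic_shifts=3))
```

Under these assumptions, two elements with identical text but different numbers of topic shifts receive different scores: when a query term is more frequent in the element than in the collection, the element with fewer topic shifts scores higher. Element length enters through the denominator, which is one way the two kinds of evidence could be combined.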