Using topic shifts for focussed access to XML repositories

Authors:
Elham Ashoori;Mounia Lalmas
Affiliations:
Queen Mary, University of London, London, UK;Queen Mary, University of London, London, UK
Venue:
ECIR'07 Proceedings of the 29th European conference on IR research
Year:
2007

Abstract

In focussed XML retrieval, a retrieval unit is an XML element that not only contains information relevant to a user query but is also specific to that query. INEX defines a relevant element to be at the right level of granularity if it is exhaustive and specific to the user's request, i.e., it fully discusses the topic requested in the user's query and no other topics. The exhaustivity and specificity dimensions are both expressed in terms of the "quantity" of topics discussed within each element. We therefore propose to use the number of topic shifts in an XML element to express the "quantity" of topics discussed in that element, as a means of capturing specificity. We experimented with a number of element-specific smoothing methods within the language modelling framework. These methods enable us to adjust the amount of smoothing applied to each XML element according to its number of topic shifts. Using the number of topic shifts combined with element length improves retrieval effectiveness, indicating that the number of topic shifts is useful evidence in focussed XML retrieval.
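
The idea of element-specific smoothing can be illustrated with a minimal sketch. The abstract does not give the smoothing formulas, so everything below is an assumption made for illustration: the function names, the parameter mu, and in particular the weighting rule, which is a Dirichlet-style variant whose pseudo-count is scaled by (1 + number of topic shifts). With zero topic shifts it reduces to ordinary Dirichlet smoothing on element length. It is not presented as the method evaluated in the paper.

```python
import math
from collections import Counter


def collection_weight(length: int, topic_shifts: int, mu: float = 100.0) -> float:
    """Weight given to the collection model for one element.

    Hypothetical rule (an assumption, not taken from the paper): the
    Dirichlet pseudo-count mu is scaled by (1 + topic_shifts), so an
    element covering more topics is smoothed more heavily.
    """
    pseudo = mu * (1 + topic_shifts)
    return pseudo / (length + pseudo)


def score(query, element_tokens, topic_shifts, collection_tf, collection_len, mu=100.0):
    """Query log-likelihood under an element model smoothed with the collection model."""
    tf = Counter(element_tokens)
    length = len(element_tokens)
    lam = collection_weight(length, topic_shifts, mu)
    log_p = 0.0
    for term in query:
        p_elem = tf[term] / length if length else 0.0
        p_coll = collection_tf.get(term, 0) / collection_len
        p = (1 - lam) * p_elem + lam * p_coll
        if p == 0.0:
            # Term unseen in both the element and the collection.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy data, invented for illustration only.
tokens = ["xml"] * 10 + ["retrieval"] * 10 + ["other"] * 80
coll_tf = {"xml": 10, "retrieval": 10, "other": 980}
focused = score(["xml"], tokens, 0, coll_tf, 1000)
mixed = score(["xml"], tokens, 3, coll_tf, 1000)
```

Under this assumed rule, two elements with identical text and length receive different scores when their topic-shift counts differ: here the query term is more frequent in the element than in the collection, so the element with fewer topic shifts scores higher.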