Insights from Viewing Ranked Retrieval as Rank Aggregation

Authors:
Holger Bast;Ingmar Weber
Affiliations:
Max-Planck-Institut für Informatik, Stuhlsatzenhausweg, Saarbrücken, Germany;Max-Planck-Institut für Informatik, Stuhlsatzenhausweg, Saarbrücken, Germany
Venue:
WIRI '05 Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration
Year:
2005

Citing 0
Cited 1

PLSI: The True Fisher Kernel and beyond

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I


Abstract

We view a variety of established methods for ranked retrieval from a common angle, namely as a process of combining query-independent rankings that were precomputed for certain attributes. Apart from a general insight into what effectively distinguishes the various schemes from each other, we obtain three specific results concerning concept-based retrieval. First, we prove that latent semantic indexing (LSI) can be implemented to answer queries in time proportional to the number of words in the query, which improves over the standard implementation by an order of magnitude; a similar result is established for LSI's probabilistic sibling, PLSI. Second, we give a simple and precise characterization of the extent to which LSI can deal with polysemes, and of when it fails to do so. Third, we demonstrate that recombining the intricate, yet relatively cheap, mechanism of PLSI for mapping queries to attributes with a simplistic, easy-to-compute set of document rankings gives a retrieval performance that is at least as good as that of the most sophisticated concept-based retrieval schemes, and that does not require any precomputation.
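The rank-aggregation view described in the abstract can be illustrated with a minimal sketch. It assumes that the combination is a weighted sum of per-attribute scores, which the abstract does not specify; the function name, the attribute names, and all scores below are hypothetical and chosen only for illustration. When the attributes are the query words themselves, the outer loop runs once per query word, which mirrors the query-time claim in spirit.

```python
from collections import defaultdict


def aggregate_rankings(query_weights, attribute_scores, top_k=3):
    """Combine precomputed per-attribute document scores by a weighted sum.

    query_weights:    attribute -> weight derived from the query (assumed given)
    attribute_scores: attribute -> {document -> precomputed, query-independent score}
    """
    totals = defaultdict(float)
    # One pass per attribute that the query maps to.
    for attribute, weight in query_weights.items():
        for doc, score in attribute_scores.get(attribute, {}).items():
            totals[doc] += weight * score
    # Sort by descending total score; break ties by document id for determinism.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical precomputed rankings, one per attribute (here: one per word).
attribute_scores = {
    "latent": {"doc1": 0.9, "doc2": 0.1},
    "semantic": {"doc1": 0.5, "doc3": 0.7},
    "kernel": {"doc2": 0.8},
}

# Hypothetical query "latent semantic", mapped to attributes with equal weight.
query_weights = {"latent": 1.0, "semantic": 1.0}

result = aggregate_rankings(query_weights, attribute_scores)
for doc, score in result:
    print(doc, round(score, 3))
```

In this sketch, schemes differ only in how `query_weights` is derived from the query and in how `attribute_scores` is precomputed; the aggregation step itself stays the same.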