A cascade ranking model for efficient ranked retrieval

Authors:
Lidan Wang;Jimmy Lin;Donald Metzler
Affiliations:
University of Maryland, College Park, MD, USA;University of Maryland, College Park, MD, USA;University of Southern California, Marina del Rey, CA, USA
Venue:
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Year:
2011

Abstract

There is a fundamental tradeoff between effectiveness and efficiency when designing retrieval models for large-scale document collections. Effectiveness tends to derive from sophisticated ranking functions, such as those constructed using learning to rank, while efficiency gains tend to arise from improvements in query evaluation and caching strategies. Because these two lines of work are inherently disjoint, it is difficult to jointly optimize effectiveness and efficiency in end-to-end systems. To address this problem, we formulate and develop a novel cascade ranking model, which, unlike previous approaches, can simultaneously improve both top-k ranked effectiveness and retrieval efficiency. The model constructs a cascade of increasingly complex ranking functions that progressively prunes and refines the set of candidate documents to minimize retrieval latency and maximize result set quality. We present a novel boosting algorithm for learning such cascades to directly optimize the tradeoff between effectiveness and efficiency. Experimental results show that our cascades are faster and return higher quality results than comparable ranking models.
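
The prune-and-refine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how stages prune or how scores are combined, so the fixed keep-fraction pruning rule, the additive score accumulation, and every name below (`cascade_rank`, `Stage`, `TOY_DOCS`, the two scorers) are assumptions made purely for illustration. The learning of the cascade by boosting is not shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Doc = str
Scorer = Callable[[Doc], float]
# Hypothetical stage description: (fraction of candidates to keep, scoring function).
Stage = Tuple[float, Scorer]


def cascade_rank(candidates: Sequence[Doc], stages: Sequence[Stage], k: int) -> List[Doc]:
    """Run candidates through a sequence of stages and return the top k.

    Assumed behaviour (not taken from the paper): each stage adds its score
    to the running score of every surviving document, then keeps only the
    best-scoring fraction, never fewer than k documents.
    """
    scores: Dict[Doc, float] = {d: 0.0 for d in candidates}
    for keep_fraction, scorer in stages:
        # Refine: only documents that survived earlier stages are scored here,
        # so costly scorers placed late in the cascade see fewer documents.
        for d in scores:
            scores[d] += scorer(d)
        # Prune: keep the highest-scoring fraction (ties broken by document id).
        n_keep = max(k, int(len(scores) * keep_fraction))
        survivors = sorted(scores, key=lambda d: (-scores[d], d))[:n_keep]
        scores = {d: scores[d] for d in survivors}
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


# Toy data: document id -> (cheap feature, costly feature). Values are made up.
TOY_DOCS = {"d1": (3.0, 0.1), "d2": (2.0, 0.9), "d3": (1.0, 0.5), "d4": (0.0, 1.0)}
expensive_calls: List[Doc] = []


def cheap_scorer(d: Doc) -> float:
    return TOY_DOCS[d][0]


def expensive_scorer(d: Doc) -> float:
    expensive_calls.append(d)  # record which documents reach the costly stage
    return 5.0 * TOY_DOCS[d][1]


top2 = cascade_rank(list(TOY_DOCS), [(0.5, cheap_scorer), (1.0, expensive_scorer)], k=2)
```

In this toy run the cheap stage discards half of the four candidates, so the costly scorer is evaluated on only two documents, and it then reorders those two. That is the intuition behind trading a small amount of first-stage work for lower overall latency.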