On optimality-ratio and coverage in ranking of joined search results

Authors:
Mirit Shalem;Yaron Kanza
Affiliations:
Department of Computer Science, Technion, Haifa, Israel;Department of Computer Science, Technion, Haifa, Israel
Venue:
Distributed and Parallel Databases
Year:
2012

Abstract

Complex search tasks often require posing several basic search queries, joining their answers, each of which is given as a ranked list of items, and returning a ranked list of combinations. However, the join result may contain many repetitions of the same items, and the entire join is frequently too large to be useful. This can be addressed by returning only a small subset of the join result. The focus of this paper is on how to choose this subset. We propose two measures for estimating the quality of result sets, namely, coverage and optimality ratio. Intuitively, maximizing the coverage aims at including in the result as many appearances as possible of items in their optimal combinations, and maximizing the optimality ratio means striving to have each item appear only in its optimal combination, i.e., only in the most highly ranked combination that contains it. One of the difficulties in choosing the subset of the join in a complex search is that maximizing the coverage conflicts with maximizing the optimality ratio.

We present new semantics for complex search queries that aim at providing both high coverage and a high optimality ratio. We examine the quality of the results of the existing and the novel semantics according to these two measures, and we provide algorithms for answering complex search queries under the new semantics. Finally, we present an experimental study, using Yahoo! Local Search Web Services, of the efficiency and scalability of our algorithms, showing that complex search queries can be evaluated effectively under the proposed semantics.
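The tension between the two measures can be illustrated with a small sketch. It is only one possible reading of the intuition given in the abstract, not the paper's formal definitions, which are not reproduced here. The function names, the additive scoring, and the toy hotel/restaurant data are all assumptions made for illustration. Coverage is taken to be the fraction of items whose optimal combination is in the chosen subset, and optimality ratio the fraction of item appearances in the subset that occur in the item's optimal combination.

```python
from itertools import product

# Hypothetical ranked lists: item -> score (higher is better).
hotels = {"h1": 9, "h2": 5}
restaurants = {"r1": 8, "r2": 4}
lists = [hotels, restaurants]


def score(combo):
    # Assumed scoring: sum of the item scores in the combination.
    return sum(lst[item] for lst, item in zip(lists, combo))


def optimal_combinations(join_result):
    # Map each (list position, item) to the best combination containing it.
    best = {}
    for combo in join_result:
        for key in enumerate(combo):
            if key not in best or score(combo) > score(best[key]):
                best[key] = combo
    return best


def coverage(subset, best):
    # Fraction of items whose optimal combination was selected.
    chosen = set(subset)
    return sum(1 for combo in best.values() if combo in chosen) / len(best)


def optimality_ratio(subset, best):
    # Fraction of item appearances that are in the item's optimal combination.
    appearances = [(key, combo) for combo in subset for key in enumerate(combo)]
    if not appearances:
        return 1.0  # assumed convention for an empty result
    return sum(1 for key, combo in appearances if best[key] == combo) / len(appearances)


full_join = list(product(*lists))
best = optimal_combinations(full_join)

small = [("h1", "r1")]
large = [("h1", "r1"), ("h1", "r2"), ("h2", "r1")]

for name, subset in [("small", small), ("large", large)]:
    print(name, coverage(subset, best), round(optimality_ratio(subset, best), 3))
```

Under these assumptions, the single top combination has a perfect optimality ratio but covers only half of the items, while the three-combination subset covers every item at the cost of repeating h1 and r1 outside their optimal combinations.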