Predicting the effectiveness of keyword queries on databases

Authors:
Shiwen Cheng;Arash Termehchy;Vagelis Hristidis
Affiliations:
University of California at Riverside, Riverside, CA, USA;University of Illinois at Urbana-Champaign, Urbana, IL, USA;University of California at Riverside, Riverside, CA, USA
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Keyword query interfaces (KQIs) for databases provide easy access to data, but often suffer from low ranking quality, i.e., low precision and/or recall, as shown in recent benchmarks. To improve user satisfaction, it would be useful to identify queries that are likely to have low ranking quality. For instance, the system could suggest alternative queries to the user for such hard queries. In this paper, we analyze the characteristics of hard queries and propose a novel framework to measure the degree of difficulty of a keyword query over a database, considering both the structure and the content of the database and the query results. We evaluate our query difficulty prediction model against two relevance judgment benchmarks for keyword search on databases, INEX and SemSearch. Our study shows that our model predicts hard queries with high accuracy. Further, our prediction algorithms incur minimal time overhead.