Combining incompleteness and ranking in tree queries

Authors:
Benny Kimelfeld;Yehoshua Sagiv
Affiliations:
The Selim and Rachel Benin School of Engineering and Computer Science, The Hebrew University of Jerusalem, Jerusalem, Israel;The Selim and Rachel Benin School of Engineering and Computer Science, The Hebrew University of Jerusalem, Jerusalem, Israel
Venue:
ICDT'07 Proceedings of the 11th international conference on Database Theory
Year:
2007

Abstract

In many cases, users may want to consider incomplete answers to their queries. Often, however, there is an overwhelming number of such answers, even if subsumed answers are ignored and only maximal ones are considered. Therefore, it is important to rank answers according to their degree of incompleteness and, moreover, this ranking should be combined with other, conventional ranking techniques that are already in use (e.g., the relevance of answers to keywords). Query evaluation should take the ranking into account by computing answers incrementally, i.e., in ranked order. In particular, the evaluation process should generate the top-k answers efficiently. We show how a semantics for incomplete answers to tree queries can be combined with common ranking techniques. In our approach, answers are rewarded for relevancy and penalized for incompleteness, where the user specifies the appropriate quantum. An incremental algorithm for evaluating tree queries is given. This algorithm enumerates in ranked order with polynomial delay, under query-and-data complexity. Our results are couched in terms of a formal framework that captures a variety of data models (e.g., relational, semistructured and XML).
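
The scoring idea in the abstract (reward relevance, penalize incompleteness by a user-chosen quantum) can be illustrated with a small sketch. This is a toy illustration only, not the paper's algorithm: it assumes all candidate answers are already materialized, whereas the paper enumerates answers in ranked order with polynomial delay. The names `Answer`, `score`, `top_k`, and `penalty`, and the linear form of the score, are assumptions made for this example.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Answer(NamedTuple):
    """Hypothetical partial answer to a tree query."""
    matched: frozenset   # query nodes that were matched to data
    relevance: float     # score from a conventional ranker (assumed given)


def score(answer: Answer, query_nodes: frozenset, penalty: float) -> float:
    """Relevance minus `penalty` for every query node left unmatched."""
    missing = len(query_nodes - answer.matched)
    return answer.relevance - penalty * missing


def top_k(answers: Iterable[Answer], query_nodes: frozenset,
          penalty: float, k: int) -> List[Answer]:
    """Return the k best answers by combined score (assumed linear form)."""
    return heapq.nlargest(
        k, answers, key=lambda a: score(a, query_nodes, penalty)
    )


# Example: a three-node query and three candidate answers.
QUERY = frozenset({"a", "b", "c"})
COMPLETE = Answer(frozenset({"a", "b", "c"}), 1.0)
PARTIAL = Answer(frozenset({"a", "b"}), 1.8)
SPARSE = Answer(frozenset({"a"}), 1.5)
CANDIDATES = [COMPLETE, PARTIAL, SPARSE]
```

With a larger `penalty`, complete answers move up the ranking; with `penalty` set to 0, the order is decided by relevance alone.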