Adaptive Processing of Top-k Queries in XML

Authors:
Amelie Marian;Sihem Amer-Yahia;Nick Koudas;Divesh Srivastava
Affiliations:
Columbia University;AT&T Labs-Research;AT&T Labs-Research;AT&T Labs-Research
Venue:
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Year:
2005


Abstract

The ability to compute top-k matches to XML queries is gaining importance due to the increasing number of large XML repositories. The efficiency of top-k query evaluation relies on using scores to prune irrelevant answers as early as possible in the evaluation process. In this context, evaluating the same query plan for all answers might be too rigid because, at any time in the evaluation, answers have gone through the same number and sequence of operations, which limits the speed at which scores grow. Therefore, adaptive query processing that permits different plans for different partial matches and maximizes the best scores is more appropriate. In this paper, we propose an architecture and adaptive algorithms for efficiently computing top-k matches to XML queries. Our techniques can be used to evaluate both exact and approximate matches where approximation is defined by relaxing XPath axes. In order to compute the scores of query answers, we extend the traditional tf*idf measure to account for document structure. We conduct extensive experiments on a variety of benchmark data and queries, and demonstrate the usefulness of the adaptive approach for computing top-k queries in XML.
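The pruning idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or architecture: the function name `adaptive_top_k`, the flat predicate weights, and the "largest weight first" routing rule are all assumptions made for illustration. What it shows is that each partial match carries an upper bound on its final score, the match with the highest bound is extended first (so different matches have gone through different numbers of operations at any given time), and evaluation stops once no remaining match can beat the current k-th best score.

```python
import heapq


def adaptive_top_k(candidates, max_weight, k):
    """Score-based pruning over partial matches (illustrative sketch only).

    candidates: {match_id: {predicate: actual score contribution}}
    max_weight: {predicate: largest contribution that predicate can make}
    Returns (top-k list of (score, match_id) in descending order,
             number of predicate evaluations performed).
    """
    total = sum(max_weight.values())
    # Max-heap on the score upper bound (negated for heapq's min-heap).
    queue = [(-total, cid, 0.0, frozenset(max_weight)) for cid in candidates]
    heapq.heapify(queue)
    top = []      # min-heap holding the best complete matches seen so far
    probes = 0    # predicate evaluations actually carried out

    while queue:
        neg_bound, cid, score, remaining = heapq.heappop(queue)
        # The queue is ordered by upper bound, so if this match cannot beat
        # the current k-th best score, no other queued match can either.
        if len(top) == k and -neg_bound <= top[0][0]:
            break
        # Hypothetical routing rule: evaluate the remaining predicate with
        # the largest possible contribution. A per-match strategy would go here.
        pred = max(remaining, key=lambda p: (max_weight[p], p))
        probes += 1
        score += candidates[cid].get(pred, 0.0)
        remaining = remaining - {pred}
        if remaining:
            bound = score + sum(max_weight[p] for p in remaining)
            heapq.heappush(queue, (-bound, cid, score, remaining))
        elif len(top) < k:
            heapq.heappush(top, (score, cid))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, cid))

    return sorted(top, reverse=True), probes


# Made-up example data: three query predicates and four candidate matches.
max_weight = {"title": 0.5, "author": 0.25, "year": 0.25}
candidates = {
    "a": {"title": 0.5, "author": 0.25, "year": 0.25},
    "b": {"title": 0.5, "author": 0.0, "year": 0.25},
    "c": {"title": 0.0, "author": 0.25, "year": 0.0},
    "d": {"title": 0.0, "author": 0.0, "year": 0.25},
}
best, probes = adaptive_top_k(candidates, max_weight, k=2)
print(best, probes)
```

On this toy input, matches `c` and `d` are abandoned after a single predicate evaluation each, because their upper bounds fall below the score of the second-best complete match. The run therefore needs fewer evaluations than the 12 that scoring every candidate on every predicate would take.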