Answering XML queries by means of data summaries

Authors:
Elena Baralis; Paolo Garza; Elisa Quintarelli; Letizia Tanca
Affiliations:
Politecnico di Torino, Torino, Italy; Politecnico di Torino, Torino, Italy; Politecnico di Milano, Milano, Italy; Politecnico di Milano, Milano, Italy
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2007


Abstract

XML is a rather verbose representation of semistructured data, and XML datasets may require huge amounts of storage space. We propose a summarized representation of XML data, based on the concept of instance pattern, which both provides succinct information and can be queried directly. The physical representation of instance patterns exploits itemsets or association rules to summarize the content of XML datasets. Instance patterns may be used to answer queries, possibly only partially, either when fast, approximate answers are required or when the actual dataset is not available, for example because it is currently unreachable. Experiments on large XML documents show that instance patterns allow a significant reduction in storage space while preserving almost entirely the completeness of the query result. Furthermore, they provide fast query answers and scale well with the size of the dataset, thus overcoming the document size limitation of most current XQuery engines.
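
To make the general idea concrete, here is a minimal Python sketch of summarizing an XML document with frequent itemsets and association rules, then answering a query from the summary alone. It is not the authors' algorithm or their instance-pattern format: the sample document, the function names (`transactions`, `mine_itemsets`, `derive_rules`, `answer`), the thresholds, and the brute-force counting are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical sample data; not taken from the paper's experiments.
XML = """
<papers>
  <paper><author>Rossi</author><venue>VLDB</venue><year>2003</year></paper>
  <paper><author>Rossi</author><venue>VLDB</venue><year>2004</year></paper>
  <paper><author>Rossi</author><venue>SIGMOD</venue><year>2004</year></paper>
  <paper><author>Bianchi</author><venue>VLDB</venue><year>2004</year></paper>
</papers>
"""

def transactions(xml_text, record_tag):
    """Turn each record element into a set of (tag, value) items."""
    root = ET.fromstring(xml_text)
    return [
        frozenset((child.tag, (child.text or "").strip()) for child in rec)
        for rec in root.iter(record_tag)
    ]

def mine_itemsets(txns, min_support, max_size=2):
    """Brute-force count of itemsets up to max_size; keep the frequent ones."""
    counts = Counter()
    for t in txns:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                counts[frozenset(combo)] += 1
    n = len(txns)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def derive_rules(itemsets, min_conf):
    """Derive one-item -> one-item rules from the frequent pairs."""
    rules = []
    for pair, support in itemsets.items():
        if len(pair) != 2:
            continue
        for body in pair:
            (head,) = pair - {body}
            confidence = support / itemsets[frozenset([body])]
            if confidence >= min_conf:
                rules.append((body, head, support, confidence))
    return rules

def answer(rules, body_item):
    """Answer 'what co-occurs with body_item?' using only the summary."""
    return [(h, s, c) for b, h, s, c in rules if b == body_item]

summary = mine_itemsets(transactions(XML, "paper"), min_support=0.5)
rule_list = derive_rules(summary, min_conf=0.6)
result = answer(rule_list, ("author", "Rossi"))
for head, support, confidence in result:
    print(head, round(support, 2), round(confidence, 2))
```

With these thresholds the summary answers a query about Rossi with the frequent venue and year only; the single SIGMOD paper falls below the support threshold and is lost, which illustrates why answers computed from such a summary may be partial or approximate.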