The availability of summary data for XML documents has many applications, from providing users with quick feedback about their queries to cost-based storage design and query optimization. StatiX is a novel XML Schema-aware statistics framework that exploits the structure derived from the regular expressions that define elements in an XML Schema to pinpoint places in the schema that are likely sources of structural skew. As we discuss below, this information can be used to build concise yet accurate statistical summaries for XML data. StatiX leverages standard XML technology for gathering statistics, notably XML Schema validators, and it uses histograms to summarize both the structure and the values in an XML document. In this paper, we describe the StatiX system. We develop algorithms that decompose schemas to obtain statistics at different granularities, and we discuss how statistics can be gathered as documents are validated. We also present an experimental evaluation that demonstrates the accuracy and scalability of our approach, and we show an application of these statistics to cost-based XML storage design.