The VLDB Journal — The International Journal on Very Large Data Bases
We tackle the difficult problem of summarizing the path/branching structure and value content of an XML database that comprises both numeric and textual values. We introduce a novel XML-summarization model, termed XCLUSTERs, that enables accurate selectivity estimates for the class of twig queries with numeric-range, substring, and textual IR predicates over the content of XML elements. In a nutshell, an XCLUSTER synopsis represents an effective clustering of XML elements based on both their structural and value-based characteristics. By leveraging techniques for summarizing XML-document structure as well as numeric and textual data distributions, our XCLUSTER model provides the first known unified framework for handling path/branching structure and different types of element values. We detail the XCLUSTER model, and develop a systematic framework for the construction of effective XCLUSTER summaries within a specified storage budget. Experimental results on synthetic and real-life data verify the effectiveness of our XCLUSTER synopses, clearly demonstrating their ability to accurately summarize XML databases with mixed-value content. To the best of our knowledge, ours is the first work to address the summarization problem for structured XML content in its full generality.
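To make the idea of estimating selectivity from a clustered summary more concrete, the following is a minimal toy sketch and not the XCLUSTER construction or estimation algorithm itself. It assumes a synopsis in which each node stands for a cluster of same-label elements, each edge stores an average child count, and numeric content is summarized by an equi-width histogram. The names `Histogram`, `Cluster`, and `estimate_path` are hypothetical and introduced only for illustration; substring and textual IR predicates are not covered.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Histogram:
    """Equi-width histogram over numeric values (hypothetical helper)."""
    lo: float
    hi: float
    counts: List[int]

    def fraction_in_range(self, a: float, b: float) -> float:
        """Estimate the fraction of values in [a, b], assuming values are
        spread uniformly inside each bucket."""
        total = sum(self.counts)
        if total == 0 or b < a:
            return 0.0
        width = (self.hi - self.lo) / len(self.counts)
        covered = 0.0
        for i, c in enumerate(self.counts):
            b_lo = self.lo + i * width
            b_hi = b_lo + width
            overlap = max(0.0, min(b, b_hi) - max(a, b_lo))
            covered += c * overlap / width
        return covered / total


@dataclass
class Cluster:
    """One synopsis node: a group of elements sharing a label (assumption)."""
    label: str
    count: int
    # child label -> (child cluster, average number of such children per element)
    children: Dict[str, Tuple["Cluster", float]] = field(default_factory=dict)
    hist: Optional[Histogram] = None  # summary of numeric content, if any


def estimate_path(root: Cluster, labels: List[str],
                  value_range: Optional[Tuple[float, float]] = None) -> float:
    """Estimate how many elements a simple child-axis path reaches,
    optionally filtered by a numeric range predicate on the last step."""
    node = root
    estimate = float(root.count)
    for label in labels:
        if label not in node.children:
            return 0.0
        node, avg_children = node.children[label]
        estimate *= avg_children
    if value_range is not None:
        if node.hist is None:
            return 0.0
        estimate *= node.hist.fraction_in_range(*value_range)
    return estimate


# Toy data: 1 catalog, 100 books, one price per book, prices in [0, 100).
price = Cluster("price", 100, hist=Histogram(0.0, 100.0, [10, 20, 30, 40]))
book = Cluster("book", 100, children={"price": (price, 1.0)})
catalog = Cluster("catalog", 1, children={"book": (book, 100.0)})

# Estimated result size of /catalog/book/price with 25 <= price <= 50.
result = estimate_path(catalog, ["book", "price"], (25.0, 50.0))
print(result)
```

In this sketch, summary size is governed by the number of clusters and histogram buckets, which loosely mirrors the trade-off between accuracy and a storage budget described above. How elements are actually grouped into clusters, and how textual content is summarized, is left to the paper.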