We propose a novel partition path-based (PPB) grouping strategy that stores compressed XML data in a stream of blocks. On top of the compressed data we build a minimal indexing scheme called block statistic signature (BSS), a simple but effective technique for evaluating selection and aggregate XPath queries over the compressed data. We present a formal analysis and an empirical study of these techniques. We then extend BSS indexing into cluster statistic signature (CSS) and multiple-cluster statistic signature (MSS) indexing by adding further layers of indexes. We analyze how the response time is affected by the parameters of our compression strategy, such as the block size of the data stream, the number of cluster layers, and the query selectivity. We also gain further insight into compression and querying performance by studying the optimal block size in a stream, that is, the block size that minimizes the processing cost of queries. The cost model analysis provides a solid foundation for predicting querying performance. Finally, we demonstrate that our PPB grouping and indexing strategies are efficient enough to support path-based selection and aggregate queries over compressed XML data, and that they require relatively low computation time and storage space compared with other state-of-the-art compression strategies.
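To make the idea of a per-block signature concrete, here is a minimal sketch of how block-level statistics might let a query skip decompression. It rests on assumptions that the abstract does not state: each signature is taken to hold the minimum, maximum, sum, and count of the numeric values in its block, and blocks are compressed with zlib over a JSON encoding. The names BlockSignature, build_blocks, select_range, and total_sum are hypothetical and are not taken from the paper.

```python
import json
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockSignature:
    # Assumed signature contents; the abstract does not specify them.
    minimum: float
    maximum: float
    total: float
    count: int


def build_blocks(values: List[float], block_size: int) -> Tuple[List[bytes], List[BlockSignature]]:
    """Split values into fixed-size blocks, compress each block, and record its signature."""
    blocks, signatures = [], []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(zlib.compress(json.dumps(chunk).encode("utf-8")))
        signatures.append(BlockSignature(min(chunk), max(chunk), sum(chunk), len(chunk)))
    return blocks, signatures


def select_range(blocks: List[bytes], signatures: List[BlockSignature],
                 lo: float, hi: float) -> Tuple[List[float], int]:
    """Return the values in [lo, hi] and the number of blocks that had to be decompressed."""
    hits, decompressed = [], 0
    for blob, sig in zip(blocks, signatures):
        # The signature proves that no value in this block can match, so skip it.
        if sig.maximum < lo or sig.minimum > hi:
            continue
        decompressed += 1
        chunk = json.loads(zlib.decompress(blob).decode("utf-8"))
        hits.extend(v for v in chunk if lo <= v <= hi)
    return hits, decompressed


def total_sum(signatures: List[BlockSignature]) -> float:
    """Answer a SUM aggregate from the signatures alone, without decompressing any block."""
    return sum(sig.total for sig in signatures)


if __name__ == "__main__":
    blocks, signatures = build_blocks([1, 2, 3, 10, 11, 12, 20, 21, 22], block_size=3)
    print(select_range(blocks, signatures, 10, 12))
    print(total_sum(signatures))
```

In this sketch the block size controls a trade-off of the kind the abstract analyzes: smaller blocks allow finer-grained skipping but require more signatures to store and scan, while larger blocks do the opposite.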