XML is fast becoming the standard format for storing, exchanging, and publishing data over the web, and is increasingly embedded in applications. Two challenges in handling XML are its size (the XML representation of a document is significantly larger than its native state) and the complexity of its search (XML search involves path and content searches on labeled tree structures). We address the basic problems of compression, navigation, and searching of XML documents. In particular, we adopt recently proposed theoretical algorithms [11] for succinct tree representations to design and implement a compressed index for XML, called XBZIPiNDEX, in which the XML document is maintained in a highly compressed format, and both navigation and searching can be done by decompressing only a tiny fraction of the data. This solution relies on compressing and indexing two arrays derived from the XML data. Through detailed experiments, we compare XBZIPiNDEX with other compressed XML indexing and searching engines and show that its compression ratio is up to 35% better than that achievable by those tools, and that its time performance on some path and content search operations is orders of magnitude faster: a few milliseconds over hundreds of MBs of XML files versus tens of seconds, on standard XML data sources.