Lempel-Ziv Compression of Structured Text

Authors:
Joaquín Adiego;Gonzalo Navarro;Pablo de la Fuente
Venue:
DCC '04 Proceedings of the Conference on Data Compression
Year:
2004

Citing 0
Cited 8

Revisiting dictionary-based compression: Research Articles
Software—Practice & Experience

Compressing and searching XML data via two zips
Proceedings of the 15th international conference on World Wide Web

Using structural contexts to compress semistructured text collections
Information Processing and Management: an International Journal

Partial retrieval of compressed semi-structured documents
International Journal of Computer Applications in Technology

Searchable compression of office documents by XML schema subtraction
XSym'10 Proceedings of the 7th international XML database conference on Database and XML technologies

Updates on grammar-compressed XML data
BNCOD'11 Proceedings of the 28th British national conference on Advances in databases

Data management for mobile Ajax web 2.0 applications
DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications

Fast multi-update operations on compressed XML data
BNCOD'13 Proceedings of the 29th British National conference on Big Data

Abstract

We describe a novel Lempel-Ziv approach suitable for compressing structured documents, called LZCS, which takes advantage of redundant information that can appear in the structure. The main idea is that frequently repeated subtrees may exist and these can be replaced by a backward reference to their first occurrence. The main advantage is that compressed documents generated by LZCS are easy to display, access at random, and navigate. In a second stage, processed documents can be further compressed using some semiadaptive technique, so that random access and navigability remain possible. LZCS is especially efficient at compressing collections of highly structured data, such as XML forms, invoices, e-commerce and web-service exchange documents. The comparison against structure-based and standard compressors shows that LZCS is a competitive choice for this type of document, while the others are not well suited to support navigation or random access.
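
The core idea of replacing a repeated subtree with a backward reference to its first occurrence can be illustrated with a minimal Python sketch. This is not the authors' LZCS implementation: the token format, the function names `subtree_key`, `compress` and `expand`, and the sample document are all assumptions made for illustration, and the sketch ignores tail text, the second-stage semiadaptive coding, and any efficiency concerns.

```python
import xml.etree.ElementTree as ET


def subtree_key(elem):
    """Hashable canonical form of a subtree (hypothetical helper).

    Covers tag, attributes, stripped text and children; tail text is ignored.
    """
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(subtree_key(child) for child in elem),
    )


def compress(root):
    """Flatten a tree into tokens; a repeated subtree becomes ("ref", index)."""
    out = []
    first_seen = {}  # subtree key -> index in `out` of its first "open" token

    def visit(elem):
        key = subtree_key(elem)
        if key in first_seen:
            # Backward reference to the first occurrence of this subtree.
            out.append(("ref", first_seen[key]))
            return
        first_seen[key] = len(out)
        out.append(("open", elem.tag, key[1], key[2]))
        for child in elem:
            visit(child)
        out.append(("close", elem.tag))

    visit(root)
    return out


def expand(tokens):
    """Resolve every backward reference, returning a reference-free stream."""
    out = []

    def copy_from(i):
        """Copy the subtree starting at tokens[i]; return the index after it."""
        tok = tokens[i]
        if tok[0] == "ref":
            copy_from(tok[1])  # follow the reference, then continue after it
            return i + 1
        out.append(tok)  # "open" token
        i += 1
        while tokens[i][0] != "close":
            i = copy_from(i)
        out.append(tokens[i])  # matching "close" token
        return i + 1

    copy_from(0)
    return out


# Illustrative document with one repeated <item> subtree.
doc = (
    "<orders>"
    "<item><sku>A1</sku><qty>2</qty></item>"
    "<item><sku>A1</sku><qty>2</qty></item>"
    "</orders>"
)
tokens = compress(ET.fromstring(doc))
expanded = expand(tokens)
```

In this sketch the second `<item>` is stored as a single `("ref", 1)` token pointing at the position where the first `<item>` opens. Because a reference is just a position in the already-emitted stream, a reader can follow it directly, which reflects the kind of navigability the abstract describes.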