Information retrieval: data structures and algorithms
Data compression in full-text retrieval systems
Journal of the American Society for Information Science
Data compression: the complete reference
Introduction to data compression
Managing Gigabytes: Compressing and Indexing Documents and Images
Analysis of Algorithms and Data Structures
Word-Based Compression Methods and Indexing for Text Retrieval Systems
ADBIS '99 Proceedings of the Third East European Conference on Advances in Databases and Information Systems
Word-Based Compression Methods for Large Text Documents
DCC '99 Proceedings of the Conference on Data Compression
Combining Structural and Textual Contexts for Compressing Semistructured Databases
ENC '05 Proceedings of the Sixth Mexican International Conference on Computer Science
Compression of small text files
Advanced Engineering Informatics
Natural Language Compression on Edge-Guided text preprocessing
Information Sciences: an International Journal
Mapping words into codewords on PPM
SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval
In this article we present a new compression method, called WLZW, which is a word-based modification of the classic LZW algorithm. The modification is similar to the approach used in the HuffWord compression algorithm. The algorithm works in two phases, and the compression ratio it achieves is fairly good, about 20%-22% on average (see [2],[3]). Moreover, the table of words, which is a by-product of compression, can be used to build a full-text index, especially for dynamic text databases. The overhead of the index is low.
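
The idea of a word-based, two-phase LZW can be illustrated with a minimal Python sketch. This is not the authors' WLZW implementation: the tokenization rule (alternating runs of word and non-word characters), the sorted word table, and the function names `tokenize`, `wlzw_compress`, and `wlzw_decompress` are all assumptions made for illustration, and the output codes are left as plain integers rather than being packed into bits.

```python
import re


def tokenize(text):
    # Assumption: a "word" is a maximal run of word characters, and the
    # separators between words are kept as tokens so the text is lossless.
    return re.findall(r"\w+|\W+", text)


def wlzw_compress(text):
    tokens = tokenize(text)

    # Phase 1: collect the table of distinct words (the LZW alphabet).
    words = sorted(set(tokens))
    word_id = {w: i for i, w in enumerate(words)}
    symbols = [word_id[t] for t in tokens]

    # Phase 2: classic LZW, but over word ids instead of characters.
    dictionary = {(i,): i for i in range(len(words))}
    codes = []
    phrase = ()
    for s in symbols:
        candidate = phrase + (s,)
        if candidate in dictionary:
            phrase = candidate
        else:
            codes.append(dictionary[phrase])
            dictionary[candidate] = len(dictionary)
            phrase = (s,)
    if phrase:
        codes.append(dictionary[phrase])
    return words, codes


def wlzw_decompress(words, codes):
    if not codes:
        return ""
    dictionary = {i: (i,) for i in range(len(words))}
    previous = dictionary[codes[0]]
    out = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # Code refers to the phrase being defined right now.
            entry = previous + (previous[0],)
        out.extend(entry)
        dictionary[len(dictionary)] = previous + (entry[0],)
        previous = entry
    return "".join(words[s] for s in out)


if __name__ == "__main__":
    sample = "to be or not to be or not to be"
    table, packed = wlzw_compress(sample)
    print(len(tokenize(sample)), "tokens ->", len(packed), "codes")
    assert wlzw_decompress(table, packed) == sample
```

In this sketch the `words` list returned by the first phase plays the role of the word table mentioned above: because it is a sorted list of the distinct tokens, it could in principle double as the lexicon of a full-text index, although how WLZW actually builds its index is not shown here.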