Journal of Algorithms
A locally adaptive data compression scheme
Communications of the ACM
Software—Practice & Experience
Fast text searching: allowing errors
Communications of the ACM
Fast and flexible word searching on compressed text
ACM Transactions on Information Systems (TOIS)
A fast string searching algorithm
Communications of the ACM
Information Retrieval: Computational and Theoretical Aspects
Information Retrieval: Computational and Theoretical Aspects
Compression and Coding Algorithms
Compression and Coding Algorithms
Flexible pattern matching in strings: practical on-line search algorithms for texts and biological sequences
Factor Oracle: A New Structure for Pattern Matching
SOFSEM '99 Proceedings of the 26th Conference on Current Trends in Theory and Practice of Informatics
Efficiently decodable and searchable natural language adaptive compression
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
LZgrep: a Boyer–Moore string matching tool for Ziv–Lempel compressed text
Software—Practice & Experience
Lightweight natural language text compression
Information Retrieval
Enhanced byte codes with restricted prefix properties
SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
Dynamic lightweight text compression
ACM Transactions on Information Systems (TOIS)
Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections
Proceedings of the VLDB Endowment
ODC: Frame for definition of Dense codes
European Journal of Combinatorics
Semistatic byte-oriented word-based compression codes have been shown to be an attractive alternative for compressing natural language text databases because of the combination of speed, effectiveness, and direct searchability they offer. In particular, our recently proposed family of dense compression codes has been shown to be superior to the more traditional byte-oriented word-based Huffman codes in most aspects. In this paper, we focus on the problem of transmitting texts among peers that do not share the vocabulary, which is the typical scenario for adaptive compression methods. We design adaptive variants of our semistatic dense codes and show that they are much simpler and faster than dynamic Huffman codes while reaching almost the same compression effectiveness. We also show that our variants offer a very compelling trade-off among compression and decompression speed, compression ratio, and search speed compared with most state-of-the-art general-purpose compressors. Copyright © 2008 John Wiley & Sons, Ltd. A preliminary partial version of this work appeared in [1].
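The abstract does not say how a dense code assigns codewords, so the following is only a minimal sketch of one well-known member of the family, an End-Tagged Dense Code style mapping, and is not taken from the paper itself. It assumes words are already ranked by decreasing frequency (rank 0 is the most frequent) and that the high bit of a byte marks the last byte of a codeword. The function names `etdc_encode` and `etdc_decode` are hypothetical and chosen for illustration.

```python
def etdc_encode(rank: int) -> bytes:
    """Map a frequency rank (0 = most frequent word) to a dense codeword.

    Assumption: bytes 128..255 terminate a codeword, bytes 0..127 continue it.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    out = [128 + rank % 128]      # last byte carries the end tag (high bit set)
    x = rank // 128
    while x > 0:
        x -= 1                    # dense: shorter codewords are not reused as prefixes
        out.append(x % 128)       # continuation byte, high bit clear
        x //= 128
    return bytes(reversed(out))


def etdc_decode(code: bytes) -> int:
    """Inverse of etdc_encode for a single, complete codeword."""
    acc = 0
    for b in code[:-1]:           # continuation bytes
        acc = (acc + b + 1) * 128
    return acc + (code[-1] - 128) # strip the end tag from the final byte
```

Under these assumptions the 128 most frequent words receive one-byte codewords, the next 128 × 128 words receive two-byte codewords, and so on. Because the codeword depends only on the rank, an adaptive variant of the kind the abstract describes would only need sender and receiver to keep their frequency-ordered vocabularies in step; how the paper achieves that is not stated here.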