Data compression with finite windows
Communications of the ACM
Software—Practice & Experience
Text compression
A very fast substring search algorithm
Communications of the ACM
Text algorithms
Let sleeping files lie: pattern matching in Z-compressed files
Journal of Computer and System Sciences
A text compression scheme that allows fast searching directly in the compressed file
ACM Transactions on Information Systems (TOIS)
String matching in the DNA alphabet
Software—Practice & Experience
Fast and flexible word searching on compressed text
ACM Transactions on Information Systems (TOIS)
A fast string searching algorithm
Communications of the ACM
Efficient string matching: an aid to bibliographic search
Communications of the ACM
Fast and flexible string matching by combining bit-parallelism and suffix automata
Journal of Experimental Algorithmics (JEA)
Flexible pattern matching in strings: practical on-line search algorithms for texts and biological sequences
Adding Compression to Block Addressing Inverted Indexes
Information Retrieval
Efficient Algorithms for Lempel-Zip Encoding (Extended Abstract)
SWAT '96 Proceedings of the 5th Scandinavian Workshop on Algorithm Theory
A String Matching Algorithm Fast on the Average
Proceedings of the 6th Colloquium, on Automata, Languages and Programming
Efficient Experimental String Matching by Weak Factor Recognition
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
A New Regular Grammar Pattern Matching Algorithm
ESA '96 Proceedings of the Fourth Annual European Symposium on Algorithms
A New Compression Method for Compressed Matching
DCC '00 Proceedings of the Conference on Data Compression
A Unifying Framework for Compressed Pattern Matching
SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware
Multiple Pattern Matching in LZW Compressed Text
DCC '98 Proceedings of the Conference on Data Compression
Pattern Matching in Huffman Encoded Texts
DCC '01 Proceedings of the Data Compression Conference
Faster Approximate String Matching over Compressed Text
DCC '01 Proceedings of the Data Compression Conference
Regular expression searching on compressed text
Journal of Discrete Algorithms
Approximate string matching on Ziv-Lempel compressed text
Journal of Discrete Algorithms
Tuning string matching for huge pattern sets
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Compressed text indexes: From theory to practice
Journal of Experimental Algorithmics (JEA)
New adaptive compressors for natural language text
Software—Practice & Experience
Manipulating lossless video in the compressed domain
MM '09 Proceedings of the 17th ACM international conference on Multimedia
Dynamic lightweight text compression
ACM Transactions on Information Systems (TOIS)
DBISP2P'05/06 Proceedings of the 2005/2006 international conference on Databases, information systems, and peer-to-peer computing
Fast matching method for DNA sequences
ESCAPE'07 Proceedings of the First international conference on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies
Compressing IP forwarding tables: towards entropy bounds and beyond
Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
Hi-index | 0.00 |
We present a Boyer–Moore (BM) approach to string matching over LZ78 and LZW compressed text. The idea is to search the text directly in compressed form instead of decompressing and then searching it. We modify the BM approach so as to skip text using the characters explicitly represented in the LZ78/LZW formats, modifying the basic technique where the algorithm can choose which characters to inspect. We present and compare several solutions for single and multipattern searches. We show that our algorithms obtain speedups of up to 50% compared to the simple decompress-then-search approach. Finally, we present a public tool, LZgrep, which uses our algorithms to offer grep-like capabilities directly searching files compressed using Unix's Compress, a LZW compressor. LZgrep can also search files compressed with Unix gzip, using new decompress-then-search techniques we develop, which are faster than the current tools. This way, users can always keep their files in compressed form and still search them, uncompressing only when they want to see them. Copyright © 2005 John Wiley & Sons, Ltd.