Compressed Index for Dictionary Matching

Authors:
Wing-Kai Hon;Tak-Wah Lam;Rahul Shah;Siu-Lung Tam;Jeffrey Scott Vitter
Venue:
DCC '08 Proceedings of the Data Compression Conference
Year:
2008

Abstract

The past few years have witnessed several exciting results on compressed representations of a string T that support efficient pattern matching; the space complexity has been reduced to |T|H_k(T) + o(|T| log s) bits, where H_k(T) denotes the kth-order empirical entropy of T and s is the size of the alphabet. In this paper, we study compressed representations for another classical string indexing problem, known in the literature as dictionary matching. Precisely, a collection D of strings (called patterns) of total length n is to be indexed so that, given a text T, all occurrences of the patterns in T can be found efficiently. We show how to exploit a sampling technique to compress the existing O(n)-word index to an (nH_k(D) + o(n log s))-bit index with only a small sacrifice in search time.
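
To make the problem statement concrete, the following is a minimal Python sketch of dictionary matching using a textbook Aho-Corasick automaton, one classical example of an uncompressed index that takes O(n) words. It is not the compressed index of the paper and does not use the sampling technique; the function names `build_automaton` and `find_occurrences` are illustrative assumptions, and patterns are assumed to be non-empty.

```python
from collections import deque


def build_automaton(patterns):
    """Build a trie with failure links over the patterns (illustrative helper)."""
    goto = [{}]   # goto[v] maps a character to a child node; node 0 is the root
    fail = [0]    # fail[v] is the node for the longest proper suffix of v in the trie
    out = [[]]    # out[v] lists indices of patterns that end at node v
    for idx, pat in enumerate(patterns):
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(idx)

    # Breadth-first traversal: failure links of shallower nodes are set first.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target (proper suffixes).
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_occurrences(patterns, text):
    """Return (start_position, pattern) for every occurrence of a pattern in text."""
    goto, fail, out = build_automaton(patterns)
    node = 0
    hits = []
    for i, ch in enumerate(text):
        # Follow failure links until some node has a transition on ch.
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for idx in out[node]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits


if __name__ == "__main__":
    D = ["he", "she", "his", "hers"]   # the dictionary; total length n = 12
    print(sorted(find_occurrences(D, "ushers")))
```

Here the dictionary D is preprocessed once and each text T is then scanned in a single pass, which is the query model described in the abstract. The automaton stores explicit pointers per node, which is the kind of O(n)-word space cost that a compressed index aims to reduce.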