Compressed Text Indexes with Fast Locate

Authors:
Rodrigo González;Gonzalo Navarro
Affiliations:
Dept. of Computer Science, University of Chile;Dept. of Computer Science, University of Chile
Venue:
CPM '07 Proceedings of the 18th annual symposium on Combinatorial Pattern Matching
Year:
2007

Abstract

Compressed text (self-)indexes have matured to the point where they can replace a text with a data structure that requires less space and, in addition to giving access to arbitrary text passages, support indexed text searches. These indexes are now competitive with traditional text indexes (which are very large) for counting the number of occurrences of a pattern in the text. Yet they are still hundreds to thousands of times slower when it comes to locating those occurrences in the text. In this paper we introduce a new compression scheme for suffix arrays that permits locating the occurrences extremely fast while still being much smaller than classical indexes. In addition, our index permits a very efficient secondary memory implementation, where compression reduces the amount of I/O needed to answer queries.
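
To make the distinction between counting and locating concrete, the following is a minimal sketch on a plain, uncompressed suffix array. It is not the compression scheme introduced in the paper; the function names (`build_suffix_array`, `sa_range`, `count`, `locate`) are hypothetical and chosen only for illustration. Counting needs just the two boundaries of the suffix array range, whereas locating must read every suffix array entry in that range. In compressed self-indexes those entries are typically not stored explicitly, which is where the extra cost of locating arises.

```python
def build_suffix_array(text):
    # Naive construction by sorting all suffixes; adequate for a small
    # illustration, far too slow for large texts.
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_range(text, sa, pattern):
    # Return the half-open range [start, end) of suffix array positions
    # whose suffixes begin with `pattern`, using two binary searches.
    m = len(pattern)

    # First position whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First position whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def count(text, sa, pattern):
    # Counting only needs the size of the range.
    start, end = sa_range(text, sa, pattern)
    return end - start


def locate(text, sa, pattern):
    # Locating must read each suffix array entry in the range.
    start, end = sa_range(text, sa, pattern)
    return sorted(sa[start:end])


if __name__ == "__main__":
    text = "abracadabra"
    sa = build_suffix_array(text)
    print(count(text, sa, "abra"))
    print(locate(text, sa, "abra"))
```

In this sketch `locate` is cheap because `sa` is held in full; the trade-off discussed in the abstract appears once the array is stored in compressed form and each entry has to be recovered on demand.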