We present a novel implementation of compressed suffix arrays exhibiting new tradeoffs between search time and space occupancy for a given text (or sequence) of n symbols over an alphabet σ, where each symbol is encoded by lg |σ| bits. We show that compressed suffix arrays use just nH_h bits plus lower-order terms, while retaining full text indexing functionalities, such as searching any pattern sequence of length m in O(m lg |σ| + polylog(n)) time. The term H_h ≤ lg |σ| denotes the hth-order empirical entropy of the text, which means that our index is nearly optimal in space apart from lower-order terms, achieving asymptotically the empirical entropy of the text (with a multiplicative constant 1). If the text is highly compressible so that H_h = o(1) and the alphabet size is small, we obtain a text index with o(m) search time that requires only o(n) bits. Further results and tradeoffs are reported in the paper.
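To make the space bound concrete, the following is a minimal sketch of how the hth-order empirical entropy of a text can be computed. It assumes the usual definition: the zeroth-order entropies of the symbols that follow each length-h context, weighted by how often that context occurs and divided by n. The function names `zeroth_order_entropy` and `empirical_entropy` are illustrative and do not come from the paper, and the code only measures the entropy; it does not build a compressed suffix array.

```python
import math
from collections import Counter, defaultdict


def zeroth_order_entropy(counts):
    """H_0 in bits per symbol for a multiset of symbols given as a Counter."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def empirical_entropy(text, h):
    """hth-order empirical entropy of `text`, in bits per symbol.

    Assumed definition: (1/n) * sum over contexts w of length h of
    |T_w| * H_0(T_w), where T_w collects the symbols that follow w in the text.
    """
    n = len(text)
    if n == 0:
        return 0.0
    if h == 0:
        return zeroth_order_entropy(Counter(text))
    followers = defaultdict(Counter)
    for i in range(h, n):
        followers[text[i - h:i]][text[i]] += 1
    weighted = sum(sum(c.values()) * zeroth_order_entropy(c)
                   for c in followers.values())
    return weighted / n


if __name__ == "__main__":
    text = "mississippi"
    sigma = len(set(text))
    print("lg|sigma| =", math.log2(sigma))
    for h in range(3):
        print("H_%d =" % h, empirical_entropy(text, h))
```

On a highly repetitive text such as "abababab", each symbol is fully determined by the one before it, so the first-order entropy drops to zero while the zeroth-order entropy stays at one bit per symbol. This is the regime in which the abstract's o(n)-bit index applies.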