Full-text indexes provide fast substring search over large text collections. Their main drawback has traditionally been space consumption. A recent trend is to develop indexes that exploit the compressibility of the text, so that their size is a function of the compressed text length. This concept has evolved into self-indexes, which additionally contain enough information to reproduce any text portion, and can therefore replace the text. The prospect of an index that takes space close to that of the compressed text, replaces it, and also provides fast search over it has triggered a wealth of activity and produced surprising results in a very short time, radically changing the status of this area in less than 5 years. The most successful indexes today obtain almost optimal space and search time simultaneously. In this article we present the main concepts underlying (compressed) self-indexes. We explain the relationship between text entropy and the regularities that show up in index structures and make them compressible. We then cover the most relevant self-indexes, focusing on how they exploit text compressibility to achieve compact structures that can efficiently solve various search problems. Our aim is to give the background needed to understand and follow the developments in this area.
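
To make the notion of a full-text index concrete, the following is a minimal sketch of a plain (uncompressed) suffix array with binary-search substring lookup. It is an illustration only, not one of the compressed self-indexes the article covers: it keeps the original text and one integer per text position, which is exactly the kind of space consumption that compressed indexes aim to reduce. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive sort-based construction is chosen for brevity, not efficiency.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction (assumption for brevity): sorts by materialized suffixes,
    which is far slower than dedicated suffix-sorting algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of `pattern` in `text` using suffix array `sa`."""
    m = len(pattern)

    # First binary search: leftmost suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second binary search: leftmost suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


text = "abracadabra"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "abra"))  # occurrences of "abra" in the text
```

Each lookup performs two binary searches over the sorted suffixes, so the number of string comparisons grows logarithmically with the text length. The cost is that the array of positions sits alongside the text itself, whereas a self-index would replace both.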