We address the issue of compressing and indexing data. We devise a data structure whose space occupancy is a function of the entropy of the underlying data set. We call the data structure opportunistic because its space occupancy decreases when the input is compressible, and this space reduction is achieved with no significant slowdown in query performance. More precisely, its space occupancy is optimal in an information-content sense because a text $T[1,u]$ is stored using $O(H_k(T)) + o(1)$ bits per input symbol in the worst case, where $H_k(T)$ is the $k$th-order empirical entropy of $T$ (the bound holds for any fixed $k$). Given an arbitrary string $P[1,p]$, the opportunistic data structure supports searching for the occurrences of $P$ in $T$ in $O(p + \mathrm{occ} \log^{\epsilon} u)$ time (for any fixed $\epsilon$
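To make the quantity $H_k(T)$ in the space bound concrete, the following is a minimal sketch of how the $k$th-order empirical entropy can be computed, assuming the commonly used definition: group the symbols of $T$ by the $k$ symbols that precede them, take the zeroth-order entropy of each group, and average the results weighted by group size over the text length. The function names `h0` and `empirical_entropy` are hypothetical and chosen only for this illustration, and the choice of preceding (rather than following) context is an assumption.

```python
from collections import Counter, defaultdict
from math import log2


def h0(counts):
    """Zeroth-order empirical entropy, in bits per symbol, of a symbol-count table."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    # Each symbol with count c contributes (c/n) * log2(n/c).
    return sum(c / n * log2(n / c) for c in counts.values())


def empirical_entropy(text, k):
    """k-th order empirical entropy of `text`, in bits per symbol.

    Assumption: the context of a symbol is the k symbols preceding it.
    """
    if not text:
        return 0.0
    if k == 0:
        return h0(Counter(text))
    # followers[w] counts the symbols that appear right after context w.
    followers = defaultdict(Counter)
    for i in range(k, len(text)):
        followers[text[i - k:i]][text[i]] += 1
    # Sum of |T_w| * H_0(T_w) over all contexts w, normalised by the text length.
    total = sum(sum(c.values()) * h0(c) for c in followers.values())
    return total / len(text)


if __name__ == "__main__":
    sample = "abababab"
    print(empirical_entropy(sample, 0))
    print(empirical_entropy(sample, 1))
```

For a highly regular text such as `"abababab"`, the zeroth-order entropy is one bit per symbol while the first-order entropy is zero, because each symbol is fully determined by its predecessor. This is the kind of compressibility that an entropy-dependent space bound can exploit.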