We present a new index for approximate string matching. The index collects text q-samples, i.e., disjoint text substrings of length q taken at fixed intervals, and stores their positions. At search time, part of the text is filtered out by observing that any occurrence of the pattern must be reflected in the presence of some text q-samples that match approximately inside the pattern. We show experimentally that the parameterization mechanism of the related filtration scheme provides a compromise between the space requirement of the index and the error level for which the filtration is still efficient.
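The idea of indexing q-samples and filtering at search time can be illustrated with a minimal sketch. It is not the scheme of the paper: for simplicity it uses the classical exact-match variant of q-sample filtering, in which at least one sample inside any occurrence must appear unaltered in the pattern, rather than matching samples approximately. The function names, the sampling step `h`, and the bound `h * (k + 1) <= m - k - q + 1` (which guarantees at least k + 1 disjoint samples inside any occurrence with at most k errors) are assumptions made for this illustration.

```python
from collections import defaultdict


def build_q_sample_index(text, q, h):
    """Map each sampled q-gram to the text positions where it was sampled.

    Samples start at positions 0, h, 2h, ...; h >= q keeps them disjoint.
    """
    if h < q:
        raise ValueError("h must be at least q so that samples are disjoint")
    index = defaultdict(list)
    for pos in range(0, len(text) - q + 1, h):
        index[text[pos:pos + q]].append(pos)
    return index


def candidate_areas(pattern, k, index, q, h, text_len):
    """Return text areas (start, end) that may contain an occurrence.

    Assumed filter condition: with h * (k + 1) <= m - k - q + 1, every
    occurrence contains at least k + 1 samples, and k errors can alter
    at most k of them, so one sample occurs exactly in the pattern.
    """
    m = len(pattern)
    if h * (k + 1) > m - k - q + 1:
        raise ValueError("sampling step h is too large for this m, k and q")
    areas = set()
    for i in range(m - q + 1):
        for pos in index.get(pattern[i:i + q], ()):
            # Sample at text position pos aligned with pattern offset i;
            # widen by k on both sides to allow for insertions/deletions.
            start = max(0, pos - i - k)
            end = min(text_len, pos - i + m + k)
            areas.add((start, end))
    return sorted(areas)


def approx_ends(pattern, area_text, k):
    """Sellers' dynamic programming: end offsets of matches within k errors."""
    m = len(pattern)
    col = list(range(m + 1))  # column for the empty text prefix
    ends = []
    for j, c in enumerate(area_text, start=1):
        prev_diag = col[0]  # col[0] stays 0: a match may start anywhere
        for i in range(1, m + 1):
            cur = col[i]
            col[i] = min(prev_diag + (pattern[i - 1] != c),  # match/substitute
                         cur + 1,                             # extra text char
                         col[i - 1] + 1)                      # missing text char
            prev_diag = cur
        if col[m] <= k:
            ends.append(j)
    return ends


def search(text, pattern, k, q, h):
    """Filter with the q-sample index, then verify only the candidate areas."""
    index = build_q_sample_index(text, q, h)
    ends = set()
    for start, end in candidate_areas(pattern, k, index, q, h, len(text)):
        for j in approx_ends(pattern, text[start:end], k):
            ends.add(start + j)
    return sorted(ends)
```

In this sketch a larger step `h` means fewer stored samples and therefore a smaller index, but the bound on `h` shrinks as the number of allowed errors `k` grows. That is one simple way to see the space versus error-level trade-off the abstract refers to. A call such as `search("the quick brown fox jumps", "brown fax", k=1, q=2, h=3)` returns the end positions (exclusive) of all matches with at most one error.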