An efficient algorithm for generating super condensed neighborhoods

Authors:
Luís M. S. Russo; Arlindo L. Oliveira
Affiliations:
IST / INESC-ID, Lisboa, Portugal; IST / INESC-ID, Lisboa, Portugal
Venue:
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
Year:
2005



Abstract

Indexing methods for the approximate string matching problem spend considerable effort generating condensed neighborhoods. Here, we point out that condensed neighborhoods are not a minimal representation of a pattern neighborhood, and we show that attention can be restricted to super condensed neighborhoods, which are minimal. We then present an algorithm for generating super condensed neighborhoods. The algorithm runs in O(m⌈m/w⌉s) time, where m is the pattern size, s is the size of the super condensed neighborhood, and w is the size of the processor word. Previous algorithms took O(m⌈m/w⌉c) time, where c is the size of the condensed neighborhood. We further improve this algorithm by using bit-parallelism and increased bit-parallelism techniques. Our experimental results show that the resulting algorithm is very fast.
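
To make the gap between the sizes c and s concrete, the following is a minimal brute-force sketch, not the algorithm of the paper. It assumes definitions the abstract does not spell out: the k-neighborhood of a pattern is the set of strings within edit distance k of it, the condensed neighborhood keeps only the strings with no proper prefix in the neighborhood, and the super condensed neighborhood keeps only the strings with no proper substring in the neighborhood. The function names are hypothetical, and the enumeration is exponential in the pattern length, so it is only usable on toy inputs.

```python
from itertools import product


def edit_distance(a, b):
    """Plain dynamic-programming Levenshtein distance (no bit-parallelism)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def neighborhood(pattern, k, alphabet):
    """All strings over `alphabet` within edit distance k of `pattern`.

    Brute force: only lengths m-k .. m+k can qualify, so enumerate those.
    """
    m = len(pattern)
    found = set()
    for n in range(max(0, m - k), m + k + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if edit_distance(pattern, s) <= k:
                found.add(s)
    return found


def condensed(nbhd):
    """Assumed definition: drop strings that have a proper prefix in nbhd."""
    return {s for s in nbhd
            if not any(s[:i] in nbhd for i in range(len(s)))}


def super_condensed(nbhd):
    """Assumed definition: drop strings that have a proper substring in nbhd."""
    def proper_substrings(s):
        n = len(s)
        return (s[i:j] for i in range(n) for j in range(i, n + 1) if j - i < n)
    return {s for s in nbhd
            if not any(t in nbhd for t in proper_substrings(s))}
```

Under these assumptions, calling `neighborhood("abc", 1, "abc")` gives 19 strings, `condensed` reduces them to 9, and `super_condensed` reduces them to the 3 strings `ab`, `ac`, and `bc`. Even on this tiny input s is smaller than c, which is the quantity that separates the two time bounds quoted in the abstract.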