The approximate string matching problem is to find all locations at which a query of length m matches a substring of a text of length n with k-or-fewer differences. Simple and practical bit-vector algorithms have been designed for this problem, most notably the one used in agrep. These algorithms compute a bit representation of the current state-set of the k-difference automaton for the query, and asymptotically run in either O(nmk/w) or O(nm log σ/w) time, where w is the word size of the machine (e.g., 32 or 64 in practice) and σ is the size of the pattern alphabet. Here we present an algorithm of comparable simplicity that requires only O(nm/w) time by virtue of computing a bit representation of the relocatable dynamic programming matrix for the problem. Thus, the algorithm's performance is independent of k, and it is found to be more efficient than the previous results for many choices of k and small m. Moreover, because the algorithm is not dependent on k, it can be used to rapidly compute blocks of the dynamic programming matrix as in the 4-Russians algorithm of Wu et al. (1996). This gives rise to an O(kn/w) expected-time algorithm for the case where m may be arbitrarily large. In practice this new algorithm, which computes a region of the dynamic programming (d.p.) matrix w entries at a time using the basic algorithm as a subroutine, is significantly faster than our previous 4-Russians algorithm, which computes the same region 4 or 5 entries at a time using table lookup. This performance improvement yields a code that is either superior to or competitive with all existing algorithms except for some filtration algorithms that are superior when k/m is sufficiently small.
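As an illustration of the idea of encoding a d.p. column in bit-vectors, the following is a minimal Python sketch of the single-block case. It is not taken from this abstract: the function name `myers_search` and the variable names are assumptions chosen for illustration, and the update formulas follow the form in which this technique is commonly presented. The vectors `pv` and `mv` mark the rows where the vertical difference between adjacent column entries is +1 or -1, and `score` tracks the bottom entry of the column. Python integers are unbounded, so the sketch masks every vector to m bits in place of a fixed machine word of w bits.

```python
def myers_search(pattern, text, k):
    """Return the 0-based end positions in text where pattern matches
    some substring with at most k differences (edit distance)."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # peq[c] has bit i set when pattern[i] == c.
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    mask = (1 << m) - 1      # keeps vectors to m bits (stand-in for a word)
    high = 1 << (m - 1)      # bit for the last row of the column

    pv = mask                # initial column is 0, 1, ..., m: all deltas +1
    mv = 0
    score = m                # value of the bottom entry of the column
    ends = []

    for j, c in enumerate(text):
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = mv | (~(xh | pv) & mask)    # rows with horizontal delta +1
        mh = pv & xh                     # rows with horizontal delta -1

        # Update the bottom entry from the horizontal delta of the last row.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # Shift in 0: row 0 is all zeros because a match may start anywhere.
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = mh | (~(xv | ph) & mask)
        mv = ph & xv

        if score <= k:
            ends.append(j)
    return ends
```

For example, `myers_search("abc", "xabcx", 1)` returns `[2, 3, 4]`: the prefixes ending at those positions each contain a substring within one difference of `"abc"`. Each text character costs a constant number of bit-vector operations regardless of k, which is the property the abstract relies on; a pattern longer than a machine word would need one such block per w rows.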