Approximate string matching by combining automaton approach and binary neural networks

  • Authors:
  • Tomáš Beran;Miroslav Skrbek;Tomáš Macek

  • Affiliations:
  • Czech Technical University, Prague, Czech Republic;Czech Technical University, Prague, Czech Republic;IBM Czech Republic s.r.o., Chodov, Czech Republic

  • Venue:
  • ASC '07 Proceedings of The Eleventh IASTED International Conference on Artificial Intelligence and Soft Computing
  • Year:
  • 2007

Abstract

This article describes an approximate string matching method based on Correlation Matrix Memories (CMMs). As the measure of similarity we use the Damerau-Levenshtein edit distance, which is well suited to typing errors. CMMs are a type of binary neural network. They are capable of both exact and approximate matching (based on the Hamming distance). While the substitution operation can be handled by the standard CMM recall method, the other edit operations (insertion, deletion and transposition) require an enhanced recall method. We incorporate a simple automaton for each of these operations into the recall process. The proposed method preserves the main advantage of this type of neural network: its simplicity. To keep both the simplicity and the recall speed, we focus primarily on approximate matching that allows a single error. In addition to the edit distance problem, we propose two methods that speed up the CMM recall process.
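
To make the idea of threshold-based CMM recall concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a one-hot (position, letter) input encoding, a lowercase a-z alphabet, a maximum word length of 8, and an explicit equal-length check; the names `CMM`, `encode`, `store` and `recall` are hypothetical. It shows why lowering the recall threshold by one tolerates a substitution but not a transposition, which is the gap the automaton-based enhancement in the abstract is meant to close.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed alphabet
MAX_LEN = 8                              # assumed maximum word length


def encode(word):
    """Return the active input lines: one line per (position, letter) pair."""
    if len(word) > MAX_LEN:
        raise ValueError("word longer than MAX_LEN")
    return [pos * len(ALPHABET) + ALPHABET.index(ch)
            for pos, ch in enumerate(word)]


class CMM:
    """Toy binary correlation matrix memory (hypothetical sketch)."""

    def __init__(self):
        self.words = []
        # One row per input line; each row is a bitmask over stored words.
        self.rows = [0] * (MAX_LEN * len(ALPHABET))

    def store(self, word):
        col = len(self.words)          # output line assigned to this word
        self.words.append(word)
        for line in encode(word):
            self.rows[line] |= 1 << col  # binary weights are set by OR

    def recall(self, query, max_substitutions=0):
        # Sum the binary weights of the active input lines for each word.
        scores = [0] * len(self.words)
        for line in encode(query):
            row = self.rows[line]
            for col in range(len(self.words)):
                scores[col] += (row >> col) & 1
        # Threshold the sums; each substitution lowers the score by one.
        threshold = len(query) - max_substitutions
        return [w for w, s in zip(self.words, scores)
                if len(w) == len(query) and s >= threshold]


memory = CMM()
for w in ("house", "mouse", "horse"):
    memory.store(w)

exact = memory.recall("house")                 # exact match only
one_sub = memory.recall("hause", 1)            # one substituted letter
transposed = memory.recall("huose", 1)         # adjacent letters swapped
```

In this sketch a transposition changes two positions at once, so it falls below the single-substitution threshold even though its Damerau-Levenshtein distance is one. Insertions and deletions shift every later letter and are missed for the same reason.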