Enhancing trie-based syntactic pattern recognition using AI heuristic search strategies

  • Authors:
  • Ghada Badr; B. John Oommen

  • Affiliations:
  • Ph.D. student, School of Computer Science, Carleton University, Ottawa, Canada; Fellow of the IEEE, School of Computer Science, Carleton University, Ottawa, Canada

  • Venue:
  • ICAPR'05 Proceedings of the Third International Conference on Advances in Pattern Recognition - Volume Part I
  • Year:
  • 2005

Abstract

This paper [5] addresses the problem of estimating, using enhanced AI techniques, a transmitted string X* by processing the corresponding string Y, which is a noisy version of X*. We assume that Y contains substitution, insertion and deletion errors, and that X* is an element of a finite (possibly large) dictionary, H. The best estimate X+ of X* is defined as the element of H that minimizes the Generalized Levenshtein Distance D(X, Y) between X and Y over all X ∈ H. We show how D(X, Y) can be evaluated for every X ∈ H simultaneously when the edit distances are general, the maximum number of errors is not given a priori, and H is stored as a trie. We first introduce a new scheme, Clustered Beam Search (CBS), a heuristic-based search approach that enhances the well-known Beam Search (BS) technique [33] from Artificial Intelligence (AI). CBS builds on BS with respect to pruning time. The new technique is compared, with respect to time and accuracy, with the Depth First Search (DFS) trie-based technique [36] using large and small dictionaries. The results demonstrate a marked improvement (up to 75%) in the total number of operations needed on three benchmark dictionaries, while yielding an accuracy comparable to the optimal. Experiments are also conducted to show the benefits of CBS over BS when the search is done on the trie. The results also demonstrate a marked improvement (more than 91%) for large dictionaries.
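
The idea of evaluating D(X, Y) for every X ∈ H at once on a trie can be illustrated with a small sketch. Words that share a prefix share the rows of the edit-distance table computed for that prefix, so each trie node needs only one new row. The code below is an illustration under stated assumptions, not the authors' implementation: the names `TrieNode`, `build_trie`, `next_row`, `best_match_dfs` and `best_match_beam` are hypothetical, the costs default to unit costs (any non-negative cost functions could be passed instead), and the pruning rule (keep the `beam_width` nodes with the smallest row minimum at each trie level) is an assumed heuristic for a plain beam search. The clustering step that distinguishes CBS is not described in the abstract and is not reproduced here.

```python
from typing import Callable, Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set when a dictionary word ends here


def build_trie(words: List[str]) -> TrieNode:
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.word = w
    return root


# Assumed default costs: unit cost for every edit operation.
def unit_sub(a: str, b: str) -> float:
    return 0.0 if a == b else 1.0


def unit_indel(a: str) -> float:
    return 1.0


def first_row(y: str, ins: Callable[[str], float]) -> List[float]:
    # Distance from the empty prefix to y[:j]: insert the first j symbols of y.
    row = [0.0]
    for c in y:
        row.append(row[-1] + ins(c))
    return row


def next_row(prev: List[float], ch: str, y: str, sub, ins, dele) -> List[float]:
    # row[j] = distance between the trie prefix ending in ch and y[:j].
    row = [prev[0] + dele(ch)]
    for j in range(1, len(y) + 1):
        row.append(min(prev[j - 1] + sub(ch, y[j - 1]),  # substitution or match
                       row[j - 1] + ins(y[j - 1]),       # insertion
                       prev[j] + dele(ch)))              # deletion
    return row


def best_match_dfs(root: TrieNode, y: str, sub=unit_sub, ins=unit_indel,
                   dele=unit_indel) -> Tuple[float, Optional[str]]:
    # Exhaustive depth-first traversal: exact minimum over the dictionary.
    best: Tuple[float, Optional[str]] = (float("inf"), None)
    stack = [(root, first_row(y, ins))]
    while stack:
        node, row = stack.pop()
        if node.word is not None and row[-1] < best[0]:
            best = (row[-1], node.word)
        for ch, child in node.children.items():
            stack.append((child, next_row(row, ch, y, sub, ins, dele)))
    return best


def best_match_beam(root: TrieNode, y: str, beam_width: int, sub=unit_sub,
                    ins=unit_indel, dele=unit_indel) -> Tuple[float, Optional[str]]:
    # Level-by-level search that keeps only the beam_width most promising nodes.
    # Pruned subtrees are never visited, so the result may be suboptimal.
    best: Tuple[float, Optional[str]] = (float("inf"), None)
    level = [(root, first_row(y, ins))]
    while level:
        candidates = []
        for node, row in level:
            if node.word is not None and row[-1] < best[0]:
                best = (row[-1], node.word)
            for ch, child in node.children.items():
                candidates.append((child, next_row(row, ch, y, sub, ins, dele)))
        # Assumed heuristic: with non-negative costs, min(row) is a lower
        # bound on the distance of every word below the node.
        candidates.sort(key=lambda item: min(item[1]))
        level = candidates[:beam_width]
    return best


trie = build_trie(["cat", "cart", "care", "dog", "dot"])
print(best_match_dfs(trie, "cxrt"))
print(best_match_beam(trie, "cxrt", beam_width=2))
```

In this toy dictionary both searches return "cart" at distance 1 for the noisy string "cxrt". The beam variant computes fewer rows because it discards the "dog"/"dot" branch early, which is the kind of operation saving, traded against a possible loss of optimality, that the abstract reports.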