Approximate String Matching in LDAP Based on Edit Distance

  • Authors:
  • Chi-Chien Pan; Kai-Hsiang Yang; Tzao-Lin Lee

  • Venue:
  • IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
  • Year:
  • 2002

Abstract

As e-commerce grows rapidly, data search has become necessary in almost every application. Approximate string matching plays an important role in searching with errors. "Edit distance" and "Soundex" are two common techniques for this problem; the latter is a "sound-alike" method and has already been applied to the LDAP server. However, Soundex is inadequate in certain situations: for symbol matching (as in DNA sequences), a sound-based method does not make sense. Edit distance, in contrast, has a clear definition and is widely used in many application fields. Because the LDAP server is designed to be optimized for reading, applying the edit distance technique to it raises the problem of reduced speed. In this paper we design efficient data structures and an algorithm to solve the speed problem, and we use three filter conditions [1] based on the n-gram technique to achieve good filtering performance. Finally, we demonstrate experimentally both the benefits of our algorithm and its limitations.
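
To make the two ideas named in the abstract concrete, here is a minimal Python sketch of edit distance combined with an n-gram pre-filter. It is not the paper's algorithm or data structure: the abstract does not spell out its three filter conditions, so the sketch assumes two well-known ones (a length check and a shared n-gram count bound), and every function name and the padding characters `#` and `$` are hypothetical choices made for illustration.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute), two-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ngrams(s: str, n: int) -> Counter:
    """Multiset of n-grams of s, padded so that s yields len(s) + n - 1 grams.

    Assumption: '#' and '$' do not occur in the data being matched.
    """
    padded = "#" * (n - 1) + s + "$" * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def passes_filters(a: str, b: str, k: int, n: int = 2) -> bool:
    """Cheap necessary conditions for edit_distance(a, b) <= k.

    Assumed filters (not taken from the paper): lengths differ by at most k,
    and the strings share at least max(|a|, |b|) - 1 - (k - 1) * n n-grams.
    """
    if abs(len(a) - len(b)) > k:
        return False
    shared = sum((ngrams(a, n) & ngrams(b, n)).values())
    return shared >= max(len(a), len(b)) - 1 - (k - 1) * n


def approximate_search(query: str, candidates: list[str], k: int, n: int = 2) -> list[str]:
    """Return candidates within edit distance k of query, filtering first."""
    return [c for c in candidates
            if passes_filters(query, c, k, n) and edit_distance(query, c) <= k]


if __name__ == "__main__":
    names = ["smith", "smyth", "smithe", "jones"]
    print(approximate_search("smith", names, k=1))
```

The filters can only reject pairs that are certainly farther apart than `k`, so the full dynamic-programming computation runs only on the surviving candidates; this is the general motivation for n-gram filtering, whatever concrete conditions a given system uses.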