Searching patterns in digital image databases

  • Authors: Fei Shi; Ahmad AlShibli
  • Affiliations: Computer Science Department, Suffolk University, Boston, MA; Computer Science Department, Suffolk University, Boston, MA
  • Venue: ASIAN'05 Proceedings of the 10th Asian Computing Science conference on Advances in computer science: data management on the web
  • Year: 2005

Abstract

We present a method for the multiple two-dimensional pattern matching problem and its application in image database systems. In this problem, we are given a set S = {T_1, T_2, ..., T_N} of two-dimensional matrices and another two-dimensional matrix P, called the pattern, and we want to find all occurrences of P in S. The main idea behind our method is to represent two-dimensional matrices as one-dimensional strings (called fingerprint strings, or simply fingerprints), thus reducing the two-dimensional matrix matching problem to a one-dimensional string matching problem. We use a data structure called the generalized suffix array as our index structure to organize the fingerprints of the set S. Constructing the index (including converting the matrices in S into fingerprint strings) takes O(M log n) time, and the index occupies O(M) space, where M denotes the total number of elements in all matrices in S and n the width of the widest matrix in S. Once the index is available, a query for the occurrences of an m × m pattern in S can be answered in O(m^2 + log M) time. The reduction of the two-dimensional matrix problem to a one-dimensional string problem, however, can introduce errors, called false matches. A false match occurs if the algorithm claims a "match" between the pattern P and some submatrix of some matrix in S when the two are actually not equal. As will be seen, however, the probability of a false match is negligible. For instance, suppose our patterns are 512 × 512 images. Then the probability that a "match" claimed by our algorithm is a false one is less than 2.39 × 10^-7.
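The fingerprint idea in the abstract can be illustrated with a small Python sketch. It is not the authors' construction: it assumes the pattern size m is known when the index is built, uses a polynomial hash with an assumed base and modulus as the fingerprint, and stands in a sorted list of length-m fingerprint substrings for the generalized suffix array. All function names (column_fingerprints, build_index, search) are hypothetical. Because unequal column segments could in principle hash to the same value, a reported position may be a false match in the sense described above.

```python
from bisect import bisect_left, bisect_right

MOD = (1 << 61) - 1   # assumed modulus for the fingerprints
BASE = 257            # assumed base; element values are taken to lie in 0..255


def column_fingerprints(mat, top, m):
    """Hash each column segment mat[top..top+m-1][j] to one integer."""
    fps = []
    for j in range(len(mat[0])):
        h = 0
        for i in range(top, top + m):
            h = (h * BASE + mat[i][j] + 1) % MOD
        fps.append(h)
    return fps


def build_index(matrices, m):
    """Sort every length-m fingerprint substring with its position.

    Each entry is (fingerprint substring, matrix id, top row, left column).
    This sorted list is a simplified stand-in for a generalized suffix array.
    """
    entries = []
    for t, mat in enumerate(matrices):
        rows, cols = len(mat), len(mat[0])
        for top in range(rows - m + 1):
            fps = column_fingerprints(mat, top, m)
            for left in range(cols - m + 1):
                entries.append((tuple(fps[left:left + m]), t, top, left))
    entries.sort()
    keys = [e[0] for e in entries]
    return keys, entries


def search(index, pattern):
    """Return (matrix id, top row, left column) for every claimed match."""
    keys, entries = index
    m = len(pattern)
    key = tuple(column_fingerprints(pattern, 0, m))
    lo = bisect_left(keys, key)    # binary search over the sorted fingerprints
    hi = bisect_right(keys, key)
    return [entries[k][1:] for k in range(lo, hi)]


T1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
T2 = [[5, 6, 0], [8, 9, 0], [5, 6, 1]]
P = [[5, 6], [8, 9]]
index = build_index([T1, T2], m=2)
matches = search(index, P)
```

In this sketch a query hashes the m × m pattern once and then binary-searches the sorted fingerprints, which mirrors the shape of the query described in the abstract. The index-building step here is deliberately naive and does not achieve the O(M log n) construction time or O(M) space stated there.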