Archival image indexing with connectivity features using randomized masks

  • Authors:
  • Arindam Biswas;Partha Bhowmick;Bhargab B. Bhattacharya

  • Affiliations:
  • Computer Science and Technology Department, Bengal Engineering and Science University, Howrah 711 103, India;Computer Science and Technology Department, Bengal Engineering and Science University, Howrah 711 103, India;Center for Soft Computing Research, Indian Statistical Institute, Kolkata 700 108, India

  • Venue:
  • Applied Soft Computing
  • Year:
  • 2008

Abstract

Connectivity properties capture a natural spatial feature of a binary image. Although they are easy to compute, they alone often fail to characterize an image well, because several different images may have the same connectivity features. Typically, other types of parameters, e.g., moments, are used to augment the connectivity features for efficient indexing and retrieval. In this work, an alternative approach is proposed. Instead of considering other diverse features, which are computationally intensive, only the connectivity features of the image are used, applied iteratively through a novel concept of spatial masking. For this purpose, a greedy algorithm is proposed for constructing a variable-length spatial feature vector for a binary image. The algorithm is based on XOR-ing the image bit-plane with a few pseudo-random synthetic masks, and its novelty lies in computing the feature vector iteratively, depending on the size and diversity of the image database. The classical Euler number and the two primary connectivity features from which it is derived, namely the number of connected components and the number of holes, are finally used to generate a unique feature vector for each binary image in the database, using a fuzzy membership function customized for the given database. The method is particularly suitable for large image archives of a digital library, where each image contains one or more objects. It is found to converge within only three iterations for a postal stamp database of 2598 images, and also for a logo database of 1034 images. A data structure called a discrimination tree is introduced to support efficient storage and indexing of the images using the above feature vector.
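
The core idea of the abstract (connectivity features recomputed after XOR-ing the image with pseudo-random masks) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the uniformly random masks, the 8-connectivity for foreground and 4-connectivity for background, and the seed-based mask generation are all assumptions made for illustration. The greedy iteration, the fuzzy membership function, and the discrimination tree described in the abstract are not reproduced here.

```python
import random
from collections import deque

# Neighbor offsets: 4-connectivity and 8-connectivity (assumed conventions).
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(grid, value, neighbors):
    """Count connected regions of cells equal to `value` using BFS."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions


def connectivity_features(image):
    """Return (components, holes, Euler number) of a binary image."""
    cols = len(image[0])
    # Pad with background so the outer background forms a single region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in image]
              + [[0] * (cols + 2)])
    components = count_regions(padded, 1, N8)
    holes = count_regions(padded, 0, N4) - 1  # exclude the outer background
    return components, holes, components - holes


def random_mask(rows, cols, seed):
    """Hypothetical mask generator: uniform random bits from a fixed seed."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]


def masked_features(image, seeds):
    """Features of the image, followed by features of each XOR-masked copy."""
    rows, cols = len(image), len(image[0])
    features = list(connectivity_features(image))
    for seed in seeds:
        mask = random_mask(rows, cols, seed)
        xored = [[p ^ m for p, m in zip(img_row, mask_row)]
                 for img_row, mask_row in zip(image, mask)]
        features.extend(connectivity_features(xored))
    return features


ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(connectivity_features(ring))  # (1, 1, 0): one component, one hole
```

In this sketch each additional mask appends three more values to the vector, which mirrors the variable-length feature vector mentioned in the abstract: two images with identical connectivity features may still differ once the same mask is applied to both.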