Using extended feature objects for partial similarity retrieval

  • Authors:
  • Stefan Berchtold; Daniel A. Keim; Hans-Peter Kriegel

  • Affiliations:
  • Institute for Computer Science, University of Munich, Oettingenstr. 67, D-80538 Munich, Germany/ {berchtol,keim,kriegel}@dbs.informatik.uni-muenchen.de

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 1997

Abstract

In this paper, we introduce the concept of extended feature objects for similarity retrieval. Conventional approaches to similarity search in databases map each object in the database to a point in some high-dimensional feature space and define similarity as a distance measure in this space. For many similarity search problems, this feature-based approach is not sufficient. When retrieving partially similar polygons, for example, the search cannot be restricted to edge sequences, since similar polygon sections may start and end anywhere on the edges of the polygons. In general, inherently continuous problems such as partial similarity search cannot be solved using point objects in feature space. In our solution, we therefore introduce extended feature objects, each consisting of an infinite set of feature points. For efficient storage and retrieval of the extended feature objects, we determine the minimal bounding boxes of the feature objects in multidimensional space and store these boxes using a spatial access structure. In our concrete polygon problem, sets of polygon sections are mapped to 2D feature objects in high-dimensional space, which are then approximated by minimal bounding boxes and stored in an R*-tree. The selectivity of the index is improved by adaptively decomposing very large feature objects and dynamically joining small feature objects. For the polygon problem, translation, rotation, and scaling invariance are achieved by using the Fourier-transformed curvature of the normalized polygon sections. In contrast to vertex-based algorithms, our algorithm guarantees that no false dismissals occur and additionally provides fast search times for realistic database sizes. We evaluate our method using real polygon data from a supplier to the car manufacturing industry.
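
The filter step described in the abstract (approximate each extended feature object by a minimal bounding box, then search the boxes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the function names, the use of a finite sample of feature points in place of the infinite point set, the max-norm query box of half-width `eps`, and the linear scan that stands in for the R*-tree. A box computed from samples alone need not enclose the whole continuous object, so this sketch does not by itself give the no-false-dismissal guarantee.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]
Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def bounding_box(points: Sequence[Point]) -> Box:
    """Smallest axis-aligned box containing the given (sampled) feature points."""
    if not points:
        raise ValueError("need at least one feature point")
    dims = len(points[0])
    lower = [min(p[d] for p in points) for d in range(dims)]
    upper = [max(p[d] for p in points) for d in range(dims)]
    return lower, upper


def query_box(query: Point, eps: float) -> Box:
    """Box of half-width eps around the query point (hypothetical max-norm range)."""
    return [x - eps for x in query], [x + eps for x in query]


def boxes_intersect(a: Box, b: Box) -> bool:
    """Two axis-aligned boxes intersect iff their intervals overlap in every dimension."""
    return all(
        a_lo <= b_hi and b_lo <= a_hi
        for a_lo, a_hi, b_lo, b_hi in zip(a[0], a[1], b[0], b[1])
    )


def candidates(boxes: Dict[str, Box], query: Point, eps: float) -> List[str]:
    """Filter step: names of all objects whose box intersects the query box.

    A linear scan is used here; a spatial index would replace this loop.
    """
    qb = query_box(query, eps)
    return sorted(name for name, box in boxes.items() if boxes_intersect(box, qb))
```

Objects returned by `candidates` would still need an exact comparison against the query (a refinement step), since a bounding box generally covers more of the feature space than the object it approximates.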