The Gauss-Tree: Efficient Object Identification in Databases of Probabilistic Feature Vectors

  • Authors:
  • Christian Bohm;Alexey Pryakhin;Matthias Schubert

  • Affiliations:
  • University of Munich, Germany;University of Munich, Germany;University of Munich, Germany

  • Venue:
  • ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
  • Year:
  • 2006

Quantified Score

Hi-index 0.01

Visualization

Abstract

In applications of biometric databases the typical task is to identify individuals according to features which are not exactly known. Reasons for this inexactness are varying measuring techniques or environmental circumstances. Since these circumstances are not necessarily the same when determining the features for different individuals, the exactness might strongly vary between the individuals as well as between the features. To identify individuals, similarity search on feature vectors is applicable, but even the use of adaptable distance measures is not capable to handle objects having an individual level of exactness. Therefore, we develop a comprehensive probabilistic theory in which uncertain observations are modeled by probabilistic feature vectors (pfv), i.e. feature vectors where the conventional feature values are replaced by Gaussian probability distribution functions. Each feature value of each object is complemented by a variance value indicating its uncertainty. We define two types of identification queries, k-mostlikely identification and threshold identification. For efficient query processing, we propose a novel index structure, the Gauss-tree. Our experimental evaluation demonstrates that pfv stored in a Gauss-tree significantly improve the result quality compared to traditional feature vectors. Additionally, we show that the Gauss-tree significantly speeds up query times compared to competitive methods.