Similarity estimation using Bayes ensembles

  • Authors:
  • Tobias Emrich; Franz Graf; Hans-Peter Kriegel; Matthias Schubert; Marisa Thoma

  • Affiliations:
  • Ludwig-Maximilians-Universität München, Munich, Germany (all authors)

  • Venue:
  • SSDBM'10: Proceedings of the 22nd International Conference on Scientific and Statistical Database Management
  • Year:
  • 2010

Abstract

Similarity search and data mining often rely on distance or similarity functions to provide meaningful results and semantically meaningful patterns. However, standard distance measures like Lp-norms are often unable to accurately reflect the expected similarity between two objects. To bridge the so-called semantic gap between feature representation and object similarity, the distance function has to be adjusted to the current application context or user. In this paper, we propose a new probabilistic framework for estimating a similarity value based on a Bayesian setting. In our framework, distance comparisons are modeled by distribution functions on the difference vectors. To combine these functions, a similarity score is computed by an ensemble of weak Bayesian learners, one for each dimension of the feature space. To find independent dimensions of maximum meaning, we apply a space transformation based on eigenvalue decomposition. In our experiments, we demonstrate that our new method shows promising results compared to related Mahalanobis learners on several test data sets with respect to nearest-neighbor classification and precision-recall graphs.
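The pipeline outlined in the abstract (a space transformation, per-dimension distributions on difference vectors, and an ensemble of weak Bayesian learners) can be illustrated with a minimal Python sketch. It is not the authors' implementation; the abstract does not give the details, so several choices here are assumptions: the eigenvalue decomposition is applied to the plain data covariance matrix, pairs with equal class labels are treated as "similar", each difference component is modeled as a zero-mean Gaussian, and the per-dimension posteriors are combined by a simple average. The names `fit_bayes_ensemble` and `similarity` are hypothetical.

```python
import numpy as np


def fit_bayes_ensemble(X, y):
    """Fit one weak Bayesian learner per (rotated) dimension.

    Assumes X has shape (n, d), y holds class labels, and both
    same-class and different-class pairs occur in the data.
    """
    # Eigenvalue decomposition of the covariance matrix (assumed choice)
    # yields decorrelated axes.
    cov = np.cov(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    Z = X @ eigvecs

    # All pairwise difference vectors (i < j) in the rotated space.
    i, j = np.triu_indices(len(X), k=1)
    diffs = Z[i] - Z[j]
    same = y[i] == y[j]

    # Assumed model: zero-mean Gaussian per dimension and per pair type.
    sigma_sim = np.sqrt(np.mean(diffs[same] ** 2, axis=0)) + 1e-12
    sigma_dis = np.sqrt(np.mean(diffs[~same] ** 2, axis=0)) + 1e-12
    prior_sim = same.mean()
    return eigvecs, sigma_sim, sigma_dis, prior_sim


def similarity(a, b, model):
    """Average per-dimension posterior P(similar | difference component)."""
    eigvecs, sigma_sim, sigma_dis, prior_sim = model
    d = (np.asarray(a) - np.asarray(b)) @ eigvecs

    # Log posterior odds of "similar" vs. "dissimilar" for each dimension,
    # derived from Bayes' rule with the two zero-mean Gaussians.
    log_odds = (
        np.log(prior_sim / (1.0 - prior_sim))
        + np.log(sigma_dis / sigma_sim)
        - 0.5 * d ** 2 * (1.0 / sigma_sim ** 2 - 1.0 / sigma_dis ** 2)
    )
    log_odds = np.clip(log_odds, -50.0, 50.0)  # avoid overflow in exp
    posterior = 1.0 / (1.0 + np.exp(-log_odds))

    # Assumed ensemble rule: unweighted mean over all dimensions.
    return float(posterior.mean())
```

A model fitted on labeled vectors with `fit_bayes_ensemble(X, y)` can then score any pair with `similarity(a, b, model)`; the result lies in [0, 1], and larger values indicate a pair that looks more like a same-class pair under the assumed model. Other density models or weighted combination rules would slot into the same structure.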