Density-based similarity measures for content based search

  • Authors:
  • Reid Porter; Christy Ruggiero; Don Hush

  • Affiliations:
  • Los Alamos National Laboratory, Los Alamos, NM; Los Alamos National Laboratory, Los Alamos, NM; Los Alamos National Laboratory, Los Alamos, NM

  • Venue:
  • Asilomar'09: Proceedings of the 43rd Asilomar Conference on Signals, Systems and Computers
  • Year:
  • 2009

Abstract

We consider the query-by-multiple-example problem, where the goal is to identify database samples whose content is similar to a collection of query samples. To assess similarity, we use a relative content density, which quantifies the concentration of the query distribution relative to the database distribution. If the database distribution is a mixture of the query distribution and a background distribution, then it can be shown that database samples whose relative content density exceeds a particular threshold ρ are more likely to have been generated by the query distribution than by the background distribution. We describe an algorithm for predicting samples with relative content density greater than ρ that is computationally efficient and possesses strong performance guarantees. We also show empirical results for applications in computer network monitoring and image segmentation.
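
The thresholding idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which predicts high-density samples from data; here both densities are assumed known and one-dimensional Gaussian, purely for illustration. The mixing weight `PI_Q`, the Gaussian parameters, and all function names are hypothetical. Under the stated mixture assumption, with database density p_D = π p_Q + (1 − π) p_B, a sample is more likely to come from the query component than from the background exactly when π p_Q(x) > (1 − π) p_B(x), which rearranges to p_Q(x) / p_D(x) > 1 / (2π). The sketch therefore takes ρ = 1 / (2π), one threshold consistent with the abstract's claim.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical mixing weight: fraction of the database drawn from the query distribution.
PI_Q = 0.2

# Assumed threshold: relative density above RHO <=> posterior of "query" above 0.5.
RHO = 1.0 / (2.0 * PI_Q)


def query_density(x):
    # Assumed query distribution (illustrative parameters).
    return gaussian_pdf(x, 2.0, 0.5)


def background_density(x):
    # Assumed background distribution (illustrative parameters).
    return gaussian_pdf(x, 0.0, 1.0)


def database_density(x):
    # Database distribution as a mixture of query and background.
    return PI_Q * query_density(x) + (1.0 - PI_Q) * background_density(x)


def relative_content_density(x):
    # Concentration of the query distribution relative to the database distribution.
    return query_density(x) / database_density(x)


def posterior_query(x):
    # Probability that x was generated by the query component of the mixture.
    return PI_Q * query_density(x) / database_density(x)


def is_query_like(x):
    # Flag samples whose relative content density exceeds the threshold.
    return relative_content_density(x) > RHO


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0, 3.0):
        print(x, relative_content_density(x), posterior_query(x), is_query_like(x))
```

In this toy setting the threshold test and the posterior test agree by construction. In practice neither density is known, which is why an algorithm that predicts the high-density region directly from samples is needed.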