Normalized kernels as similarity indices

  • Authors: Julien Ah-Pine
  • Affiliations: Xerox Research Centre Europe, Meylan, France
  • Venue: PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
  • Year: 2010

Abstract

Measuring similarity between objects is a fundamental issue for numerous applications in data mining and machine learning. In this paper, we are interested in kernels. We particularly focus on kernel normalization methods that aim at designing proximity measures that better fit the definition and the intuition of a similarity index. To this end, we introduce a new family of normalization techniques which extends the cosine normalization. Our approach aims at refining the cosine measure between vectors in the feature space by considering another geometry-based score, namely the ratio of the mapped vectors' norms. We show that the designed normalized kernels satisfy the basic axioms of a similarity index, unlike most unnormalized kernels. Furthermore, we prove that the proposed normalized kernels are also kernels. Finally, we assess these different similarity measures in the context of clustering tasks by using a kernel PCA based clustering approach. Our experiments on several real-world datasets show the potential benefits of the proposed normalized kernels over the cosine normalization and the Gaussian RBF kernel.
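
The two geometric ingredients named in the abstract can be illustrated with a minimal sketch. It computes the standard cosine normalization of a kernel, k(x, y) / sqrt(k(x, x) k(y, y)), and a ratio of the mapped vectors' norms. The abstract does not state how the paper combines the two scores, so the sketch keeps them separate. The function names, the linear kernel, and the min/max orientation of the ratio are assumptions made for illustration only.

```python
import math

def linear_kernel(x, y):
    # Illustrative kernel choice (assumption): plain dot product.
    return sum(a * b for a, b in zip(x, y))

def cosine_normalized(k, x, y):
    # Standard cosine normalization: cosine of the angle between
    # the mapped vectors in the feature space induced by k.
    return k(x, y) / math.sqrt(k(x, x) * k(y, y))

def norm_ratio(k, x, y):
    # Ratio of the mapped vectors' norms, oriented as smaller over
    # larger so it lies in (0, 1]. This orientation is an assumption.
    nx = math.sqrt(k(x, x))
    ny = math.sqrt(k(y, y))
    return min(nx, ny) / max(nx, ny)

# Two collinear vectors with different lengths.
x = [1.0, 2.0]
y = [2.0, 4.0]

cos_xy = cosine_normalized(linear_kernel, x, y)
ratio_xy = norm_ratio(linear_kernel, x, y)
print(cos_xy, ratio_xy)
```

For these collinear vectors the cosine score is maximal even though the vectors differ, while the norm ratio is about one half. This is one way to see why a cosine-only normalization may be refined by a norm-based score.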