Comparing Large Datasets Structures through Unsupervised Learning

  • Authors:
  • Guénaël Cabanes; Younès Bennani

  • Affiliations:
  • LIPN-CNRS, UMR 7030, Université de Paris 13, Villetaneuse, France 93430; LIPN-CNRS, UMR 7030, Université de Paris 13, Villetaneuse, France 93430

  • Venue:
  • ICONIP '09 Proceedings of the 16th International Conference on Neural Information Processing: Part I
  • Year:
  • 2009

Abstract

In data mining, measuring the similarity between different subsets of a dataset is an important problem that has received little attention so far. This paper proposes a novel method based on unsupervised learning. Each subset is characterized by a model that implicitly corresponds to a set of prototypes, each capturing a different modality of the data. Structural differences between two subsets are then reflected in their corresponding models, and differences between models are detected using a similarity measure based on data density. Experiments on synthetic and real datasets illustrate the effectiveness, efficiency, and insights provided by our approach.