Information distances over clusters

  • Authors:
  • Maxime Houllier;Yuan Luo

  • Affiliations:
  • Computer Science and Engineering Department, Shanghai Jiao Tong University, The MOE-Microsoft Key Laboratory for Intelligence Computing and Intelligence Systems, Shanghai, China;Computer Science and Engineering Department, Shanghai Jiao Tong University, The MOE-Microsoft Key Laboratory for Intelligence Computing and Intelligence Systems, Shanghai, China

  • Venue:
  • ISNN'10 Proceedings of the 7th international conference on Advances in Neural Networks - Volume Part I
  • Year:
  • 2010

Abstract

As databases grow larger, data mining becomes more and more important. To explore and understand the data, the help of computers and of data mining methods such as clustering is a necessity. This paper introduces some information-theoretic distances and shows how they can be used for clustering. More precisely, we want to classify a finite set of discrete random variables, and this classification has to be based on the correlation between these random variables. Because the choice of the notion of distance is crucial in the design of a clustering system, some information distances and classification methods are provided. We also show that, in order to have distances over clusters, the variables that are functions of other variables have to be removed from the starting set. The last part gives some applications run in Matlab.
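The abstract does not state which distances the paper defines, so the following is only an illustrative sketch, in Python rather than the paper's Matlab. It assumes one well-known information distance between discrete random variables, d(X, Y) = 1 - I(X; Y) / H(X, Y), estimated from samples. The function names `entropy` and `information_distance` and the sample data are hypothetical, chosen for this example only.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy, in bits, of a sequence of hashable outcomes."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_distance(xs, ys):
    """Assumed distance d(X, Y) = 1 - I(X; Y) / H(X, Y), a value in [0, 1].

    This is a standard normalized information distance, not necessarily
    one of the distances used in the paper.
    """
    h_xy = entropy(list(zip(xs, ys)))  # joint entropy H(X, Y)
    if h_xy == 0.0:
        return 0.0  # both variables are constant, hence indistinguishable
    mutual_information = entropy(xs) + entropy(ys) - h_xy
    return 1.0 - mutual_information / h_xy


# Hypothetical samples of three discrete random variables.
x = [0, 0, 1, 1, 2, 2, 3, 3]
y = [2 * v for v in x]           # y is a one-to-one function of x
z = [0, 1, 0, 1, 0, 1, 0, 1]     # z is empirically independent of x

d_xy = information_distance(x, y)
d_xz = information_distance(x, z)
print(d_xy, d_xz)
```

Under this assumed distance, a variable that is a one-to-one function of another lies at distance zero from it even though the two variables are distinct, while independent variables lie at the maximum distance. This gives some intuition for why the abstract requires functionally dependent variables to be removed from the starting set before distances over clusters are defined.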