Information distances over clusters

Authors:
Maxime Houllier;Yuan Luo
Affiliations:
Computer Science and Engineering Department, Shanghai Jiao Tong University, The MOE-Microsoft Key Laboratory for Intelligence Computing and Intelligence Systems, Shanghai, China;Computer Science and Engineering Department, Shanghai Jiao Tong University, The MOE-Microsoft Key Laboratory for Intelligence Computing and Intelligence Systems, Shanghai, China
Venue:
ISNN'10 Proceedings of the 7th international conference on Advances in Neural Networks - Volume Part I
Year:
2010

Abstract

As databases grow larger, data mining becomes increasingly important. To explore and understand the data, the help of computers and data mining methods, such as clustering, is a necessity. This paper introduces some information-theory-based distances and shows how they can be used for clustering. More precisely, we want to classify a finite set of discrete random variables. This classification has to be based on the correlation between these random variables. In the design of a clustering system, the choice of the notion of distance is crucial; some information distances and classification methods are provided. We also show that, in order to have distances over clusters, the variables that are functions of other variables have to be removed from the starting set. The last part gives some applications run in Matlab.
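
The abstract does not say which information distances the paper uses. As an illustration only, the sketch below assumes one well-known choice, the variation of information d(X, Y) = H(X|Y) + H(Y|X) = 2 H(X, Y) - H(X) - H(Y), estimated from paired samples with plug-in (empirical frequency) entropies. The function names and toy samples are hypothetical. It is written in Python rather than Matlab, and it is not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in (empirical) Shannon entropy, in bits, of a list of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def variation_of_information(xs, ys):
    """Assumed distance: d(X, Y) = 2 H(X, Y) - H(X) - H(Y), from paired samples."""
    h_joint = entropy(list(zip(xs, ys)))  # joint entropy H(X, Y)
    return 2 * h_joint - entropy(xs) - entropy(ys)


# Toy samples (hypothetical): X is uniform on {0, 1, 2, 3}.
xs = [0, 0, 1, 1, 2, 2, 3, 3]
ys = [x % 2 for x in xs]        # Y is a function of X, but not invertible
ws = [3 - x for x in xs]        # W is an invertible relabelling of X
zs = [0, 1, 0, 1, 0, 1, 0, 1]   # Z is empirically independent of X

d_xy = variation_of_information(xs, ys)  # H(X|Y) only, since H(Y|X) = 0
d_xw = variation_of_information(xs, ws)  # 0: X and W carry the same information
d_xz = variation_of_information(xs, zs)  # H(X) + H(Z) for independent variables
```

Under this assumed distance, two variables that determine each other are at distance zero even though they are distinct elements of the starting set. This is one way to see why variables that are functions of other variables can prevent a quantity from being a true distance, which is the issue the abstract raises for distances over clusters.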