An incremental nested partition method for data clustering

  • Authors:
  • Jyrko Correa-Morris;Dustin L. Espinosa-Isidrón;Denis R. Álvarez-Nadiozhin

  • Affiliations:
  • Mathematic Department, Faculty of Mathematic and Computer Sciences, Havana University, Cuba;Pattern Recognition Department, Advanced Technologies Application Center, Havana, Cuba;Mathematic Department, Faculty of Mathematic and Computer Sciences, Havana University, Cuba

  • Venue:
  • Pattern Recognition
  • Year:
  • 2010

Abstract

Clustering methods are a powerful tool for discovering patterns in a data set by organizing the data into subsets of objects that share common features. Motivated by the independent use of several partition criteria and by a theoretical and empirical analysis of some of their properties, we introduce an incremental nested partition method that combines these criteria to find the inner structure of static and dynamic datasets. To this end, we prove that the partitions obtained from these criteria are nested, and we rigorously study their sensitivity to the arrival of a new object in the dataset. Our algorithm exploits these mathematical properties to obtain the hierarchy of clusterings. Moreover, we carry out a theoretical and experimental comparison of our method with classical hierarchical clustering methods such as single-link and complete-link, as well as with more recently introduced methods. Experimental results on databases from the UCI repository and on the AFP and TDT2 news collections show the usefulness of our method and its ability to reveal different levels of information hidden in datasets.