A comparison of dimensionality reduction methods using topology preservation indexes

  • Authors:
  • Claudio J. F. de Medeiros;José Alfredo Ferreira Costa;Leandro A. Silva

  • Affiliations:
  • Department of Electrical Engineering, Federal University, UFRN, Brazil;Department of Electrical Engineering, Federal University, UFRN, Brazil;School of Computing and Informatics, Mackenzie Presbyterian University, S. Paulo, Brazil

  • Venue:
  • IDEAL'11 Proceedings of the 12th international conference on Intelligent data engineering and automated learning
  • Year:
  • 2011

Abstract

Due to the remarkable technological developments of recent decades, the vast amount of available data has created new opportunities and challenges in the field of knowledge discovery and data mining. Factors such as the size and high dimensionality of databases add difficulty to the already complex task of discovering patterns hidden in masses of data. The feasibility of high-dimensional data exploration depends on techniques known as dimensionality reduction methods. When class labels are available, an optimization function can be used to maximize intra-class cohesion and inter-class separation. However, in many practical situations class information is not available. This paper focuses on unsupervised dimensionality reduction techniques, an important phase in exploratory data analysis. Six important methods are described: Principal Component Analysis, Sammon projection, autoassociative neural networks, Kohonen maps, Isomap, and Locally Linear Embedding. Three quality indexes are proposed to quantify, to some degree, the preservation of topology between input and output spaces. Comparisons are performed using benchmark data sets. The results and tests focus on two-dimensional projections for data visualization purposes.
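
The abstract does not define the three proposed indexes, so the following is only an illustrative sketch of what a topology preservation measure can look like, not the authors' method. It computes a simple k-nearest-neighbour overlap: the average fraction of each point's k nearest neighbours in the input space that are still among its k nearest neighbours after projection. The function names, the toy data, and the coordinate-dropping "projection" are all assumptions made for the example; any of the six methods listed above could supply the low-dimensional points instead.

```python
import math
import random


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbours."""
    result = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to p.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(set(order[:k]))
    return result


def neighborhood_overlap(high, low, k=5):
    """Mean fraction of k-nearest neighbours shared by both spaces (0 to 1)."""
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    shared = sum(len(a & b) for a, b in zip(nn_high, nn_low))
    return shared / (k * len(high))


random.seed(0)
# Hypothetical toy data: 60 points in 3-D whose third coordinate is small noise.
high = [
    (random.random(), random.random(), 0.01 * random.random())
    for _ in range(60)
]
# Stand-in projection (an assumption, not one of the six methods): drop the
# third coordinate to obtain a two-dimensional representation.
low = [(x, y) for x, y, _ in high]

score = neighborhood_overlap(high, low, k=5)
print(f"neighbourhood overlap: {score:.2f}")
```

A score of 1 means every local neighbourhood is preserved exactly, and lower values indicate that the projection has rearranged which points are close to each other. The brute-force neighbour search is quadratic in the number of points, which is acceptable for small benchmark sets but would need a spatial index for larger ones.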