A general framework for dimensionality reduction for large data sets

  • Authors:
  • Barbara Hammer;Michael Biehl;Kerstin Bunte;Bassam Mokbel

  • Affiliations:
  • CITEC centre of excellence, Bielefeld University, Bielefeld - Germany;University of Groningen - Johann Bernoulli Institute for Mathematics and Computer Science, Groningen - The Netherlands;University of Groningen - Johann Bernoulli Institute for Mathematics and Computer Science, Groningen - The Netherlands;CITEC centre of excellence, Bielefeld University, Bielefeld - Germany

  • Venue:
  • WSOM'11 Proceedings of the 8th international conference on Advances in self-organizing maps
  • Year:
  • 2011

Abstract

With electronic data increasing dramatically in almost all areas of research, a plethora of new techniques for automatic dimensionality reduction and data visualization has become available in recent years. These offer an interface which allows humans to rapidly scan through large volumes of data. With data sets becoming larger and larger, however, the standard methods can no longer be applied directly. While random subsampling and prior clustering remain among the most popular solutions in this case, we discuss a principled alternative and formalize the approaches under a general perspective of dimensionality reduction as cost optimization. We also take a first look at the question of whether these techniques can be accompanied by theoretical guarantees.