Manifold denoising as preprocessing for finding natural representations of data

Authors:
Matthias Hein;Markus Maier
Affiliations:
Max Planck Institute for Biological Cybernetics, Tübingen, Germany;Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Venue:
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Year:
2007

Abstract

A natural representation of data is given by the parameters which generated the data. If the space of parameters is continuous, then we can regard it as a manifold. In practice, we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. However, the data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We will demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we will show that, using the denoising method as a preprocessing step, one can significantly improve the results of a semi-supervised learning algorithm.
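
To make the idea of a diffusion process on a data-generated graph concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not necessarily the authors' exact algorithm: the abstract does not specify the graph construction or the update rule, so the symmetric k-nearest-neighbor graph, the Gaussian weights scaled by the distance to the k-th neighbor, the random-walk Laplacian, and the implicit Euler step are all assumed here. The function names and the parameters `k`, `dt`, and `n_steps` are hypothetical.

```python
import numpy as np


def knn_graph_weights(X, k):
    """Symmetric k-NN graph with Gaussian weights (an assumed construction)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for round-off.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbors
    h2 = d2[np.arange(n), idx[:, -1]]  # squared distance to the k-th neighbor
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), idx.ravel()] = True
    mask |= mask.T  # keep an edge if either endpoint selects the other
    # Local bandwidth: the larger of the two k-NN radii (assumed choice).
    scale = np.maximum(h2[:, None], h2[None, :]) + 1e-12
    W = np.zeros((n, n))
    W[mask] = np.exp(-d2[mask] / scale[mask])
    return W


def denoise_step(X, k, dt):
    """One implicit Euler step of graph diffusion: solve (I + dt * L) Y = X."""
    n = X.shape[0]
    W = knn_graph_weights(X, k)
    deg = W.sum(axis=1)
    L = np.eye(n) - W / deg[:, None]  # random-walk Laplacian I - D^{-1} W
    return np.linalg.solve(np.eye(n) + dt * L, X)


def manifold_denoise(X, k=10, dt=0.5, n_steps=5):
    """Diffuse the points for n_steps, rebuilding the graph after each step."""
    X = np.asarray(X, dtype=float).copy()
    for _ in range(n_steps):
        X = denoise_step(X, k, dt)
    return X


if __name__ == "__main__":
    # Toy data: a unit circle embedded in 10 dimensions with isotropic noise.
    rng = np.random.default_rng(0)
    t = rng.uniform(0.0, 2.0 * np.pi, size=300)
    clean = np.zeros((300, 10))
    clean[:, 0], clean[:, 1] = np.cos(t), np.sin(t)
    noisy = clean + 0.1 * rng.normal(size=clean.shape)
    denoised = manifold_denoise(noisy)
    print("off-plane energy before:", np.mean(noisy[:, 2:] ** 2))
    print("off-plane energy after: ", np.mean(denoised[:, 2:] ** 2))
```

In this sketch each diffusion step replaces every point by a weighted average of the points in the graph, so noise components orthogonal to the underlying curve are averaged out. Diffusing for too many steps also shrinks the data along the manifold, so `n_steps` and `dt` would need to be chosen with care in practice.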