Parameter Selection for Principal Curves

  • Authors:
  • G. Biau; A. Fischer

  • Affiliations:
  • Université Pierre et Marie Curie—Paris VI, Paris, France;-

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2012

Abstract

Principal curves are nonlinear generalizations of the notion of the first principal component. Roughly, a principal curve is a parameterized curve in $\mathbb{R}^d$ which passes through the “middle” of a data cloud drawn from some unknown probability distribution. Depending on the definition, a principal curve relies on some unknown parameters (number of segments, length, turn, etc.) which have to be properly chosen to recover the shape of the data without interpolating. In this paper, we consider the principal curve problem from an empirical risk minimization perspective and address the parameter selection issue using the point of view of model selection via penalization. We offer oracle inequalities and implement the proposed approach to recover the hidden structures in both simulated and real-life data.
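
As a rough illustration of the penalized selection idea described in the abstract, the sketch below scores candidate polygonal lines by their empirical risk (mean squared distance from the data to the curve) plus a complexity penalty, and keeps the number of segments k with the smallest score. The function names, the candidate curves, and the penalty c·k/n are all assumptions made for illustration; in particular, the penalty is a placeholder and not the one derived in the paper, and fitting the candidate curves themselves is not shown.

```python
def dist2_to_segment(p, a, b):
    """Squared Euclidean distance from point p to the segment [a, b]."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a == b
    else:
        # Projection parameter, clamped so the projection stays on the segment.
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    proj = [ai + t * c for ai, c in zip(a, ab)]
    return sum((pi - qi) ** 2 for pi, qi in zip(p, proj))


def empirical_risk(points, vertices):
    """Mean squared distance from the points to the polygonal line."""
    segments = list(zip(vertices[:-1], vertices[1:]))
    total = sum(min(dist2_to_segment(p, a, b) for a, b in segments)
                for p in points)
    return total / len(points)


def select_by_penalty(points, candidates, c=1.0):
    """Pick k minimizing empirical risk + c * k / n.

    candidates maps k (number of segments) to a list of k + 1 vertices.
    The penalty c * k / n is a hypothetical placeholder.
    """
    n = len(points)
    scores = {k: empirical_risk(points, v) + c * k / n
              for k, v in candidates.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy data lying on an L-shaped path, with two hand-made candidate curves.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
candidates = {
    1: [(0, 0), (2, 2)],          # one segment: the diagonal
    2: [(0, 0), (2, 0), (2, 2)],  # two segments: follows the L shape
}
best_k, scores = select_by_penalty(points, candidates, c=1.0)
```

With a small penalty constant the two-segment curve wins because it fits the data exactly; increasing c shifts the choice toward the simpler one-segment curve, which is the trade-off between fit and complexity that the penalty is meant to control.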