Principal Curve Algorithms for Partitioning High-Dimensional Data Spaces

  • Authors:
  • Junping Zhang;Xiaodan Wang;U. Kruger;Fei-Yue Wang

  • Affiliations:
  • Shanghai Key Lab. of Intell. Inf. Process., Fudan Univ., Shanghai, China;-;-;-

  • Venue:
  • IEEE Transactions on Neural Networks
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most partitioning algorithms iteratively partition a space into cells that contain underlying linear or nonlinear structures using linear partitioning strategies. The compactness of each cell depends on how well the (locally) linear partitioning strategy approximates the intrinsic structure. To partition a compact structure for complex data in a nonlinear context, this paper proposes a nonlinear partition strategy. This is a principal curve tree (PC-tree), which is implemented iteratively. Given that a PC passes through the middle of the data distribution, it allows for partitioning based on the arc length of the PC. To enhance the partitioning of a given space, a residual version of the PC-tree algorithm is developed, denoted here as the principal component analysis tree (PCR-tree) algorithm. Because of its residual property, the PCR-tree can yield the intrinsic dimension of high-dimensional data. Comparisons presented in this paper confirm that the proposed PC-tree and PCR-tree approaches show a better performance than several other competing partitioning algorithms in terms of vector quantization error and nearest neighbor search. The comparison also shows that the proposed algorithms outperform competing linear methods in total average coverage which measures the nonlinear compactness of partitioning algorithms.