A theoretical investigation of several model selection criteria for dimensionality reduction

Authors:
Shikui Tu;Lei Xu
Affiliations:
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, PR China;Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, PR China
Venue:
Pattern Recognition Letters
Year:
2012

Abstract

Taking the problem of determining the hidden dimensionality (the number of latent factors) of the Factor Analysis (FA) model as its setting, this paper provides a theoretical comparison of several classical model selection criteria: Akaike's Information Criterion (AIC), Bozdogan's Consistent Akaike's Information Criterion (CAIC), the Hannan-Quinn information criterion (HQC), and Schwarz's Bayesian Information Criterion (BIC). We focus on establishing a partial order of their relative underestimation tendencies. The order is shown to be AIC, HQC, BIC, CAIC, from the smallest underestimation probability to the largest. To a great extent, this order also reflects the order of model selection performance, because underestimation usually accounts for the majority of wrong selections as the sample size and the population signal-to-noise ratio (SNR, defined as the ratio of the smallest variance of the hidden dimensions to the variance of the noise) decrease. Synthetic experiments that vary the SNR and the training sample size N verify the theoretical results.
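
The ordering AIC, HQC, BIC, CAIC can be made concrete with a minimal sketch, assuming the commonly cited textbook forms of the four criteria (the abstract itself does not state them): each scores a candidate model as -2 ln L plus a per-parameter penalty of 2 (AIC), 2 ln ln N (HQC), ln N (BIC), or ln N + 1 (CAIC). Under that assumption the penalties are strictly increasing in the stated order once N is at least 16, so a heavier-penalized criterion never picks a larger dimensionality than a lighter one on the same fits. The function names, log-likelihoods, and parameter counts below are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import math

def penalty_per_parameter(criterion, n_samples):
    """Penalty per free parameter on the -2*log-likelihood scale.

    Assumes the standard textbook forms of the criteria.
    """
    log_n = math.log(n_samples)
    weights = {
        "AIC": 2.0,
        "HQC": 2.0 * math.log(log_n),
        "BIC": log_n,
        "CAIC": log_n + 1.0,
    }
    return weights[criterion]

def select_dimension(candidates, criterion, n_samples):
    """Return the candidate m that minimizes -2*loglik + penalty*n_params.

    candidates: list of (m, loglik, n_params) tuples.
    """
    c = penalty_per_parameter(criterion, n_samples)
    return min(candidates, key=lambda t: -2.0 * t[1] + c * t[2])[0]

# Hypothetical fitted log-likelihoods and free-parameter counts
# for candidate dimensionalities m = 1..4 (illustrative values only).
N = 100
candidates = [
    (1, -520.0, 11),
    (2, -505.0, 20),
    (3, -490.0, 28),
    (4, -484.0, 35),
]

CRITERIA = ["AIC", "HQC", "BIC", "CAIC"]
penalties = [penalty_per_parameter(c, N) for c in CRITERIA]
selected = {c: select_dimension(candidates, c, N) for c in CRITERIA}

for name, pen in zip(CRITERIA, penalties):
    print(f"{name}: penalty per parameter = {pen:.3f}, selected m = {selected[name]}")
```

The sketch only illustrates why a heavier penalty pushes the selected dimensionality downward; it does not reproduce the paper's analysis of underestimation probabilities as functions of the SNR and N.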