Comparative investigation on dimension reduction and regression in three layer feed-forward neural network

  • Authors:
  • Lei Shi; Lei Xu

  • Affiliations:
  • Chinese University of Hong Kong, Shatin, NT, Hong Kong; Chinese University of Hong Kong, Shatin, NT, Hong Kong

  • Venue:
  • ICANN'06: Proceedings of the 16th International Conference on Artificial Neural Networks, Part I
  • Year:
  • 2006

Abstract

The three-layer feed-forward neural network (3-LFFNN) has been widely used for nonlinear regression. It is well known that its hidden layer can be regarded as performing feature extraction and dimension reduction, and that regression performance depends on choosing the feature dimension, or equivalently the number of hidden units, appropriately. Many publications address determining the hidden unit number for a desired generalization error, but few comparative studies have been made of the different approaches proposed, in particular of the typical model selection criteria used for this purpose. This paper targets such a comparison. Using both simulated data and several real-world data sets, we compare the regression performance obtained when the number of hidden units is determined by several typical model selection criteria: Akaike's Information Criterion (AIC), the Consistent Akaike's Information Criterion (CAIC), Schwarz's Bayesian Inference Criterion (BIC), which coincides with Rissanen's Minimum Description Length (MDL) criterion, the well-known technique of cross-validation (CV), and the Bayesian Ying-Yang harmony criterion for small sample sizes (BYY-S). Experiments on small sample sizes show that BIC and CV are clearly better than AIC and CAIC. Moreover, BIC may outperform CV on certain data sets, while CV may outperform BIC on others. Interestingly, BYY-S generally outperforms both BIC and CV.
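
To illustrate how criteria such as AIC, CAIC, and BIC trade off goodness of fit against model complexity when selecting the hidden unit number, here is a minimal Python sketch. It is not the authors' implementation: it assumes i.i.d. Gaussian regression noise, uses scikit-learn's MLPRegressor as a stand-in 3-LFFNN, and scores each candidate hidden unit count on a small simulated sample.

```python
# A minimal sketch (not the paper's code) of choosing the hidden unit
# number k by AIC, CAIC, and BIC, assuming i.i.d. Gaussian noise.
import numpy as np
from sklearn.neural_network import MLPRegressor

def criteria(y, y_hat, d):
    """Return (AIC, CAIC, BIC) for a Gaussian regression model with d free parameters."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    # Maximized log-likelihood with the MLE noise variance sigma^2 = rss / n
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    aic = -2 * log_lik + 2 * d
    bic = -2 * log_lik + d * np.log(n)           # coincides with two-part MDL
    caic = -2 * log_lik + d * (np.log(n) + 1)
    return aic, caic, bic

# Small simulated sample, in the spirit of the paper's small-sample setting
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(50)

for k in range(1, 11):                            # candidate hidden unit numbers
    net = MLPRegressor(hidden_layer_sizes=(k,), max_iter=5000, random_state=0)
    net.fit(X, y)
    # Free parameters of a 1-k-1 network: k*(m+1) hidden weights/biases
    # plus (k+1) output weights/bias, i.e. d = k*(m+2) + 1 with m inputs
    d = k * (X.shape[1] + 2) + 1
    print(k, criteria(y, net.predict(X), d))
```

CV would instead replace the in-sample score with held-out prediction error, and BYY-S with the Bayesian Ying-Yang harmony measure; the selection loop over k stays the same.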