Threefold versus fivefold cross-validation and individual versus average data in predictive regression modelling of machining experimental data

Authors:
C.-X. J. Feng; Z.-G. S. Yu; J. T. Emanuel; P.-G. Li; X.-Y. Shao; Z.-H. Wang
Affiliations:
Department of Industrial and Manufacturing Engineering, Bradley University, Peoria, Illinois, USA; Department of Industrial and Manufacturing Engineering, Bradley University, Peoria, Illinois, USA; Department of Industrial and Manufacturing Engineering, Bradley University, Peoria, Illinois, USA; School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China; School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China; School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China
Venue:
International Journal of Computer Integrated Manufacturing
Year:
2008

Abstract

Model selection and validation are critical in predicting the performance of manufacturing processes. Proper selection of variables helps minimize the model mismatch error, proper selection of models helps reduce the model estimation error, and proper validation of models helps minimize the model prediction error. This paper reviews the literature and proposes a rigorous procedure for the selection and cross-validation (CV) of predictive regression models. Experimental data from a turning surface roughness study are used to demonstrate how to select and validate such models. In particular, two data-splitting schemes (fivefold versus threefold CV) and two forms of the data (individual versus average) are compared. The results show no statistically significant difference between fivefold and threefold CV, or between individual and average data, in subset selection and CV of predictive regression models. Consequently, based on this and other similar empirical studies, threefold CV may be used instead of fivefold or tenfold CV, with either individual or average data, to reduce the computational cost of predictive regression modelling of experimental data.
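The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic, the model is a one-predictor least-squares line, and the names `fit_line`, `kfold_rmse`, and `average_replicates`, the feed-rate values, and the noise level are all assumptions made for illustration. It shows how threefold and fivefold CV can be run on the same data set, once on the individual replicate measurements and once on their per-setting averages.

```python
import random
import statistics


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = b0 + b1 * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def kfold_rmse(xs, ys, k, seed=0):
    """Return the held-out RMSE of each fold in a k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    scores = []
    for held_out in folds:
        test = set(held_out)
        train = [i for i in idx if i not in test]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (b0 + b1 * xs[i])) ** 2 for i in held_out]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return scores


def average_replicates(xs, ys):
    """Collapse replicate responses at the same setting into their mean."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    settings = sorted(groups)
    return settings, [statistics.fmean(groups[x]) for x in settings]


# Synthetic stand-in for an experiment (assumed values, not the paper's data):
# 15 hypothetical feed-rate settings, 3 replicate roughness readings each.
rng = random.Random(42)
feeds = [0.05 + 0.01 * i for i in range(15)]
xs = [f for f in feeds for _ in range(3)]
ys = [1.0 + 20.0 * x + rng.gauss(0, 0.1) for x in xs]
xs_avg, ys_avg = average_replicates(xs, ys)

for label, (dx, dy) in {"individual": (xs, ys), "average": (xs_avg, ys_avg)}.items():
    for k in (3, 5):
        mean_rmse = statistics.fmean(kfold_rmse(dx, dy, k))
        print(f"{label:>10} data, {k}-fold CV: mean RMSE = {mean_rmse:.4f}")
```

Whether the differences between the printed values are statistically significant would still need a formal test on the per-fold errors; the sketch only sets up the four cases being compared.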