A Generalized Model for Predictive Data Mining

Authors:
James V. Hansen;James B. McDonald
Affiliations:
Marriott School of Management, Brigham Young University, Provo, Utah 84602, USA. james_hansen@byu.edu;Department of Economics, Brigham Young University, Provo, Utah 84602, USA
Venue:
Information Systems Frontiers
Year:
2002

Abstract

This paper describes EGB2, a flexible model for predictive data mining that optimizes over a parameter space to fit data to a family of models using maximum-likelihood criteria. It also shows how EGB2 can incorporate asymmetric costs of Type I and Type II errors, thereby minimizing expected misclassification costs. Importantly, it has been shown that standard methods of computing maximum-likelihood estimators are generally inconsistent when applied to sample data whose label proportions differ from those of the universe from which the sample is drawn. We show how a choice estimator based on weighting each observation's contribution to the log-likelihood function can contribute to estimator consistency, and how this feature can be implemented in EGB2.
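
The two ideas in the abstract, weighting each observation's log-likelihood contribution and accounting for asymmetric error costs, can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a binary label and a logistic link as a stand-in for the full EGB2 family. It also assumes that each weight is the ratio of the label's population share to its sample share, and that the decision threshold follows the standard expected-cost rule. All function and parameter names (`observation_weights`, `weighted_log_likelihood`, `cost_threshold`, `population_share_pos`) are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic CDF (assumed link, not the full EGB2 CDF)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def observation_weights(labels, population_share_pos):
    """Weight each observation by (population share) / (sample share) of its label.

    Assumption: the population share of positives is known from outside
    the sample; the abstract does not specify how it is obtained.
    """
    n = len(labels)
    sample_share_pos = sum(labels) / n
    w_pos = population_share_pos / sample_share_pos
    w_neg = (1.0 - population_share_pos) / (1.0 - sample_share_pos)
    return [w_pos if y == 1 else w_neg for y in labels]


def weighted_log_likelihood(beta, xs, labels, weights):
    """Sum of weighted Bernoulli log-likelihood contributions.

    beta = (intercept, slope) for a single predictor x.
    """
    total = 0.0
    for x, y, w in zip(xs, labels, weights):
        p = sigmoid(beta[0] + beta[1] * x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total


def cost_threshold(cost_type1, cost_type2):
    """Probability cutoff that minimizes expected misclassification cost.

    Predict positive when P(y=1|x) exceeds cost_type1 / (cost_type1 + cost_type2),
    where cost_type1 is the cost of a false positive and cost_type2 the cost
    of a false negative.
    """
    return cost_type1 / (cost_type1 + cost_type2)


if __name__ == "__main__":
    # Hypothetical sample: positives make up half the sample but an
    # assumed 10% of the population.
    xs = [2.1, 1.7, 2.5, 1.9, 0.3, 0.8, 0.1, 0.6]
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    weights = observation_weights(labels, population_share_pos=0.10)
    print(weights)
    print(weighted_log_likelihood((0.0, 0.0), xs, labels, weights))
    print(cost_threshold(cost_type1=1.0, cost_type2=4.0))
```

In practice the weighted log-likelihood would be passed to a numerical optimizer to estimate the parameters. The weights sum to the sample size, so the objective stays on the same scale as the unweighted one.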