A Generalized Model for Predictive Data Mining

  • Authors:
  • James V. Hansen; James B. McDonald

  • Affiliations:
  • Marriott School of Management, Brigham Young University, Provo, Utah 84602, USA. james_hansen@byu.edu; Department of Economics, Brigham Young University, Provo, Utah 84602, USA

  • Venue:
  • Information Systems Frontiers
  • Year:
  • 2002

Abstract

This paper describes a flexible model for predictive data mining, EGB2, which optimizes over a parameter space to fit data to a family of models based on maximum-likelihood criteria. It is also shown how EGB2 can integrate asymmetric costs of Type I and Type II errors, thereby minimizing expected misclassification costs. Importantly, it has been shown that standard methods of computing maximum-likelihood estimators are generally inconsistent when applied to sample data whose label proportions differ from those of the universe from which the sample is drawn. We show how a choice estimator based on weighting each observation's contribution to the log-likelihood function can contribute to estimator consistency, and how this feature can be implemented in EGB2.
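The two ideas in the abstract, weighting each observation's log-likelihood contribution and accounting for asymmetric error costs, can be illustrated with a minimal sketch. It is not the paper's implementation: a plain logistic link stands in for the EGB2 family, the weights are assumed to be the ratio of population share to sample share for each class, and every function and parameter name below is hypothetical.

```python
import math


def class_weights(labels, population_share_pos):
    """Per-observation weights: population class share / sample class share.

    Assumption: the true share of positive labels in the universe is known.
    """
    n = len(labels)
    sample_share_pos = sum(labels) / n
    w_pos = population_share_pos / sample_share_pos
    w_neg = (1.0 - population_share_pos) / (1.0 - sample_share_pos)
    return [w_pos if y == 1 else w_neg for y in labels]


def weighted_log_likelihood(beta, xs, labels, weights):
    """Weighted Bernoulli log-likelihood for a one-feature model.

    A logistic link is used here as a stand-in for the EGB2 link.
    """
    total = 0.0
    for x, y, w in zip(xs, labels, weights):
        z = beta[0] + beta[1] * x
        log_p1 = -math.log1p(math.exp(-z))  # log P(y = 1 | x)
        log_p0 = -math.log1p(math.exp(z))   # log P(y = 0 | x)
        total += w * (log_p1 if y == 1 else log_p0)
    return total


def cost_threshold(cost_false_pos, cost_false_neg):
    """Probability cutoff that minimizes expected misclassification cost.

    Predict positive when P(y = 1 | x) exceeds this value.
    """
    return cost_false_pos / (cost_false_pos + cost_false_neg)


# Balanced sample drawn from a universe in which only 10% are positive.
labels = [1, 1, 0, 0]
xs = [2.0, 1.5, -1.0, 0.5]
weights = class_weights(labels, population_share_pos=0.10)
ll = weighted_log_likelihood([0.0, 0.0], xs, labels, weights)
cutoff = cost_threshold(cost_false_pos=1.0, cost_false_neg=4.0)
```

In this sketch the over-sampled positive class is down-weighted and the under-sampled negative class is up-weighted, so the weights still sum to the sample size. Maximizing `weighted_log_likelihood` over `beta` with any numerical optimizer would give the weighted estimate; when a false negative costs four times as much as a false positive, the cutoff falls below 0.5.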