Zero-Inflated Boosted Ensembles for Rare Event Counts

  • Authors:
  • Alexander Borisov; George Runger; Eugene Tuv; Nuttha Lurponglukana-Strand

  • Affiliations:
  • Intel, Chandler; Industrial and Systems Engineering, Arizona State University, Tempe; Intel, Chandler; Industrial and Systems Engineering, Arizona State University, Tempe

  • Venue:
  • IDA '09 Proceedings of the 8th International Symposium on Intelligent Data Analysis: Advances in Intelligent Data Analysis VIII
  • Year:
  • 2009

Abstract

Two linked ensembles are used for a supervised learning problem with rare-event counts. With many target instances of zero, more traditional loss functions (such as squared error and class error) are often not relevant, and a statistical model leads to a likelihood with two related parameters from a zero-inflated Poisson (ZIP) distribution. In a new approach, a linked pair of gradient boosted tree ensembles is developed to handle the multiple parameters in a manner that can be generalized to other problems. The result is a unique learner that extends machine learning methods to data with nontraditional structures. We empirically compare the approach on two real data sets and two artificial data sets against a single-tree approach (ZIP-tree) and a statistical generalized linear model.
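
As an illustration of the two-parameter likelihood the abstract refers to, the following is a minimal sketch of the standard ZIP log-likelihood. It is not taken from the paper: the function names `zip_log_likelihood` and `zip_nll` are hypothetical, and the parameterization (`pi` as the probability of a structural zero, `lam` as the Poisson mean) is the textbook one, assumed here rather than confirmed by the source. In a linked-ensemble setting, one would presumably have one ensemble predict each parameter per instance and minimize the summed negative log-likelihood.

```python
import math


def zip_log_likelihood(y, pi, lam):
    """Log-likelihood of one count y under a zero-inflated Poisson.

    Assumed (textbook) parameterization, not taken from the paper:
      pi  -- probability of a structural zero, 0 <= pi < 1
      lam -- mean of the Poisson component, lam > 0
    """
    if y == 0:
        # A zero comes either from the zero-inflation part
        # or from the Poisson part producing a zero.
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    # A positive count can only come from the Poisson part.
    return math.log1p(-pi) - lam + y * math.log(lam) - math.lgamma(y + 1)


def zip_nll(ys, pis, lams):
    """Negative log-likelihood summed over instances.

    Each instance has its own (pi, lam) pair, as would be the case
    if two separate models predicted the two parameters.
    """
    return -sum(
        zip_log_likelihood(y, p, l) for y, p, l in zip(ys, pis, lams)
    )
```

For example, `zip_nll([0, 0, 3], [0.6, 0.6, 0.6], [2.0, 2.0, 2.0])` scores three observations under a shared parameter pair; setting `pi` to 0 reduces the expression to the ordinary Poisson log-likelihood.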