Predicting customer retention and profitability by using random forests and regression forests techniques

  • Authors:
  • Bart Larivière; Dirk Van den Poel

  • Affiliations:
  • Department of Marketing, Ghent University, Hoveniersberg 24, 9000 Ghent, Belgium; Department of Marketing, Ghent University, Hoveniersberg 24, 9000 Ghent, Belgium

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2005

Quantified Score

Hi-index 12.09

Abstract

In an era of strong emphasis on customer relationship management (CRM), firms strive to build valuable relationships with their existing customer base. In this study, we attempt to better understand three important measures of customer outcome: next buy, partial defection, and the evolution of customer profitability. Using random forests techniques, we investigate a broad set of explanatory variables, including past customer behavior, observed customer heterogeneity, and some typical variables related to intermediaries. We analyze a real-life sample of 100,000 customers taken from the data warehouse of a large European financial services company. Two types of random forests techniques are employed to analyze the data: random forests are used for binary classification, whereas regression forests are applied to the models with continuous dependent variables. Our research findings demonstrate that both random forests techniques fit the estimation and validation samples better than ordinary linear regression and logistic regression models. Furthermore, we find evidence that the same set of variables has a different impact on buying, defection, and profitability behavior. Our findings suggest that past customer behavior is more important for generating repeat purchasing and favorable profitability evolutions, while the intermediary's role has a greater impact on customers' defection proneness. Finally, our results demonstrate the benefits of analyzing different customer outcome variables simultaneously, since an extended investigation of the next buy, partial defection, and customer profitability triad indicates that one cannot fully understand a particular outcome without understanding the other related behavioral outcome variables.
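
The comparison described in the abstract (a random forest against logistic regression for a binary outcome, and a regression forest against ordinary linear regression for a continuous outcome, each scored on a held-out validation sample) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data in place of the company's customer records, and picks AUC and R² as fit measures; the feature matrix, the outcome names `y_buy` and `y_profit`, the function `compare_models`, and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import r2_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for customer data (assumption: NOT the paper's variables).
n = 2000
X = rng.normal(size=(n, 5))
# A non-linear signal, so the tree ensembles have structure to pick up.
signal = X[:, 0] * X[:, 1] + np.sin(X[:, 2])
y_buy = (signal + rng.normal(scale=0.5, size=n) > 0).astype(int)  # binary outcome
y_profit = signal + rng.normal(scale=0.5, size=n)                 # continuous outcome

# Split once into an estimation sample and a validation sample.
X_est, X_val, buy_est, buy_val, profit_est, profit_val = train_test_split(
    X, y_buy, y_profit, test_size=0.3, random_state=0
)


def compare_models():
    """Fit each model on the estimation sample and score it on the validation sample."""
    results = {}

    classifiers = {
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    for name, model in classifiers.items():
        model.fit(X_est, buy_est)
        prob = model.predict_proba(X_val)[:, 1]  # predicted probability of class 1
        results[name + "_auc"] = roc_auc_score(buy_val, prob)

    regressors = {
        "regression_forest": RandomForestRegressor(n_estimators=200, random_state=0),
        "linear_regression": LinearRegression(),
    }
    for name, model in regressors.items():
        model.fit(X_est, profit_est)
        results[name + "_r2"] = r2_score(profit_val, model.predict(X_val))

    return results


results = compare_models()
for key, value in sorted(results.items()):
    print(f"{key}: {value:.3f}")
```

On real data, the fitted forests also expose `feature_importances_`, which is one possible way to compare how strongly the same explanatory variables relate to each outcome. Whether that matches the importance measure used in the paper is not stated in the abstract.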