Combining a new data classification technique and regression analysis to predict the Cost-To-Serve new customers

  • Authors:
  • Estelle R. S. Kone; Mark H. Karwan

  • Affiliations:
  • Department of Industrial and Systems Engineering, 438 Bell Hall, University at Buffalo (SUNY), Buffalo, NY 14260, United States; Operations Research, Department of Industrial and Systems Engineering, 438 Bell Hall, University at Buffalo (SUNY), Buffalo, NY 14260, United States

  • Venue:
  • Computers and Industrial Engineering
  • Year:
  • 2011

Abstract

Identifying the Cost-To-Serve (CTS) of customers is one of the most challenging problems in Supply Chain Management because of the diversity of customers' business activities. For the particular case of the industrial gas business, we are interested in predicting the cost of delivering bulk (liquefied) gas to new customers using a multifactor linear regression model. Developing a single model, i.e., analyzing all observations at once, produces poor predictions. Therefore, prior to the regression analysis, a new supervised learning technique is used to group customers who are similar in some sense. Classes of customers are represented by hyper-boxes, and a linear regression model is subsequently built within each class. The combination of data classification and regression is shown to increase prediction accuracy. Two Mixed-Integer Linear Programming (MILP) models are developed for data classification. Although we are dealing with a supervised learning method, classes are not predefined in our case. Rather, we input a continuous "classification" attribute that is optimally discretized by the MILPs in order to minimize the number of misclassifications. Because it does not require predefined classes, our data classification model offers a broader range of applications. A number of illustrative examples are used to demonstrate the effectiveness of the proposed approach.
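
The classify-then-regress idea can be illustrated with a minimal Python sketch. It is not the authors' method: the MILP discretization is replaced by a single hand-picked cut point on the classification attribute, each hyper-box is simply the bounding box of a class's training points, and all data, function names, and parameters (`make_customers`, `assign_class`, the cut at 50, the two cost regimes) are hypothetical and synthetic. It only shows why per-class regressions can outperform one global regression when customers fall into distinct regimes.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_customers(n):
    """Synthetic customers from two cost regimes (an assumption, not real data)."""
    r = rng.integers(0, 2, size=n)                       # hidden regime
    x0 = rng.uniform(np.where(r == 0, 0.0, 5.0), np.where(r == 0, 5.0, 10.0))
    x1 = rng.uniform(0.0, 1.0, size=n)
    # Continuous "classification" attribute, known for existing customers only.
    z = rng.uniform(np.where(r == 0, 0.0, 50.0), np.where(r == 0, 50.0, 100.0))
    y = np.where(r == 0, 2 + x0 + 0.5 * x1, 20 - 2 * x0 + 3 * x1)
    return np.column_stack([x0, x1]), z, y + rng.normal(0, 0.1, n)

def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def fit_boxes_and_models(X, y, labels):
    """Per class: bounding hyper-box of its members plus its own regression."""
    classes = {}
    for k in np.unique(labels):
        m = labels == k
        classes[int(k)] = {"lo": X[m].min(axis=0), "hi": X[m].max(axis=0),
                           "coef": fit_linear(X[m], y[m])}
    return classes

def assign_class(classes, x):
    """Class whose hyper-box is nearest to x (distance 0 when x lies inside)."""
    def dist(c):
        gap = np.maximum(c["lo"] - x, 0) + np.maximum(x - c["hi"], 0)
        return float(np.linalg.norm(gap))
    return min(classes, key=lambda k: dist(classes[k]))

def predict(classes, X):
    return np.array([predict_linear(classes[assign_class(classes, x)]["coef"],
                                    x[None, :])[0] for x in X])

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

X_train, z_train, y_train = make_customers(400)
X_new, _, y_new = make_customers(100)        # new customers: z is not used

# Stand-in for the MILP discretization: one fixed cut on the attribute.
labels = np.digitize(z_train, [50.0])
classes = fit_boxes_and_models(X_train, y_train, labels)

rmse_single = rmse(predict_linear(fit_linear(X_train, y_train), X_new), y_new)
rmse_boxes = rmse(predict(classes, X_new), y_new)
print(f"single model RMSE: {rmse_single:.3f}")
print(f"per-class RMSE:    {rmse_boxes:.3f}")
```

New customers are assigned to a class from their features alone, since the classification attribute is assumed to be unavailable for them. In the paper the cut points and boxes come from the MILP models rather than from a fixed threshold.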