The method for solving two types of errors in customer segmentation on unbalanced data

  • Authors: Zou Peng; Hao Yuanyuan
  • Affiliations: Harbin Institute of Technology, P. R. China; Harbin Institute of Technology, P. R. China
  • Venue: Proceedings of the 10th International Conference on Electronic Commerce
  • Year: 2008

Abstract

Two types of error, rejecting the true and accepting the false, are inevitable in customer value segmentation. Traditional data mining methods, which optimize total accuracy, cannot reflect the effect of large differences in misclassification cost or of the unbalanced distribution of customers across value classes. This paper proposes a cost-sensitive SVM (support vector machine) classifier built on a misclassification cost function based on customer value, which is evaluated by the expected loss function. Test results show that the method accurately controls how the different types of error are distributed under varying misclassification costs, reduces the total misclassification cost, and distinguishes customer value effectively.
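
The abstract does not give the authors' cost function, so the following is only a generic sketch of the idea it describes: two classifiers with the same total accuracy can carry very different misclassification costs on unbalanced data, and a cost-sensitive SVM typically reflects this by weighting the hinge loss per class. The labels, the sample data, the cost values, and the function names (`total_cost`, `accuracy`, `weighted_hinge`) are all assumptions made for illustration, not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of customers classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def total_cost(y_true, y_pred, cost_reject_true, cost_accept_false):
    """Sum of misclassification costs; 1 = high-value, 0 = low-value."""
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            cost += cost_reject_true    # high-value customer rejected
        elif t == 0 and p == 1:
            cost += cost_accept_false   # low-value customer accepted
    return cost


def weighted_hinge(score, label, cost_pos, cost_neg):
    """Class-weighted hinge loss for one sample; label is +1 or -1."""
    weight = cost_pos if label == 1 else cost_neg
    return weight * max(0.0, 1.0 - label * score)


# Hypothetical unbalanced sample: 2 high-value customers out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # rejects both high-value customers
pred_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # accepts two low-value customers

# Assumed costs: losing a high-value customer is 10x worse.
COST_REJECT_TRUE, COST_ACCEPT_FALSE = 10.0, 1.0

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, accuracy(y_true, pred),
          total_cost(y_true, pred, COST_REJECT_TRUE, COST_ACCEPT_FALSE))
```

Both predictions are 80% accurate, yet under the assumed costs prediction A costs ten times as much as prediction B, which is the gap that an accuracy-only criterion hides.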