Using genetic K-means algorithm for PCA regression data in customer churn prediction

  • Authors:
  • Bingquan Huang;T. Satoh;Y. Huang;M.-T. Kechadi;B. Buckley

  • Affiliations:
  • School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland (all authors)

  • Venue:
  • ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
  • Year:
  • 2010

Abstract

An imbalanced distribution of samples between churners and nonchurners can strongly affect churn prediction results in the telecommunication services field. One way to address this is an over-sampling approach based on PCA regression. However, PCA regression may not generate good churn samples if the dataset is nonlinearly discriminant. To overcome this problem, we employed the Genetic K-means Algorithm to cluster the dataset and find small, locally optimal subsets. The experiments were carried out on a real-world telecommunication dataset and assessed on a churn prediction task. They showed that the Genetic K-means Algorithm can improve prediction results for PCA regression and performs as well as SMOTE.
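
The abstract gives no implementation details, so the following is only a rough sketch of the general idea (cluster the minority churn samples first, then over-sample inside each cluster), not the authors' method. Two substitutions are assumptions made for illustration: plain Lloyd's K-means stands in for the Genetic K-means Algorithm, and sampling in each cluster's principal subspace stands in for PCA regression. The names `kmeans`, `pca_oversample`, `clustered_oversample`, and all parameters are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means (assumed stand-in for Genetic K-means)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def pca_oversample(X, n_new, n_components=2, seed=0):
    """Draw synthetic points in the principal subspace of X.

    Assumption: a simple PCA-based generator, not the paper's PCA regression.
    """
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions; S holds the singular values.
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    q = min(n_components, len(S))
    # Standard deviation of the data along each retained direction.
    std = S[:q] / np.sqrt(max(len(X) - 1, 1))
    scores = rng.normal(size=(n_new, q)) * std
    return mean + scores @ Vt[:q]

def clustered_oversample(X_churn, n_new, k=3, seed=0):
    """Over-sample each cluster of churn samples in proportion to its size."""
    labels = kmeans(X_churn, k, seed=seed)
    parts = []
    for j in range(k):
        cluster = X_churn[labels == j]
        if len(cluster) < 2:
            continue  # too few points to estimate principal directions
        share = int(round(n_new * len(cluster) / len(X_churn)))
        if share > 0:
            parts.append(pca_oversample(cluster, share, seed=seed + j))
    if not parts:
        return np.empty((0, X_churn.shape[1]))
    return np.vstack(parts)
```

Because each cluster's share is rounded separately, the number of synthetic samples returned can differ slightly from `n_new`. Fitting the principal directions per cluster rather than on the whole minority class is what lets the generator follow local structure when the data are not well described by a single linear subspace.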