Local PCA regression for missing data estimation in telecommunication dataset

  • Authors:
  • T. Sato; B. Q. Huang; Y. Huang; M.-T. Kechadi

  • Affiliations:
  • School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland (all authors)

  • Venue:
  • PRICAI'10 Proceedings of the 11th Pacific Rim international conference on Trends in artificial intelligence
  • Year:
  • 2010

Abstract

Customer churn significantly affects businesses in general and telecommunication services in particular. In the majority of cases, the number of potential churners is much smaller than the number of non-churners. The imbalanced distribution of samples between churners and non-churners is therefore a concern when building a churn prediction model. This paper presents a Local PCA approach that addresses the imbalanced classification problem by generating new churn samples. The experiments were carried out on a large real-world telecommunication dataset and assessed on a churn prediction task. They showed that Local PCA along with SMOTE outperformed linear regression and standard PCA data generation techniques.
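
The abstract does not describe the generation procedure in detail, so the following is only a minimal sketch of what Local PCA-based oversampling of the minority (churn) class might look like. Everything specific here is an assumption for illustration and not taken from the paper: the function name `local_pca_oversample`, the use of plain k-means to define the local regions, the Gaussian sampling of principal-component scores, and all parameter names and defaults.

```python
import numpy as np

def local_pca_oversample(X_min, n_new, n_clusters=2, n_components=1,
                         n_iter=20, seed=0):
    """Illustrative only: generate n_new synthetic minority samples
    by fitting a separate PCA model in each cluster (assumed design)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Assumption: local regions are found with plain k-means.
    centers = X_min[rng.choice(len(X_min), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X_min[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = X_min[labels == k].mean(axis=0)
    # Final assignment against the last updated centers.
    dists = np.linalg.norm(X_min[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    # Local PCA: mean, leading principal axes, and score spread per cluster.
    models = {}
    for k in np.unique(labels):
        Xk = X_min[labels == k]
        mean = Xk.mean(axis=0)
        _, s, vt = np.linalg.svd(Xk - mean, full_matrices=False)
        q = min(n_components, vt.shape[0])
        std = s[:q] / np.sqrt(max(len(Xk) - 1, 1))
        models[k] = (mean, vt[:q], std)

    # Assumption: new samples are drawn by sampling Gaussian scores along
    # the local principal axes and mapping them back to feature space.
    synthetic = []
    for _ in range(n_new):
        # Picking a random existing sample's cluster weights clusters by size.
        k = labels[rng.integers(len(X_min))]
        mean, axes, std = models[k]
        z = rng.normal(0.0, std)
        synthetic.append(mean + z @ axes)
    return np.array(synthetic)
```

For example, calling `local_pca_oversample(X_churn, n_new=100)` on a hypothetical array `X_churn` of churner feature vectors would return 100 synthetic rows with the same number of columns, which could then be appended to the training set before fitting a classifier.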