Subscriber classification within telecom networks utilizing big data technologies and machine learning

  • Authors:
  • Jonathan Magnusson;Tor Kvernvik

  • Affiliations:
  • Uppsala University, Uppsala, Sweden;Ericsson Research, Stockholm, Sweden

  • Venue:
  • Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes a scalable solution for identifying influential subscribers in for example telecom networks. The solution estimates one weighted value of influence out of several Social Network Analysis(SNA) metrics. The novel method for aggregation of several metrics utilizes machine learning to train models. A prototype solution has been implemented on a Hadoop platform to support scalability and to reduce hard ware cost by enabling the usage of commodity computers. The SNA algorithms have been adapted to efficiently execute on the MapReduce distributed platform. The prototype solution has been tested on a Hadoop cluster. The tests have verified that the solution can scale to support networks with millions of subscribers. Both real data from a telecom network operator with 2.4 million subscribers and synthetic data for networks up to 100 million subscribers have been used to verify the scalability and accuracy of the solution. The correlation between metrics have been analyzed to identify the information gain from each metric.