Subscriber classification within telecom networks utilizing big data technologies and machine learning

Authors:
Jonathan Magnusson;Tor Kvernvik
Affiliations:
Uppsala University, Uppsala, Sweden;Ericsson Research, Stockholm, Sweden
Venue:
Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
Year:
2012

Abstract

This paper describes a scalable solution for identifying influential subscribers in networks such as telecom networks. The solution combines several Social Network Analysis (SNA) metrics into a single weighted influence value. The novel aggregation method uses machine learning to train the models that combine the metrics. A prototype has been implemented on a Hadoop platform to support scalability and to reduce hardware cost by enabling the use of commodity computers. The SNA algorithms have been adapted to execute efficiently on the MapReduce distributed platform. The prototype has been tested on a Hadoop cluster, and the tests verified that the solution scales to networks with millions of subscribers. Both real data from a telecom network operator with 2.4 million subscribers and synthetic data for networks of up to 100 million subscribers were used to verify the scalability and accuracy of the solution. The correlation between metrics has been analyzed to identify the information gain from each metric.
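
The general idea of computing an SNA metric in map and reduce steps and then folding several metrics into one influence value can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the call records, the function names, the choice of in-degree and out-degree as metrics, and the weights are all assumptions made for illustration. In the paper the aggregation model is trained with machine learning, whereas the weights below are hand-picked placeholders.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical call records (caller, callee); not data from the paper.
CALLS = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a"), ("d", "a")]


def map_degree(record):
    """Map step: emit one keyed record per endpoint of a call."""
    caller, callee = record
    yield caller, ("out", callee)
    yield callee, ("in", caller)


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_degree(subscriber, values):
    """Reduce step: count the distinct neighbours in each direction."""
    out_neighbours = {n for kind, n in values if kind == "out"}
    in_neighbours = {n for kind, n in values if kind == "in"}
    metrics = {"in_degree": len(in_neighbours), "out_degree": len(out_neighbours)}
    return subscriber, metrics


def run_degree_job(records):
    """Run map, shuffle and reduce in one process to mimic a single job."""
    mapped = chain.from_iterable(map_degree(r) for r in records)
    return dict(reduce_degree(k, v) for k, v in shuffle(mapped).items())


def influence_score(metrics, weights):
    """Aggregate several metrics into one value as a weighted sum."""
    return sum(weights[name] * metrics[name] for name in weights)


# Placeholder weights (assumption); a trained model would supply these.
WEIGHTS = {"in_degree": 0.7, "out_degree": 0.3}

if __name__ == "__main__":
    per_subscriber = run_degree_job(CALLS)
    ranked = sorted(
        per_subscriber,
        key=lambda s: influence_score(per_subscriber[s], WEIGHTS),
        reverse=True,
    )
    print(ranked)
```

On a real cluster the map and reduce functions would run as distributed tasks over partitioned call records. The weighted sum is only the simplest possible stand-in for a trained aggregation model.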