The perception: a probabilistic model for information storage and organization in the brain
Neurocomputing: foundations of research
C4.5: programs for machine learning
C4.5: programs for machine learning
Large margin classification using the perceptron algorithm
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Machine Learning
Machine Learning
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Pregel: a system for large-scale graph processing - "ABSTRACT"
Proceedings of the 28th ACM symposium on Principles of distributed computing
Graph Twiddling in a MapReduce World
Computing in Science and Engineering
User position measures in social networks
Proceedings of the 3rd Workshop on Social Network Mining and Analysis
Design patterns for efficient graph algorithms in MapReduce
Proceedings of the Eighth Workshop on Mining and Learning with Graphs
Spark: cluster computing with working sets
HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
HAMA: An Efficient Matrix Computation with the MapReduce Framework
CLOUDCOM '10 Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science
DisNet: A Framework for Distributed Graph Computation
ASONAM '11 Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining
Hi-index | 0.00 |
This paper describes a scalable solution for identifying influential subscribers in for example telecom networks. The solution estimates one weighted value of influence out of several Social Network Analysis(SNA) metrics. The novel method for aggregation of several metrics utilizes machine learning to train models. A prototype solution has been implemented on a Hadoop platform to support scalability and to reduce hard ware cost by enabling the usage of commodity computers. The SNA algorithms have been adapted to efficiently execute on the MapReduce distributed platform. The prototype solution has been tested on a Hadoop cluster. The tests have verified that the solution can scale to support networks with millions of subscribers. Both real data from a telecom network operator with 2.4 million subscribers and synthetic data for networks up to 100 million subscribers have been used to verify the scalability and accuracy of the solution. The correlation between metrics have been analyzed to identify the information gain from each metric.