DisTec: Towards a Distributed System for Telecom Computing

  • Authors:
  • Shengqi Yang; Bai Wang; Haizhou Zhao; Yuan Gao; Bin Wu

  • Affiliations:
  • Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, China (all authors)

  • Venue:
  • CloudCom '09 Proceedings of the 1st International Conference on Cloud Computing
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Abstract

The volume and complexity of information continue to grow exponentially, while the computing capacity of silicon-based devices remains bounded by Moore's Law. This gap poses a new challenge for analysts, researchers and intelligence providers. In response, a new class of techniques and computing platforms that focus mainly on scalability and parallelism, such as the Map-Reduce model, has been emerging. In this paper, to move from scientific prototype toward practice, we describe a prototype of our applied distributed system, DisTec, for knowledge discovery from a social network perspective in the field of telecommunications. The major infrastructure is built on Hadoop, an open-source counterpart of Google's Map-Reduce. We carefully designed the system to carry out mining tasks on terabytes of call records. To illustrate its functionality, DisTec is applied to a real-world, large-scale telecom dataset. The experiments range from initial raw data preprocessing to final knowledge extraction. We demonstrate that our system performs well in such cloud-scale data computing.
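
As a rough illustration of the Map-Reduce pattern the abstract refers to, the sketch below builds a weighted call graph from call records in plain Python. It is not taken from DisTec and does not use Hadoop: the record layout (`caller,callee,duration_seconds`) and the function names `map_record`, `shuffle`, `reduce_edge` and `call_graph` are assumptions made only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Edge = Tuple[str, str]  # (caller, callee)


def map_record(line: str) -> Iterator[Tuple[Edge, int]]:
    """Map step: emit ((caller, callee), 1) for one call record.

    Assumed (hypothetical) record layout: 'caller,callee,duration_seconds'.
    """
    fields = line.strip().split(",")
    if len(fields) != 3:
        return  # skip malformed records instead of failing the whole job
    caller, callee, _duration = fields
    yield (caller, callee), 1


def shuffle(pairs: Iterable[Tuple[Edge, int]]) -> Dict[Edge, List[int]]:
    """Group mapped values by key, as a Map-Reduce framework does between phases."""
    groups: Dict[Edge, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_edge(edge: Edge, counts: List[int]) -> Tuple[Edge, int]:
    """Reduce step: sum the per-record counts into one edge weight."""
    return edge, sum(counts)


def call_graph(lines: Iterable[str]) -> Dict[Edge, int]:
    """Run map, shuffle and reduce in-process to get call counts per edge."""
    mapped = (pair for line in lines for pair in map_record(line))
    return dict(reduce_edge(edge, counts) for edge, counts in shuffle(mapped).items())


if __name__ == "__main__":
    records = ["A,B,60", "A,B,30", "B,C,10", "bad line"]
    print(call_graph(records))
```

On a real cluster the map and reduce functions would run in parallel across many machines, with the framework handling the grouping step; here all three phases run in a single process purely to show the data flow.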