BC-PDM: data mining, social network analysis and text mining system based on cloud computing

  • Authors:
  • Le Yu; Jian Zheng; Wei Chong Shen; Bin Wu; Bai Wang; Long Qian; Bo Ren Zhang

  • Affiliations:
  • Beijing University of Posts and Telecommunication, Beijing, China (all authors)

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Abstract

A telecom BI (Business Intelligence) system consists of a set of application programs and technologies for gathering, storing, analyzing, and providing access to data, which help manage business information and support precise decision making. However, traditional analysis algorithms face new challenges as telecom data continues to grow exponentially in both volume and complexity. With the development of cloud computing, several parallel data analysis systems have emerged. However, existing systems rarely offer comprehensive functionality: they provide either data analysis services or social network analysis, but not both. A comprehensive tool is needed to store and analyze large-scale data efficiently. In response to this challenge, BC-PDM (Big Cloud-Parallel Data Mining), a SaaS (Software-as-a-Service) BI system, is proposed. BC-PDM supports parallel ETL processing, statistical analysis, data mining, text mining, and social network analysis, all built on Hadoop. This demo presents three tasks on a real telecom data set: business recommendation, customer community detection, and user preference classification. Experimental results show that BC-PDM is efficient and effective for intelligent data analysis.