Attribute reduction for massive data based on rough set theory and MapReduce

Authors:
Yong Yang;Zhengrong Chen;Zhu Liang;Guoyin Wang
Affiliations:
Institute of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing, P.R. China;Institute of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing, P.R. China;Institute of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing, P.R. China;Institute of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing, P.R. China
Venue:
RSKT'10 Proceedings of the 5th international conference on Rough set and knowledge technology
Year:
2010

Abstract

Data processing and knowledge discovery for massive data have long been active topics in data mining, and with the arrival of the cloud computing era, mining massive data is becoming a prominent research topic. This paper studies attribute reduction for massive data based on rough set theory. The MapReduce parallel programming model is introduced and combined with a rough set attribute reduction algorithm, yielding a parallel attribute reduction algorithm based on MapReduce. Experimental results show that the proposed method is more efficient for massive data mining than the traditional method, and that it is an effective method for data mining on cloud computing platforms.
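
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of how attribute reduction might be phrased in map and reduce steps, not the authors' method. It assumes a positive-region-based greedy heuristic, rows laid out as condition values followed by one decision value, and plain Python functions standing in for a real MapReduce runtime. All function names (map_split, reduce_classes, positive_region_size, greedy_reduct) and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_split(rows, attrs):
    """Map step (assumed): for each row of one data split, emit
    (values of the selected condition attributes, decision value)."""
    for *cond, decision in rows:
        yield tuple(cond[a] for a in attrs), decision


def reduce_classes(pairs):
    """Reduce step (assumed): group emitted pairs by key, so each key is one
    equivalence class with its set of decisions and its object count."""
    groups = defaultdict(lambda: [set(), 0])
    for key, decision in pairs:
        groups[key][0].add(decision)
        groups[key][1] += 1
    return groups


def positive_region_size(splits, attrs):
    """Count objects whose equivalence class has a single decision value."""
    pairs = chain.from_iterable(map_split(split, attrs) for split in splits)
    groups = reduce_classes(pairs)
    return sum(count for decisions, count in groups.values()
               if len(decisions) == 1)


def greedy_reduct(splits, n_attrs):
    """Greedy forward selection: add the attribute that enlarges the positive
    region most, until it matches that of the full attribute set."""
    target = positive_region_size(splits, list(range(n_attrs)))
    chosen = []
    while positive_region_size(splits, chosen) < target:
        remaining = [a for a in range(n_attrs) if a not in chosen]
        best = max(remaining,
                   key=lambda a: positive_region_size(splits, chosen + [a]))
        chosen.append(best)
    return chosen


# Hypothetical decision table split into two blocks: (a0, a1, a2, decision).
splits = [
    [(0, 0, 1, "n"), (0, 1, 1, "y")],
    [(1, 0, 0, "n"), (1, 1, 0, "y"), (0, 0, 1, "n")],
]
reduct = greedy_reduct(splits, 3)
print(reduct)  # [1]
```

In this toy table the decision depends only on attribute a1, so the sketch selects that single attribute. On a real cluster the map step would run once per data block and the grouping would be done by the framework's shuffle; the greedy result is a heuristic and is not guaranteed to be a minimal reduct.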