Distributed data mining for e-business

  • Authors:
  • Bin Liu;Shu Gui Cao;Wu He

  • Affiliations:
  • State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing, China 100084 and College of Economics and Management, Hebei ...;College of Economics and Management, Hebei University of Science and Technology, Shijiazhuang, China 050018;Center for Learning Technologies, Old Dominion University, Norfolk, USA 23529

  • Venue:
  • Information Technology and Management
  • Year:
  • 2011

Abstract

In the internet-based e-business environment, most business data are distributed, heterogeneous and private. To achieve true business intelligence, mining large amounts of distributed data is necessary. Through a thorough literature review, this paper identifies four main issues in distributed data mining (DDM) systems for e-business and classifies modern DDM systems into three classes with representative samples. To address these identified issues, this paper proposes a novel DDM model named DRHPDM (Data source Relevance-based Hierarchical Parallel Distributed data mining Model). In addition, to improve the quality of the final result, the data sources are divided into a centralized mining layer and a distributed mining layer, according to their relevance. To improve the openness, cross-platform ability, and intelligence of the DDM system, web service and multi-agent technologies are adopted. The feasibility of DRHPDM was verified by building a prototype system and applying it to a web usage mining scenario.
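
The two-layer split described in the abstract can be illustrated with a minimal sketch. The abstract does not define the relevance measure, the rule that assigns a source to a layer, or the mining algorithm, so everything below is an assumption made for illustration: the `DataSource` type, the precomputed `relevance` score, the `threshold` value, the choice to pool highly relevant sources in the centralized layer, and the use of simple item counting as a stand-in for the actual mining step. It is not the DRHPDM implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    relevance: float     # hypothetical precomputed score in [0, 1]
    transactions: list   # each transaction is a set of items (e.g. visited pages)


def split_by_relevance(sources, threshold=0.5):
    """Partition sources into (centralized, distributed) layers.

    Assumption: sources at or above the threshold are pooled centrally;
    the rest are mined where they reside.
    """
    centralized = [s for s in sources if s.relevance >= threshold]
    distributed = [s for s in sources if s.relevance < threshold]
    return centralized, distributed


def count_items(transactions):
    """Stand-in for a mining step: count transactions containing each item."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))
    return counts


def mine(sources, threshold=0.5):
    """Mine pooled data for the centralized layer, then merge local results."""
    centralized, distributed = split_by_relevance(sources, threshold)

    # Centralized layer: pool the raw transactions and mine them once.
    pooled = [t for s in centralized for t in s.transactions]
    total = count_items(pooled)

    # Distributed layer: mine each source locally and merge only the counts,
    # so raw transactions from these sources are never pooled.
    for source in distributed:
        total += count_items(source.transactions)
    return total
```

With three hypothetical web-usage sources, two above the threshold and one below, `mine` pools the first two and merges only the item counts from the third. In a real deployment the local step would run at the remote site, for example behind a web service, rather than in the same process.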