Satrap: data and network heterogeneity aware P2P data-mining

  • Authors:
  • Hock Hee Ang; Vivekanand Gopalkrishnan; Anwitaman Datta; Wee Keong Ng; Steven C. H. Hoi

  • Affiliations:
  • Nanyang Technological University, Singapore;Nanyang Technological University, Singapore;Nanyang Technological University, Singapore;Nanyang Technological University, Singapore;Nanyang Technological University, Singapore

  • Venue:
  • PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
  • Year:
  • 2010

Abstract

Distributed classification aims to build an accurate classifier by learning from distributed data while reducing computation and communication cost. A P2P network where numerous users come together to share resources like data content, bandwidth, storage space and CPU resources is an excellent platform for distributed classification. However, two important aspects of the learning environment have often been overlooked by other works, viz., 1) location of the peers which results in variable communication cost and 2) heterogeneity of the peers' data which can help reduce redundant communication. In this paper, we examine the properties of network and data heterogeneity and propose a simple yet efficient P2P classification approach that minimizes expensive inter-region communication while achieving good generalization performance. Experimental results demonstrate the feasibility and effectiveness of the proposed solution.