Distributed classification in peer-to-peer networks

Authors:
Ping Luo; Hui Xiong; Kevin Lü; Zhongzhi Shi
Affiliations:
Chinese Academy of Sciences, Beijing, China; Rutgers University, Newark, NJ; Brunel University, London, United Kingdom; Chinese Academy of Sciences, Beijing, China
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

This work studies the problem of distributed classification in peer-to-peer (P2P) networks. While there has been a significant amount of work on distributed classification, most existing algorithms are not designed for P2P networks. Indeed, as server-less and router-less systems, P2P networks impose several challenges for distributed classification: (1) global synchronization is not practical in large-scale P2P networks; (2) the topology changes frequently as peers fail and recover; and (3) the data on each peer is frequently updated on the fly. In this paper, we propose an ensemble paradigm for distributed classification in P2P networks. Under this paradigm, each peer builds its local classifiers on its local data, and the results from all local classifiers are then combined by plurality voting. To build local classifiers, we adopt the pasting bites learning algorithm to generate multiple local classifiers on each peer from its local data. To combine local results, we propose a general form of the Distributed Plurality Voting (DPV) protocol for dynamic P2P networks. This protocol maintains single-site validity in dynamic networks and supports both one-shot query and continuous monitoring computing modes. We theoretically prove that the condition ω0 for sending messages used in DPV0 is locally communication-optimal for achieving the above properties. Finally, experimental results on real-world P2P networks show that: (1) the proposed ensemble paradigm is effective even when there are thousands of local classifiers; (2) in most cases, the DPV0 algorithm is local in the sense that voting is processed using information gathered from a very small vicinity, whose size is independent of the network size; and (3) DPV0 is significantly more communication-efficient than existing algorithms for distributed plurality voting.
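
To make the combination step concrete, the following is a minimal sketch of plurality voting over per-peer vote counts. It is a centralized illustration only and does not implement the DPV protocol or its message-sending condition; the function name `plurality_vote`, the dictionary layout of the votes, and the alphabetical tie-break are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from typing import Iterable, Mapping


def plurality_vote(peer_votes: Iterable[Mapping[str, int]]) -> str:
    """Return the class label with the most votes across all peers.

    Each element of `peer_votes` maps a class label to the number of
    local classifiers on one peer that predicted that label
    (hypothetical layout, assumed for illustration).
    """
    totals: Counter = Counter()
    for votes in peer_votes:
        # Counter.update adds the counts of a mapping to the running totals.
        totals.update(votes)
    if not totals:
        raise ValueError("no votes were supplied")
    # Plurality: the label with the largest total wins; no majority is needed.
    # Ties are broken alphabetically (an assumption of this sketch).
    return min(totals, key=lambda label: (-totals[label], label))


# Three peers, each reporting how many of its local classifiers voted per label.
votes_per_peer = [
    {"spam": 2, "ham": 1},
    {"ham": 3},
    {"spam": 1, "other": 1},
]
winner = plurality_vote(votes_per_peer)
```

In a real P2P setting no single node would see every vote vector; the sketch only shows the quantity that a distributed voting protocol has to agree on.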