Asynchronous peer-to-peer data mining with stochastic gradient descent

Authors:
Róbert Ormándi;István Hegedus;Márk Jelasity
Affiliations:
University of Szeged, Hungary;University of Szeged, Hungary;University of Szeged and Hungarian Academy of Sciences, Hungary
Venue:
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Year:
2011

Abstract

Fully distributed data mining algorithms build global models over large amounts of data distributed over a large number of peers in a network, without moving the data itself. In peer-to-peer (P2P) networks, such algorithms have applications in P2P social networking and in trackerless BitTorrent communities. The main difficulty is to obtain good-quality models with an affordable communication complexity while assuming as little as possible about the communication model. Here we describe a conceptually simple yet powerful generic approach for designing efficient, fully distributed, asynchronous, local algorithms that learn models of fully distributed data. The key idea is that many models perform random walks over the network while being gradually adjusted to fit the data they encounter, using stochastic gradient descent. We demonstrate our approach by implementing the support vector machine (SVM) method and experimentally evaluating its performance in various failure scenarios over different benchmark datasets. Our algorithm scheme can implement a wide range of machine learning methods in an extremely robust manner.
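
The random-walk idea in the abstract can be illustrated with a small single-process sketch. It is not the authors' implementation, and it rests on several assumptions: a Pegasos-style sub-gradient step for a linear SVM, uniform random peer selection standing in for a real peer-sampling service, a single walking model rather than many, and synthetic data with one labeled record per peer. All function names and parameter values are hypothetical.

```python
import random


def sgd_step(w, t, x, y, lam=0.01):
    """One Pegasos-style sub-gradient step for a linear SVM (assumed update rule)."""
    t += 1
    eta = 1.0 / (lam * t)                      # decaying learning rate
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    w = [(1.0 - eta * lam) * wi for wi in w]   # shrink: gradient of the L2 term
    if margin < 1.0:                           # hinge loss is active for this record
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w, t


def random_walk_sgd(peers, steps, lam=0.01, seed=0):
    """Move one model from peer to peer; each visited peer updates it locally."""
    rng = random.Random(seed)
    w, t = [0.0] * len(peers[0][0]), 0
    for _ in range(steps):
        # Assumption: the next peer is drawn uniformly at random.
        x, y = peers[rng.randrange(len(peers))]
        w, t = sgd_step(w, t, x, y, lam)       # only the model moves, not the data
    return w


def make_peers(n, seed=1):
    """Synthetic data: each peer holds one (features, label) record."""
    rng = random.Random(seed)
    peers = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        x = [2.0 * y + rng.gauss(0, 0.5), rng.gauss(0, 1.0), 1.0]  # last entry: bias
        peers.append((x, y))
    return peers


def accuracy(w, peers):
    """Fraction of peers whose record the model classifies correctly."""
    hits = sum(1 for x, y in peers
               if y * sum(wi * xi for wi, xi in zip(w, x)) > 0)
    return hits / len(peers)


peers = make_peers(200)
w = random_walk_sgd(peers, steps=2000)
acc = accuracy(w, peers)
print(f"accuracy over all peers: {acc:.3f}")
```

In this sketch the only state that travels is the weight vector `w` and the step counter `t`, which mirrors the abstract's point that the model moves while the data stays at the peers. Running several such walks concurrently, or injecting message loss and churn, would be natural extensions but are not modeled here.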