Data sharing services on the web host huge amounts of resources supplied and accessed by millions of users around the world. While the classical approach relies on central control over the data set, even when the data set itself is distributed, there is growing interest in decentralized solutions because of their desirable properties (in particular, privacy and scalability). In this paper, we explore a machine learning aspect of this line of work. We propose a novel technique for decentralized estimation of probabilistic mixture models, which are among the most versatile generative models for understanding data sets. More precisely, we demonstrate how to estimate a global mixture model from a set of local models. Our approach accommodates dynamic topology and data sources and is statistically robust, i.e. resilient to the presence of unreliable local models. Such outlier models may arise from local data that are outliers with respect to the global trend, or from poor mixture estimation. We report experiments on synthetic data and on real geo-location data from Flickr.
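To make the idea of building a global mixture from local ones concrete, here is a minimal sketch of the simplest possible baseline. It is not the technique proposed in the paper. It assumes one-dimensional Gaussian components and assumes each node reports how many data points its local model was fitted on. The names `Component`, `LocalModel` and `pool_local_mixtures` are hypothetical. The sketch simply concatenates all local components and rescales their weights by each node's share of the data, which yields a valid mixture. It does nothing to reduce the number of components, to discount unreliable local models, or to run without a central collector, which are the aspects the abstract highlights.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Component:
    weight: float  # mixing proportion within its own mixture
    mean: float    # mean of a 1-D Gaussian (1-D only to keep the sketch short)
    var: float     # variance of the 1-D Gaussian


# Assumption: a local model is (number of local data points, its components).
LocalModel = Tuple[int, List[Component]]


def pool_local_mixtures(models: List[LocalModel]) -> List[Component]:
    """Concatenate local mixtures, weighting each by its share of the data."""
    total = sum(n for n, _ in models)
    if total <= 0:
        raise ValueError("need at least one data point overall")
    pooled = []
    for n, comps in models:
        share = n / total  # fraction of all data held by this node
        for c in comps:
            pooled.append(Component(share * c.weight, c.mean, c.var))
    return pooled


# Two hypothetical nodes: one with a 2-component model, one with a single component.
node_a = (100, [Component(0.5, 0.0, 1.0), Component(0.5, 5.0, 1.0)])
node_b = (300, [Component(1.0, 5.2, 0.8)])
global_model = pool_local_mixtures([node_a, node_b])
```

In this baseline the pooled model grows with the number of nodes and a single badly estimated local model contaminates the result directly, which illustrates why a robust, decentralized aggregation scheme is of interest.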