Distributed cooperative mining for information consortia

Authors:
Satoshi Morinaga;Kenji Yamanishi;Jun-ichi Takeuchi
Affiliations:
NEC Corporation, Miyazaki, Miyamae, Kawasaki, Kanagawa;NEC Corporation, Miyazaki, Miyamae, Kawasaki, Kanagawa;NEC Corporation, Miyazaki, Miyamae, Kawasaki, Kanagawa
Venue:
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2003

Abstract

We consider a setting in which a number of distributed agents each collect a data sequence generated according to an unknown probability distribution. Each distribution is specified by parameters common to all agents and parameters individual to each agent, e.g., normal distributions with an identical mean and different variances. We introduce the notion of an information consortium, a framework in which the agents cannot show raw data to one another but still wish to obtain a significant information gain in estimating their respective distributions. Such information consortia have recently attracted much interest in a broad range of areas, including financial risk management and ubiquitous network mining. In this paper we address three issues: 1) how to design a collaborative strategy by which the agents estimate their respective distributions in the information consortium, 2) when each agent benefits, in terms of information gain for estimating its distribution or information loss for predicting future data, and 3) how much benefit each agent obtains. We give a statistical formulation of information consortia and solve all three problems for a general form of probability distributions. Specifically, we propose a basic strategy for cooperative estimation and derive a necessary and sufficient condition for each agent to obtain a significant benefit.
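The normal-distribution example in the abstract can be made concrete with a small sketch. The code below is an illustration only and is not the strategy proposed in the paper: it assumes each agent shares a summary (sample size, sample mean, sample variance) in place of raw data, and it combines the summaries with an inverse-variance weighted mean, a standard estimator chosen here for simplicity. The names `local_summary` and `pooled_mean`, the sample sizes, and the parameter values are all hypothetical.

```python
import random
import statistics


def local_summary(data):
    """Summary an agent shares instead of raw data (an assumption of this sketch).

    Returns (sample size, sample mean, sample variance).
    """
    return len(data), statistics.fmean(data), statistics.variance(data)


def pooled_mean(summaries):
    """Inverse-variance weighted estimate of the common mean.

    Each agent's mean is weighted by n / variance, so agents with
    noisier data contribute less to the shared estimate.
    """
    weights = [n / var for n, _, var in summaries]
    weighted_sum = sum(w * mean for w, (_, mean, _) in zip(weights, summaries))
    return weighted_sum / sum(weights)


# Hypothetical consortium: three agents, one common mean, different variances.
rng = random.Random(0)
true_mean = 5.0
sigmas = [0.5, 2.0, 4.0]
datasets = [[rng.gauss(true_mean, s) for _ in range(200)] for s in sigmas]

# Only the summaries leave each agent; the raw sequences stay local.
summaries = [local_summary(d) for d in datasets]
common = pooled_mean(summaries)

for sigma, (n, mean, var) in zip(sigmas, summaries):
    print(f"sigma={sigma}: local mean={mean:.3f}, local variance={var:.3f}")
print(f"pooled estimate of the common mean: {common:.3f}")
```

In this sketch each agent keeps its own variance estimate (the individual parameter) and adopts the pooled value for the mean (the common parameter). Whether a given agent actually gains from doing so is the question the paper's necessary and sufficient condition addresses; the sketch does not attempt to answer it.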