Tractable learning of large Bayes net structures from sparse data

Authors:
Anna Goldenberg;Andrew Moore
Affiliations:
Center for Automated Learning and Discovery, Pittsburgh, PA;Robotics Institute, Pittsburgh, PA
Venue:
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Year:
2004

Citing 12
Cited 15

A Bayesian method for constructing Bayesian belief networks from databases

Proceedings of the seventh conference (1991) on Uncertainty in artificial intelligence
Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Learning Bayesian Networks: The Combination of Knowledge and Statistical Data

Machine Learning
Data mining: concepts and techniques

Data mining: concepts and techniques
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Mining complex models from arbitrarily large databases in constant time

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
An accelerated Chow and Liu algorithm: fitting tree distributions to high dimensional sparse data

An accelerated Chow and Liu algorithm: fitting tree distributions to high dimensional sparse data
Beyond Independence: Probabilistic Models for Query Approximation on Binary Transaction Data

IEEE Transactions on Knowledge and Data Engineering
Cached sufficient statistics for efficient machine learning with large datasets

Journal of Artificial Intelligence Research
Fast learning from sparse data

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Learning Bayesian network structure from massive datasets: the "sparse candidate" algorithm

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Empirical analysis of predictive algorithms for collaborative filtering

UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence

Bayes net graphs to understand co-authorship networks?

Proceedings of the 3rd international workshop on Link discovery
Sequential update of ADtrees

ICML '06 Proceedings of the 23rd international conference on Machine learning
Dependency trees in sub-linear time and bounded memory

The VLDB Journal — The International Journal on Very Large Data Bases
The max-min hill-climbing Bayesian network structure learning algorithm

Machine Learning
Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm

The Journal of Machine Learning Research
Bayesian Substructure Learning - Approximate Learning of Very Large Network Structures

ECML '07 Proceedings of the 18th European conference on Machine Learning
Folksonomy-Based Collabulary Learning

ISWC '08 Proceedings of the 7th International Conference on The Semantic Web
Learning dynamic temporal graphs for oil-production equipment monitoring system

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Markov blanket feature selection for support vector machines

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Exploiting association and correlation rules parameters for learning Bayesian networks

Intelligent Data Analysis
A Survey of Statistical Network Models

Foundations and Trends® in Machine Learning
Learning approximate MRFs from large transactional data

ICML'06 Proceedings of the 2006 conference on Statistical network analysis
Learning approximate MRFs from large transaction data

PKDD'06 Proceedings of the 10th European conference on Principles and Practice of Knowledge Discovery in Databases
Review: learning bayesian networks: Approaches and issues

The Knowledge Engineering Review
Learning hierarchical bayesian networks for large-scale data analysis

ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part I

Abstract

This paper addresses three questions. Is it useful to attempt to learn a Bayesian network structure with hundreds of thousands of nodes? How should such a structure search proceed in practice? The third question arises out of our approach to the second: how can Frequent Sets (Agrawal et al., 1993), which are extremely popular in the area of descriptive data mining, be turned into a probabilistic model?

Large sparse datasets with hundreds of thousands of records and attributes appear in social networks, warehousing, supermarket transactions and web logs. The complexity of structural search has made learning factored probabilistic models on such datasets infeasible. We propose to use Frequent Sets to significantly speed up the structural search. Unlike previous approaches, we not only cache n-way sufficient statistics, but also exploit their local structure. We also present an empirical evaluation of our algorithm applied to several massive datasets.
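To make the idea of caching n-way sufficient statistics concrete, the following is a minimal sketch, not the authors' algorithm. It assumes binary attributes stored sparsely as sets of present items, uses a plain level-wise (Apriori-style) pass to count Frequent Sets, and then reads a conditional probability off the cached counts. The function names `frequent_sets` and `cond_prob`, the `max_size` cap, and the toy transactions are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_sets(transactions, min_support, max_size=3):
    """Count all item sets (up to max_size items) that occur in at least
    min_support transactions, using a level-wise Apriori-style search."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    size = 1
    while frequent and size < max_size:
        size += 1
        # Only items that still appear in some frequent set can extend it.
        items = sorted(set().union(*frequent))
        candidates = {}
        for t in transactions:
            present = [i for i in items if i in t]
            for combo in combinations(present, size):
                # Apriori pruning: every (size-1)-subset must be frequent.
                if all(frozenset(sub) in frequent
                       for sub in combinations(combo, size - 1)):
                    key = frozenset(combo)
                    candidates[key] = candidates.get(key, 0) + 1
        frequent = {s: c for s, c in candidates.items() if c >= min_support}
        result.update(frequent)
    return result


def cond_prob(stats, child, parents):
    """Estimate P(child=1 | all parents=1) from cached counts.
    Returns None when a needed count was not frequent enough to be cached."""
    base = stats.get(frozenset(parents))
    joint = stats.get(frozenset(parents) | {child})
    if base is None or joint is None:
        return None
    return joint / base


# Toy sparse data: each transaction lists only the attributes that are "on".
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
    {"eggs"},
]
stats = frequent_sets(transactions, min_support=2)
p_milk_given_bread = cond_prob(stats, "milk", ["bread"])
p_eggs_given_milk = cond_prob(stats, "eggs", ["milk"])
```

In this sketch a structure search would score a candidate parent set by looking up counts in `stats` rather than rescanning the data. Parent sets whose counts fall below the support threshold are simply unavailable, which is one simple way a support threshold can shrink the search space.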