A Bayesian method for constructing Bayesian belief networks from databases
Proceedings of the seventh conference (1991) on Uncertainty in artificial intelligence
Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Data mining: concepts and techniques
Data mining: concepts and techniques
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Mining complex models from arbitrarily large databases in constant time
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
An accelerated Chow and Liu algorithm: fitting tree distributions to high dimensional sparse data
An accelerated Chow and Liu algorithm: fitting tree distributions to high dimensional sparse data
Beyond Independence: Probabilistic Models for Query Approximation on Binary Transaction Data
IEEE Transactions on Knowledge and Data Engineering
Cached sufficient statistics for efficient machine learning with large datasets
Journal of Artificial Intelligence Research
Fast learning from sparse data
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Learning Bayesian network structure from massive datasets: the "sparse candidate" algorithm
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Empirical analysis of predictive algorithms for collaborative filtering
UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Bayes net graphs to understand co-authorship networks?
Proceedings of the 3rd international workshop on Link discovery
ICML '06 Proceedings of the 23rd international conference on Machine learning
Dependency trees in sub-linear time and bounded memory
The VLDB Journal — The International Journal on Very Large Data Bases
Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm
The Journal of Machine Learning Research
Bayesian Substructure Learning - Approximate Learning of Very Large Network Structures
ECML '07 Proceedings of the 18th European conference on Machine Learning
Folksonomy-Based Collabulary Learning
ISWC '08 Proceedings of the 7th International Conference on The Semantic Web
Learning dynamic temporal graphs for oil-production equipment monitoring system
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Markov blanket feature selection for support vector machines
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Exploiting association and correlation rules parameters for learning Bayesian networks
Intelligent Data Analysis
A Survey of Statistical Network Models
Foundations and Trends® in Machine Learning
Learning approximate MRFs from large transactional data
ICML'06 Proceedings of the 2006 conference on Statistical network analysis
Learning approximate MRFs from large transaction data
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Review: learning Bayesian networks: Approaches and issues
The Knowledge Engineering Review
Learning hierarchical Bayesian networks for large-scale data analysis
ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part I
This paper addresses three questions. Is it useful to attempt to learn a Bayesian network structure with hundreds of thousands of nodes? How should such a structure search proceed in practice? The third question arises from our approach to the second: how can Frequent Sets (Agrawal et al., 1993), which are extremely popular in descriptive data mining, be turned into a probabilistic model?

Large sparse datasets with hundreds of thousands of records and attributes appear in social networks, warehousing, supermarket transactions, and web logs. The complexity of structural search has made learning factored probabilistic models on such datasets infeasible. We propose using Frequent Sets to significantly speed up the structural search. Unlike previous approaches, we not only cache n-way sufficient statistics but also exploit their local structure. We also present an empirical evaluation of our algorithm applied to several massive datasets.
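
To make the idea of Frequent Sets as cached sufficient statistics concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes a level-wise, Apriori-style counting pass over sparse transactions and then reads a conditional probability estimate directly from the cached counts. The function names (`frequent_sets`, `conditional_probability`), the toy transactions, and the `min_support` and `max_size` parameters are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(transactions, min_support, max_size=2):
    """Count itemsets of up to max_size items that occur in at least
    min_support transactions (level-wise, Apriori-style; illustrative only)."""
    rows = [frozenset(t) for t in transactions]
    counts = {}
    previous = None  # frequent sets found at the previous level
    for size in range(1, max_size + 1):
        level = Counter()
        for row in rows:
            for combo in combinations(sorted(row), size):
                # Apriori pruning: a set can only be frequent if every
                # subset with one item fewer is frequent.
                if previous is None or all(
                    frozenset(sub) in previous
                    for sub in combinations(combo, size - 1)
                ):
                    level[frozenset(combo)] += 1
        previous = {s: c for s, c in level.items() if c >= min_support}
        if not previous:
            break
        counts.update(previous)
    return counts


def conditional_probability(counts, target, given):
    """Estimate P(target = 1 | all items in `given` = 1) from cached counts.

    `given` must be non-empty. Returns None when either itemset is not
    frequent, i.e. the statistic was never cached.
    """
    joint = counts.get(frozenset([target, *given]))
    base = counts.get(frozenset(given))
    if joint is None or base is None:
        return None
    return joint / base


# Hypothetical toy data: each transaction lists only the items present.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
counts = frequent_sets(transactions, min_support=2, max_size=2)
p_milk_given_bread = conditional_probability(counts, "milk", ["bread"])
```

In this sketch the cached counts play the role of n-way sufficient statistics: any score that needs the co-occurrence count of a frequent itemset can look it up instead of rescanning the data. Itemsets below the support threshold are simply absent, so a real structure search would need some policy for handling them.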