Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Independent component analysis, a new concept?
Signal Processing - Special issue on higher order statistics
A maximum entropy approach to natural language processing
Computational Linguistics
Inducing Features of Random Fields
IEEE Transactions on Pattern Analysis and Machine Intelligence
Fast discovery of association rules
Advances in knowledge discovery and data mining
Latent semantic indexing: a probabilistic analysis
PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Probabilistic query models for transaction data
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Probabilistic Models for Query Approximation with Large Sparse Binary Data Sets
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Maximally informative k-itemsets and their efficient discovery
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Large 0–1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which indicate why certain methods succeed or fail. We describe simple algorithms for finding topic models from 0–1 data. We give theoretical results showing that the algorithms can discover the ε-separable topic models of Papadimitriou et al. We present empirical results showing that the algorithms find natural topics in real-world data sets. We also briefly discuss the connections to matrix approaches, including nonnegative matrix factorization and independent component analysis.