Latent variable models and factor analysis
Elements of information theory
Factorial Hidden Markov Models. Machine Learning - Special issue on learning with probabilistic representations
Learning in graphical models
Probabilistic latent semantic indexing. Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Neural Computation
An Introduction to Variational Methods for Graphical Models. Machine Learning
Variational Extensions to EM and Multinomial PCA. ECML '02 Proceedings of the 13th European Conference on Machine Learning
Learning to Probabilistically Identify Authoritative Documents. ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
The Journal of Machine Learning Research
Mean field theory for sigmoid belief networks. Journal of Artificial Intelligence Research
Variational probabilistic inference and the QMR-DT network. Journal of Artificial Intelligence Research
Probabilistic latent semantic analysis. UAI '99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Exact inference of hidden structure from sample data in noisy-OR networks. UAI '98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
Preserving the privacy of sensitive relationships in graph data. PinKDD '07 Proceedings of the 1st ACM SIGKDD international conference on Privacy, security, and trust in KDD
A nonnegative blind source separation model for binary test data. IEEE Transactions on Circuits and Systems Part I: Regular Papers
Mining citation information from CiteSeer data. Scientometrics
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Multi-assignment clustering for boolean data. The Journal of Machine Learning Research
We develop a new component analysis framework, the Noisy-Or Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model that assumes the observed high-dimensional binary data are generated by a small number of hidden binary sources combined via noisy-or units. The component analysis procedure is equivalent to learning the NOCA parameters. Since the classical EM formulation of the NOCA learning problem is intractable, we develop a variational approximation to it. We test the NOCA framework on two problems: (1) a synthetic image-decomposition problem and (2) a co-citation data analysis problem for thousands of CiteSeer documents. We demonstrate good performance of the new model on both problems. In addition, we contrast the model with two mixture-based latent-factor models: probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA). The differing assumptions underlying these models lead them to discover different types of structure in co-citation data, illustrating the benefit of NOCA in building our understanding of high-dimensional data sets.
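To make the generative assumption in the abstract concrete, the following is a minimal sketch of sampling binary data from a generic noisy-or latent variable model with hidden binary sources and a leak term. The sizes, variable names (prior, p, leak), and the specific leak parameterization are illustrative assumptions, not the exact NOCA specification or the paper's variational learning procedure.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes (not from the paper): K hidden sources, D observed bits, N examples.
    K, D, N = 4, 20, 1000

    # prior[k]: probability that hidden source k is active.
    prior = rng.uniform(0.2, 0.5, size=K)
    # p[k, d]: probability that an active source k turns on observed bit d.
    p = rng.uniform(0.0, 0.9, size=(K, D))
    # leak[d]: probability that bit d turns on even with no active source (assumed leak term).
    leak = np.full(D, 0.05)

    # Sample the hidden binary sources for N examples.
    s = (rng.random((N, K)) < prior).astype(float)

    # Noisy-or combination: a bit stays off only if the leak and every
    # active source independently fail to turn it on.
    p_off = (1.0 - leak) * np.prod((1.0 - p) ** s[:, :, None], axis=1)
    x = (rng.random((N, D)) < 1.0 - p_off).astype(int)

    print(x.shape, x.mean())

Learning in NOCA runs in the opposite direction: given only x, the source activations s and the parameters p are inferred, which the paper does with a variational approximation to EM rather than the forward sampling shown here.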