A framework for incorporating general domain knowledge into latent Dirichlet allocation using first-order logic

Authors:
David Andrzejewski;Xiaojin Zhu;Mark Craven;Benjamin Recht
Affiliations:
Lawrence Livermore National Laboratory;University of Wisconsin-Madison;University of Wisconsin-Madison;University of Wisconsin-Madison
Venue:
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
Year:
2011

Citing: 20 references (first list below)
Cited by: 8 papers (second list below)

Exponentiated gradient versus gradient descent for linear predictors

Information and Computation
Latent dirichlet allocation

The Journal of Machine Learning Research
Markov logic networks

Machine Learning
A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Statistical Debugging Using Latent Topic Models

ECML '07 Proceedings of the 18th European conference on Machine Learning
Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks

The Journal of Machine Learning Research
Modeling Documents by Combining Semantic Concepts with Unsupervised Statistical Learning

ISWC '08 Proceedings of the 7th International Conference on The Semantic Web
Incorporating domain knowledge into topic modeling via Dirichlet Forest priors

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Sound and efficient inference with probabilistic and deterministic dependencies

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Get out the vote: determining support or opposition from congressional floor-debate transcripts

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Max-Margin Weight Learning for Markov Logic Networks

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Lifted first-order belief propagation

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Hybrid Markov logic networks

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Speeding up inference in Markov logic networks by preprocessing to reduce the size of the resulting grounded network

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Counting belief propagation

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Markov Logic: An Interface Layer for Artificial Intelligence

Markov Logic: An Interface Layer for Artificial Intelligence
Interactive topic modeling

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Incorporating domain knowledge in latent topic models

Incorporating domain knowledge in latent topic models
Mirror descent and nonlinear projected subgradient methods for convex optimization

Operations Research Letters

Evaluating unsupervised learning for natural language processing tasks

EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Transparent user models for personalization

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Multi-dimensional analysis of political documents

NLDB'12 Proceedings of the 17th international conference on Applications of Natural Language Processing and Information Systems
Learning from bullying traces in social media

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Aspect extraction through semi-supervised modeling

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Optimizing temporal topic segmentation for intelligent text visualization

Proceedings of the 2013 international conference on Intelligent user interfaces
Discovering coherent topics using general knowledge

Proceedings of the 22nd ACM international conference on Information & knowledge management
Leveraging multi-domain prior knowledge in topic models

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Abstract

Topic models have been used successfully for a variety of problems, often in the form of application-specific extensions of the basic Latent Dirichlet Allocation (LDA) model. Because deriving these new models in order to encode domain knowledge can be difficult and time-consuming, we propose the Fold·all model, which allows the user to specify general domain knowledge in First-Order Logic (FOL). However, combining topic modeling with FOL can result in inference problems beyond the capabilities of existing techniques. We have therefore developed a scalable inference technique using stochastic gradient descent, which may also be useful to the Markov Logic Network (MLN) research community. Experiments demonstrate the expressive power of Fold·all, as well as the scalability of our proposed inference method.
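
The abstract names stochastic gradient inference over a topic model combined with logic rules but gives no detail. The following is a toy sketch of that general idea only, not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, the relaxation of each token's topic assignment to a point on the probability simplex, the single hypothetical rule "tokens of a seed word prefer a given topic", and the exponentiated-gradient update used to stay on the simplex.

```python
import numpy as np


def eg_step(z_i, grad, step):
    """Exponentiated-gradient ascent step; the result stays on the simplex."""
    z_new = z_i * np.exp(step * grad)
    return z_new / z_new.sum()


def relaxed_inference(words, log_phi, seed_words, seed_topic, rule_weight,
                      n_epochs=50, step=0.1, seed=0):
    """Toy relaxed MAP inference (illustrative assumption, not the paper's method).

    words       : list of word ids, one per token
    log_phi     : array (n_topics, vocab_size) of log topic-word probabilities
    seed_words  : set of word ids covered by the hypothetical rule
    seed_topic  : topic that the rule asks those tokens to take
    rule_weight : weight of each rule grounding in the objective
    """
    rng = np.random.default_rng(seed)
    n_topics = log_phi.shape[0]
    # Relaxed assignments: each row is a distribution over topics.
    z = np.full((len(words), n_topics), 1.0 / n_topics)

    # The objective is a sum of terms: one likelihood term per token,
    # plus one rule term per token whose word is a seed word.
    terms = [("lik", i) for i in range(len(words))]
    terms += [("rule", i) for i, w in enumerate(words) if w in seed_words]

    for _ in range(n_epochs):
        # Visit single terms in random order: one stochastic step per term.
        for idx in rng.permutation(len(terms)):
            kind, i = terms[idx]
            if kind == "lik":
                # Gradient of sum_t z[i, t] * log_phi[t, w_i] w.r.t. z[i].
                grad = log_phi[:, words[i]]
            else:
                # Gradient of rule_weight * z[i, seed_topic] w.r.t. z[i].
                grad = np.zeros(n_topics)
                grad[seed_topic] = rule_weight
            z[i] = eg_step(z[i], grad, step)
    return z
```

As a usage example under the same assumptions, calling `relaxed_inference([0, 0, 1, 1], np.log(np.array([[0.6, 0.4], [0.4, 0.6]])), {0}, 1, 2.0)` pulls the two tokens of word 0 toward topic 1 even though the likelihood alone favors topic 0 for them. Because each step touches one term, the cost per update does not grow with the number of rule groundings, which is the property that makes this style of inference attractive at scale.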