A framework for incorporating general domain knowledge into latent Dirichlet allocation using first-order logic

  • Authors:
  • David Andrzejewski;Xiaojin Zhu;Mark Craven;Benjamin Recht

  • Affiliations:
  • Lawrence Livermore National Laboratory;University of Wisconsin-Madison;University of Wisconsin-Madison;University of Wisconsin-Madison

  • Venue:
  • IJCAI'11 Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Two
  • Year:
  • 2011

Abstract

Topic models have been used successfully for a variety of problems, often in the form of application-specific extensions of the basic Latent Dirichlet Allocation (LDA) model. Because deriving these new models in order to encode domain knowledge can be difficult and time-consuming, we propose the Fold·all model, which allows the user to specify general domain knowledge in First-Order Logic (FOL). However, combining topic modeling with FOL can result in inference problems beyond the capabilities of existing techniques. We have therefore developed a scalable inference technique using stochastic gradient descent which may also be useful to the Markov Logic Network (MLN) research community. Experiments demonstrate the expressive power of Fold·all, as well as the scalability of our proposed inference method.