Conditional topical coding: an efficient topic model conditioned on rich features

  • Authors:
  • Jun Zhu; Ni Lao; Ning Chen; Eric P. Xing

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Tsinghua University, Beijing, China; Carnegie Mellon University, Pittsburgh, PA, USA

  • Venue:
  • Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  • Year:
  • 2011

Abstract

Probabilistic topic models have shown remarkable success in many application domains. However, a probabilistic conditional topic model can be extremely inefficient when a rich set of features is considered, because it must define a normalized distribution, which usually involves a hard-to-compute partition function. This paper presents conditional topical coding (CTC), a novel non-probabilistic formulation of conditional topic models. CTC relaxes the normalization constraints of probabilistic models and learns non-negative document codes and word codes. Because it does not need to define a normalized distribution, CTC can efficiently incorporate a rich set of features for improved topic discovery and prediction. Moreover, CTC can directly control the sparsity of the inferred representations through appropriate regularization. We develop an efficient and easy-to-implement coordinate descent learning algorithm in which each coding substep has a closed-form solution. Finally, we demonstrate the advantages of CTC on online review analysis datasets. Our results show that conditional topical coding achieves state-of-the-art prediction performance while being much more efficient than probabilistic conditional topic models in both training (one order of magnitude faster) and testing (two orders of magnitude faster).
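
To give a flavor of the style of algorithm the abstract describes (coordinate descent over non-negative codes, where each substep has a closed-form solution), the following is a minimal illustrative sketch of generic non-negative sparse coding with a squared loss and an L1 penalty. It is not CTC's actual objective or the authors' implementation: CTC conditions on rich features and uses its own loss, so the function name `nonneg_sparse_code`, the penalty weight `lam`, and the squared-error formulation here are all assumptions made for illustration only.

```python
import numpy as np

def nonneg_sparse_code(x, D, lam=0.1, n_iters=50):
    """Illustrative coordinate descent for non-negative sparse coding:
    minimize 0.5 * ||x - D @ theta||^2 + lam * sum(theta), theta >= 0.

    Each coordinate update is a closed-form non-negative
    soft-thresholding step (a sketch, not CTC's actual objective).
    """
    K = D.shape[1]
    theta = np.zeros(K)
    residual = x - D @ theta          # equals x initially
    col_norms = (D ** 2).sum(axis=0)  # squared column norms of D
    for _ in range(n_iters):
        for k in range(K):
            if col_norms[k] == 0:
                continue
            # partial residual with coordinate k's contribution removed
            rk = residual + D[:, k] * theta[k]
            # closed-form substep: soft-threshold the least-squares
            # solution, then project onto the non-negative orthant
            new_k = max(0.0, (D[:, k] @ rk - lam) / col_norms[k])
            residual = rk - D[:, k] * new_k
            theta[k] = new_k
    return theta

# Toy usage with a synthetic non-negative dictionary (hypothetical data)
rng = np.random.default_rng(0)
D = np.abs(rng.normal(size=(100, 20)))
true_theta = np.abs(rng.normal(size=20)) * (rng.random(20) < 0.3)
theta_hat = nonneg_sparse_code(D @ true_theta, D, lam=0.05)
```

Because every substep is a scalar update with a closed-form solution, no line search or partition-function evaluation is needed, which is the source of the efficiency argument the abstract makes against normalized probabilistic formulations.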