Incorporating word correlation into tag-topic model for semantic knowledge acquisition

  • Authors:
  • Fang Li;Tingting He;Xinhui Tu;Xiaohua Hu

  • Affiliations:
  • Central China Normal University, Wuhan, China;Central China Normal University, Wuhan, China;Central China Normal University, Wuhan, China;Central China Normal University, Wuhan, China & Drexel University, Philadelphia, PA, USA

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a tag-topic model with Dirichlet Forest prior (TTM-DF) for semantic knowledge acquisition from blog. The TTM-DF model extends the tag-topic model (TTM) by replacing the Dirichlet prior with the Dirichlet Forest prior over the topic-word multinomial. The correlation between words are calculated to generate a set of Must-Links and Cannot-Links, then the structures of Dirichlet trees are obtained though encoding the constraints of Must-Links and Cannot-Links. Words under the same subtrees are expected to be more correlated than words under different subtrees. We conduct experiments on a synthetic and a blog dataset. Both of the experimental results show that the TTM-DF model performs much better than the TTM model. It can improve the coherence of the underlying topics and the tag-topic distributions, and capture semantic knowledge effectively.