LSA-PTM: a propagation-based topic model using latent semantic analysis on heterogeneous information networks

  • Authors:
  • Qian Wang; Zhaohui Peng; Fei Jiang; Qingzhong Li

  • Affiliations:
  • School of Computer Science and Technology, Shandong University, Jinan, China; Shandong Provincial Key Laboratory of Software Engineering, China (all four authors)

  • Venue:
  • WAIM'13: Proceedings of the 14th International Conference on Web-Age Information Management
  • Year:
  • 2013

Abstract

Topic modeling on information networks is important for data analysis. Although many advanced techniques exist for this task, few methods address heterogeneous information networks or consider the readability of the discovered topics. In this paper, we study the problem of topic modeling on heterogeneous information networks and propose LSA-PTM. LSA-PTM first extracts meaningful frequent phrases from documents collected from a heterogeneous information network. Latent semantic analysis (LSA) is then conducted on these phrases to obtain the inherent topics of the documents. Next, we introduce a topic propagation method that propagates the topics obtained by LSA across the heterogeneous information network via the links between different objects, which optimizes the topics and identifies clusters of multi-typed objects simultaneously. To make the topics more understandable, a topic description is calculated for each discovered topic. We apply LSA-PTM to real data, and the experimental results demonstrate its effectiveness.
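
The propagation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything here is an assumption made for illustration: the function names (`propagate_topics`, `dominant_topic`), the mixing weight `alpha`, the simple neighbor-averaging rule, and the toy data are hypothetical and are not the authors' formulation. The sketch assumes each document already has a topic distribution (for example, from an LSA step) and that other object types, such as authors or venues, are linked to documents.

```python
def normalize(vec):
    """Scale a non-negative vector so it sums to 1 (uniform if all zeros)."""
    total = sum(vec)
    if total == 0:
        return [1.0 / len(vec)] * len(vec)
    return [v / total for v in vec]


def mean(vectors):
    """Component-wise average of equally sized vectors."""
    k = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(k)]


def object_topics(docs, links):
    """Each linked object takes the average topic distribution of its documents."""
    return {obj: mean([docs[d] for d in linked])
            for obj, linked in links.items() if linked}


def propagate_topics(doc_topics, links, alpha=0.5, iterations=3):
    """Assumed propagation rule (not taken from the paper).

    doc_topics: dict mapping document id -> list of topic weights
    links:      dict mapping object id (author, venue, ...) -> list of document ids
    alpha:      hypothetical weight given to the neighbors' topics
    """
    docs = {d: normalize(t) for d, t in doc_topics.items()}

    # Reverse index: which objects is each document linked to?
    doc_neighbors = {d: [] for d in docs}
    for obj, linked in links.items():
        for d in linked:
            doc_neighbors[d].append(obj)

    for _ in range(iterations):
        objects = object_topics(docs, links)
        updated = {}
        for d, own in docs.items():
            nbrs = [objects[o] for o in doc_neighbors[d] if o in objects]
            if not nbrs:
                updated[d] = own  # isolated document keeps its topics
                continue
            avg = mean(nbrs)
            mixed = [(1 - alpha) * x + alpha * y for x, y in zip(own, avg)]
            updated[d] = normalize(mixed)
        docs = updated

    # Recompute object distributions from the final document distributions.
    return docs, object_topics(docs, links)


def dominant_topic(vec):
    """Index of the largest topic weight, used here as a cluster label."""
    return max(range(len(vec)), key=vec.__getitem__)


# Toy network: three documents, two authors, one venue (made-up data).
example_docs = {"d1": [0.9, 0.1], "d2": [0.6, 0.4], "d3": [0.1, 0.9]}
example_links = {
    "author:A": ["d1", "d2"],
    "author:B": ["d3"],
    "venue:V": ["d2", "d3"],
}

docs_out, objects_out = propagate_topics(example_docs, example_links)
for name, dist in sorted(objects_out.items()):
    print(name, [round(p, 3) for p in dist], "cluster", dominant_topic(dist))
```

In this sketch, objects of every type end up with a topic distribution in the same space as the documents, so taking the dominant topic groups multi-typed objects together. With a connected network, many iterations of plain averaging pull all distributions toward a common value, which is why the number of iterations is kept small here.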