One theme in all views: modeling consensus topics in multiple contexts

Authors:
Jian Tang;Ming Zhang;Qiaozhu Mei
Affiliations:
Peking University, Beijing, China;Peking University, Beijing, China;University of Michigan, Ann Arbor, USA
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013



Abstract

Classical topic models face new challenges when applied to social media, where user-generated content suffers from severe data sparseness. A variety of heuristic adjustments to these models have been proposed, many of which use context information to improve topic modeling performance. Existing contextualized topic models rely on arbitrary manipulation of the model structure, incorporating various context variables into the generative process of classical topic models in an ad hoc manner. Such manipulations usually result in much more complicated model structures, more sophisticated inference procedures, and poor generalizability to arbitrary types or combinations of contexts. In this paper we explore a different direction. We propose a general solution that exploits multiple types of contexts without arbitrary manipulation of the structure of classical topic models. We formulate different types of contexts as multiple views of the partition of the corpus. A co-regularization framework is proposed to let these views collaborate with each other, vote for the consensus topics, and distinguish them from view-specific topics. Experiments with real-world datasets show that the proposed method is both effective and flexible enough to handle arbitrary types of contexts.
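
To make the "multiple views of the partition of the corpus" idea concrete, here is a minimal Python sketch of one possible reading: each context type (for example author or posting time) groups the same short documents into a different set of pooled pseudo-documents. This is an illustration only, not the authors' implementation. The function `partition_by_context` and the field names `user`, `hour`, and `tokens` are hypothetical, and the sketch covers only the partitioning step, not the co-regularization framework.

```python
from collections import defaultdict


def partition_by_context(docs, context_key):
    """Pool the tokens of all documents that share the same context value.

    Each distinct value of `context_key` yields one pseudo-document, so one
    context type induces one partition (one "view") of the corpus.
    """
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[context_key]].extend(doc["tokens"])
    return dict(groups)


# Toy corpus; the field names are assumptions made for this illustration.
docs = [
    {"user": "alice", "hour": 9, "tokens": ["topic", "model"]},
    {"user": "bob", "hour": 9, "tokens": ["sparse", "tweets"]},
    {"user": "alice", "hour": 17, "tokens": ["social", "media"]},
]

# One view per context type: the same corpus, partitioned in two ways.
views = {key: partition_by_context(docs, key) for key in ("user", "hour")}
```

In this reading, a classical topic model could be fit separately on the pseudo-documents of each view, and a co-regularization step would then encourage the views to agree on shared topics. How that agreement is enforced is specified in the paper itself and is not reproduced here.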