This paper explores correspondence and mixture topic modeling of documents tagged from two different perspectives. Existing work on topic modeling of tagged documents (tag-topic models) typically assumes that words and tags reflect a single perspective, namely document content. However, words in documents can also be tagged from other perspectives, for example a syntactic perspective, as in part-of-speech tagging, or an opinion perspective, as in sentiment tagging. The proposed models are novel in three respects: (i) they consider two different tag perspectives, a document-level perspective that is relevant to the document as a whole and a word-level perspective pertaining to each word in the document; (ii) they associate latent topics with word-level tags and, for multimedia documents, label latent topics with images; and (iii) they discover possible correspondences between words and document-level tags. The proposed correspondence tag-topic model shows better predictive power, that is, higher likelihood on held-out test data, than all existing tag-topic models and even a supervised topic model. To evaluate the models in practical scenarios, the paper explores quantitative measures that compare model outputs against ground-truth domain knowledge. Manually assigned (gold-standard) document category labels in Wikipedia pages are used to validate model-generated tag suggestions, using a measure of pairwise concept similarity within an ontological hierarchy such as WordNet. On a news corpus, automatic relationship discovery between person names was performed and compared to a robust baseline.
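As an illustration of the tag-suggestion evaluation described above, the following is a minimal sketch of pairwise concept similarity in an is-a hierarchy. The abstract does not say which similarity measure is used, so the sketch assumes a path-based score, 1 / (1 + shortest path length), which is one common WordNet-style choice. The toy hierarchy and the function names (`TOY_HYPERNYMS`, `path_similarity`, `mean_pairwise_similarity`) are hypothetical and are not taken from the paper or from WordNet.

```python
# Hypothetical toy is-a hierarchy (child -> parent). It stands in for an
# ontology such as WordNet and is not taken from the paper.
TOY_HYPERNYMS = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def ancestors_with_distance(concept, hypernyms):
    """Map the concept and each of its ancestors to its is-a edge distance."""
    distances = {}
    node, steps = concept, 0
    while node is not None:
        distances[node] = steps
        node = hypernyms.get(node)  # None once the root (or an unknown node) is reached
        steps += 1
    return distances


def path_similarity(a, b, hypernyms):
    """Assumed measure: 1 / (1 + shortest path between a and b via a shared ancestor)."""
    up_a = ancestors_with_distance(a, hypernyms)
    up_b = ancestors_with_distance(b, hypernyms)
    shared = set(up_a) & set(up_b)
    if not shared:
        return 0.0  # concepts with no common ancestor are treated as unrelated
    shortest = min(up_a[c] + up_b[c] for c in shared)
    return 1.0 / (1.0 + shortest)


def mean_pairwise_similarity(suggested, gold, hypernyms):
    """Average similarity over every (suggested tag, gold label) pair."""
    pairs = [(s, g) for s in suggested for g in gold]
    if not pairs:
        return 0.0
    return sum(path_similarity(s, g, hypernyms) for s, g in pairs) / len(pairs)


if __name__ == "__main__":
    suggested_tags = ["dog", "sparrow"]  # hypothetical model output
    gold_labels = ["wolf", "cat"]        # hypothetical gold-standard labels
    print(mean_pairwise_similarity(suggested_tags, gold_labels, TOY_HYPERNYMS))
```

Averaging over all pairs is only one possible aggregation; taking the best-matching gold label for each suggested tag would be an equally plausible choice under the same assumptions.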