Enterprises have steadily accumulated both structured and unstructured data as computing resources improve. However, previous research on enterprise data mining often treats these two kinds of data independently and overlooks their mutual benefits. We explore an approach that incorporates a common type of structured data (i.e., the organigram) into a generative topic model. Our approach, the Partially Observed Topic model (POT), considers not only the unstructured words but also the structured information in its generative process. By integrating the structured data implicitly, the topic mixture of each document is partially observed during the Gibbs sampling procedure. This allows POT to learn topics in a targeted, directed manner, which makes it easy to tune and suitable for end-use applications. We evaluate our proposed model on a real-world dataset and show improved expressiveness over traditional LDA. In the task of document classification, POT also demonstrates more discriminative power than LDA.
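The core mechanism described above — topic assignments that are "partially observed" because structured metadata restricts which topics a document may use — can be sketched as a constrained collapsed Gibbs sampler. The abstract does not give POT's exact generative process, so the following is a minimal illustrative sketch, assuming each document comes with an observed set of allowed topics (e.g., derived from its author's position in the organigram); the function name and interface are hypothetical, not from the paper.

```python
import numpy as np

def constrained_gibbs(docs, allowed, n_topics, n_vocab, n_iter=50,
                      alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for an LDA-like model where each document's
    topic assignments are restricted to an observed subset of topics.
    This is a sketch of the 'partially observed topics' idea, not the
    exact POT sampler described in the paper."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))   # document-topic counts
    nkw = np.zeros((n_topics, n_vocab))     # topic-word counts
    nk = np.zeros(n_topics)                 # per-topic totals
    z = []
    # Initialize every token's topic uniformly within its document's allowed set.
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.choice(allowed[d])
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
            zd.append(k)
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            ks = np.array(allowed[d])
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # Conditional distribution evaluated only over allowed topics,
                # so disallowed topics have exactly zero probability.
                p = (ndk[d, ks] + alpha) * (nkw[ks, w] + beta) \
                    / (nk[ks] + n_vocab * beta)
                k = rng.choice(ks, p=p / p.sum())
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
                z[d][i] = k
    return ndk, nkw
```

Restricting the sampler this way is what makes the learned topics "pertinent and directional": a topic tied to a department in the organigram can only accumulate counts from that department's documents, so its word distribution stays interpretable for end-use applications.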