Topic models could substantially improve how users find and discover content in digital libraries and search interfaces, both by automatically learning and applying subject tags to every item in a collection and by dynamically creating virtual collections on the fly. However, much remains to be done to tap this potential and to empirically evaluate the true value of a given topic model to humans. In this work, we sketch out some sub-tasks that we suggest pave the way towards this goal, and present methods for assessing the coherence and interpretability of topics learned by topic models. Our large-scale user study includes over 70 human subjects evaluating and scoring almost 500 topics learned from collections from a wide range of genres and domains. We show that a scoring model based on the pointwise mutual information of word pairs, using Wikipedia, Google, and MEDLINE as external data sources, performs well at predicting human scores. This automated scoring of topics is an important first step to integrating topic modeling into digital libraries.
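
To make the idea of scoring a topic by the pointwise mutual information (PMI) of its word pairs concrete, the following is a minimal sketch under stated assumptions, not the scoring model evaluated in this work. It assumes that co-occurrence means "appears in the same reference document", that the topic score is the mean PMI over all pairs of a topic's top words, and that a small smoothing constant `eps` handles pairs that never co-occur. The function name `pmi_coherence` and the toy reference corpus are hypothetical; a real setup would estimate the probabilities from an external source such as Wikipedia.

```python
import math
from itertools import combinations


def pmi_coherence(topic_words, reference_docs, eps=1e-12):
    """Mean PMI over all pairs of topic words.

    Assumptions (illustrative only): probabilities are document
    frequencies in `reference_docs`, and `eps` smooths the joint
    probability so that pairs which never co-occur get a large
    negative score instead of log(0).
    """
    doc_sets = [set(doc) for doc in reference_docs]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of reference documents containing all given words.
        return sum(1 for d in doc_sets if all(w in d for w in words)) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2 = prob(w1), prob(w2)
        if p1 == 0 or p2 == 0:
            continue  # word unseen in the reference corpus: pair is skipped
        p12 = prob(w1, w2)
        scores.append(math.log((p12 + eps) / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy reference corpus (hypothetical), each document a list of tokens.
reference_docs = [
    ["space", "nasa", "launch", "orbit"],
    ["space", "orbit", "satellite"],
    ["nasa", "launch", "space"],
    ["bank", "loan", "interest"],
    ["bank", "interest", "rate"],
    ["loan", "rate", "bank"],
]

coherent_score = pmi_coherence(["space", "nasa", "launch"], reference_docs)
mixed_score = pmi_coherence(["space", "bank", "loan"], reference_docs)
print(coherent_score > mixed_score)
```

In this toy setting, a topic whose top words tend to appear together in the reference documents scores higher than a topic mixing unrelated words, which is the behavior one would want an automated score to share with human judgments of coherence.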