This article proposes a method for Pareto charting based on unsupervised, freestyle text such as customer complaint, rework, scrap, or maintenance event descriptions. The procedure is a slight extension of latent Dirichlet allocation, called multifield latent Dirichlet allocation. The extension consists of field-specific dictionaries for multifield databases and changes to the recommended default prior settings. A numerical study motivates the choice of prior settings. A real-world case study of user reviews of Toyota Camry vehicles illustrates the practical value of the proposed methods. The results indicate that only 4% of the words written by Consumer Reports reviewers over the last 10 years relate to the widely publicized unintended acceleration issue. Copyright © 2012 John Wiley & Sons, Ltd.