Attention, intentions, and the structure of discourse
Computational Linguistics
On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
Topic segmentation with an aspect hidden Markov model
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A critique and improvement of an evaluation metric for text segmentation
Computational Linguistics
The Journal of Machine Learning Research
Optimal multi-paragraph text segmentation by dynamic programming
ACL '98 Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 2
Multi-paragraph segmentation of expository text
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
A statistical model for domain-independent text segmentation
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Modeling word burstiness using the Dirichlet distribution
ICML '05 Proceedings of the 22nd international conference on Machine learning
Unsupervised topic modelling for multi-party spoken discourse
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Minimum cut model for spoken lecture segmentation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Latent-Space Variational Bayes
IEEE Transactions on Pattern Analysis and Machine Intelligence
Bayesian unsupervised topic segmentation
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Evaluating hierarchical discourse segmentation
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
An iterative approach to text segmentation
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
How text segmentation algorithms gain from topic models
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Sweeping through the topic space: bad luck? Roll again!
ROBUS-UNSUP '12 Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
TopicTiling: a text segmentation algorithm based on LDA
ACL '12 Proceedings of ACL 2012 Student Research Workshop
On handling textual errors in latent document modeling
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Topic segmentation and labeling in asynchronous conversations
Journal of Artificial Intelligence Research
Hi-index | 0.00 |
This paper presents a novel unsupervised method for hierarchical topic segmentation. Lexical cohesion -- the workhorse of unsupervised linear segmentation -- is treated as a multi-scale phenomenon, and formalized in a Bayesian setting. Each word token is modeled as a draw from a pyramid of latent topic models, where the structure of the pyramid is constrained to induce a hierarchical segmentation. Inference takes the form of a coordinate-ascent algorithm, iterating between two steps: a novel dynamic program for obtaining the globally-optimal hierarchical segmentation, and collapsed variational Bayesian inference over the hidden variables. The resulting system is fast and accurate, and compares well against heuristic alternatives.