Statistical Models for Text Segmentation
Machine Learning - Special issue on natural language learning
A critique and improvement of an evaluation metric for text segmentation
Computational Linguistics
The Journal of Machine Learning Research
Advances in domain independent linear text segmentation
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Multi-paragraph segmentation of expository text
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
A Dynamic Programming Algorithm for Linear Text Segmentation
Journal of Intelligent Information Systems
A statistical model for domain-independent text segmentation
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Discourse segmentation of multi-party conversation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
The Penn Treebank: annotating predicate argument structure
HLT '94 Proceedings of the workshop on Human Language Technology
Text segmentation with LDA-based Fisher kernel
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Hierarchical text segmentation from multi-scale lexical cohesion
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Text segmentation via topic modeling: an analytical study
Proceedings of the 18th ACM conference on Information and knowledge management
Revising the wordnet domains hierarchy: semantics, coverage and balancing
MLR '04 Proceedings of the Workshop on Multilingual Linguistic Ressources
How text segmentation algorithms gain from topic models
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Sweeping through the topic space: bad luck? Roll again!
ROBUS-UNSUP '12 Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP
Exploiting hybrid contexts for Tweet segmentation
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
An unsupervised topic segmentation model incorporating word order
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Automatically generated spam detection based on sentence-level topic information
Proceedings of the 22nd international conference on World Wide Web companion
Unsupervised text segmentation using LDA and MCMC
AusDM '12 Proceedings of the Tenth Australasian Data Mining Conference - Volume 134
Hi-index | 0.00 |
This work presents a Text Segmentation algorithm called TopicTiling. This algorithm is based on the well-known TextTiling algorithm, and segments documents using the Latent Dirichlet Allocation (LDA) topic model. We show that using the mode topic ID assigned during the inference method of LDA, used to annotate unseen documents, improves performance by stabilizing the obtained topics. We show significant improvements over state of the art segmentation algorithms on two standard datasets. As an additional benefit, TopicTiling performs the segmentation in linear time and thus is computationally less expensive than other LDA-based segmentation methods.