Text segmentation is a fundamental problem in natural language processing, with applications in information retrieval, question answering, and text summarization. Almost all previous work on unsupervised text segmentation rests on the assumption of lexical cohesion, which is indicated by relations between the words in two units of text. However, that work takes into account only reiteration, the category of lexical cohesion that covers word repetition, synonyms, and superordinates. In this research, we investigate the non-systematic semantic relation, which is classified as collocation in lexical cohesion. This relation holds between two words or phrases in a discourse when they pertain to a particular theme or topic. We recognize this relation via a topic model, which is in turn learned from a large collection of texts. Experimental results on a public dataset show the advantages of our approach over existing unsupervised approaches.
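To make the contrast between reiteration and topic-based cohesion concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes a hypothetical toy topic model (`TOPIC_WORD`, with made-up probabilities standing in for one learned from a large corpus), a uniform prior over topics, and cosine similarity as the cohesion score between adjacent text units; all names are illustrative.

```python
import math
from collections import Counter

# Hypothetical toy topic model: P(word | topic). The values are invented for
# illustration; in practice they would be estimated from a large collection.
TOPIC_WORD = {
    "finance": {"loan": 0.4, "interest": 0.3, "credit": 0.3},
    "nature": {"river": 0.4, "water": 0.3, "fish": 0.3},
}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def word_vector(tokens):
    """Bag-of-words counts: captures only word repetition."""
    return Counter(tokens)


def topic_vector(tokens):
    """Sum of P(topic | word) over tokens, assuming a uniform topic prior."""
    vec = {topic: 0.0 for topic in TOPIC_WORD}
    for w in tokens:
        total = sum(TOPIC_WORD[t].get(w, 0.0) for t in TOPIC_WORD)
        if total == 0.0:
            continue  # word unknown to the toy model
        for t in TOPIC_WORD:
            vec[t] += TOPIC_WORD[t].get(w, 0.0) / total
    return vec


# Three adjacent text units; the first two share a topic but no words.
units = [["loan", "interest"], ["credit"], ["river", "fish"]]

pairs = list(zip(units, units[1:]))
repetition = [cosine(word_vector(a), word_vector(b)) for a, b in pairs]
topical = [cosine(topic_vector(a), topic_vector(b)) for a, b in pairs]

print("repetition cohesion:", repetition)
print("topical cohesion:   ", topical)
```

In this toy setting, word repetition sees no cohesion between any adjacent units, while the topic-based score links the first two units and drops between the second and third, which is where a segmenter relying on such a score would tend to place a boundary.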